Add support for fast-math optimizations #531
This is pretty much ready to go. The one remaining question is what default to set for (particularly GPU) fast-math flags.

Originally, Terra used NVVM for CUDA code generation. (We still support this path with LLVM 3.8.) It turns out that NVVM performs the equivalent of LLVM's

With the migration to NVPTX, this goes away. Fast-math flags are encoded in each floating-point LLVM instruction, and NVPTX actually respects these (as opposed to NVVM which applies

Since it's easy to turn these flags on now, I think it's reasonable to leave them off by default, but again this does result in a performance regression for anyone who has been sticking it out with LLVM 3.8 and doesn't pay attention to the new arguments to

Edit: to be clear, the option that has been implemented in this PR is to keep fast-math flags disabled by default. If anyone has opinions on this, let me know.
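For anyone weighing in on the default: fast-math is not semantics-preserving. A rough illustration in plain Python (not Terra, and not tied to any API in this PR) of how reassociation, one of the rewrites LLVM's fast-math flags permit, changes a floating-point result. The values are chosen only for demonstration.

```python
# Illustration only: IEEE-754 double addition is not associative, so a
# compiler that is allowed to reassociate (as fast-math permits) can
# legally produce either of these results for the same source expression.
a, b, c = 1e16, -1e16, 1.0

# Evaluation order as written in the source: (a + b) + c
left_to_right = (a + b) + c      # (0.0) + 1.0 == 1.0

# A reassociated order an optimizer might choose: a + (b + c)
# b + c rounds back to -1e16 because 1.0 is below the spacing of
# doubles at that magnitude, so the 1.0 is lost entirely.
reassociated = a + (b + c)       # 1e16 + (-1e16) == 0.0

print(left_to_right, reassociated)
```

With the flags off, the generated code has to keep the order as written, which is the safer default and also the slower one.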
CI failure looks like a Nix issue, and my last (nearly identical) commit passed, so I'm going to go ahead and merge this.
Adds support for fast-math optimizations in terralib.saveobj.
Fixes #530
TODO: