Support fast math flags #530

Closed
Tracked by #471
elliottslaughter opened this issue Apr 5, 2022 · 1 comment · Fixed by #531

Comments

@elliottslaughter
Member

elliottslaughter commented Apr 5, 2022

See #529 (comment) for why this matters. This is the last major blocker (as far as I can tell) on #471.

LLVM fast math flags are documented here. The main question is how to expose them.

One option that comes to mind is adding a function-level attribute exposed by a setter, e.g.:

my_func:setfastmath(true) -- set all flags
my_func:setfastmath("contract") -- set one flag
my_func:setfastmath({"contract", "nnan"}) -- set two flags

This is potentially obnoxious because you need to remember to set it on any functions you care about.

A more whole-program approach would be to set this through terralib.saveobj, probably hijacking the optimize parameter:

terralib.saveobj(filename, filetype, functable, arguments, target, {fastmath = true})
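As a rough sketch of what a call in this form might look like in practice: the `dot` function and the `dot.o` output name below are made up for illustration, and passing `nil` for `arguments` and `target` is an assumption that those parameters fall back to their defaults. Only the `{fastmath = true}` table in the final position comes from the proposal above.

```lua
-- Hypothetical example function: a dot product is a typical case where
-- fast math helps, since flags such as "contract" let LLVM fuse the
-- multiply and add into a single FMA instruction.
terra dot(a : &double, b : &double, n : int) : double
  var acc = 0.0
  for i = 0, n do  -- Terra for loops exclude the upper bound
    acc = acc + a[i] * b[i]
  end
  return acc
end

-- Argument order follows the signature quoted above. The nils for
-- arguments and target are an assumption (defaults); the last table
-- takes the place of the optimize parameter and enables all flags.
terralib.saveobj("dot.o", "object", { dot = dot }, nil, nil, { fastmath = true })
```

Because the setting applies to everything in the function table, there is no per-function opt-in to forget, which is the main appeal over the setter form.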

A final option would be to add a command-line flag like:

--fast-math <options>

This would then be truly whole-program.

@elliottslaughter
Member Author

#531 has landed, which adds the saveobj form of this option. I think that's the best approach, but we could add the others in the future as well, since they don't need to be mutually exclusive. I've confirmed we get the expected performance from the new version in the apps I've tested.
