
Fast math discussion needs more nuance #275

Open
HadrienG2 opened this issue Mar 23, 2024 · 0 comments
HadrienG2 commented Mar 23, 2024

First of all, congratulations on writing the most accurate micro-optimization textbook I have read so far.

There are, however, still a couple of important shortcomings that prevent me from recommending it to my computational science community right now. One of them is your discussion of fast-math, which presents it as a unilateral win that costs just a few bits of precision. This description lacks the level of nuance that the topic deserves, as explained in https://simonbyrne.github.io/notes/fastmath/ .

A good discussion of fast-math should cover the following points:

  • fast-math turns many important parts of the IEEE-754 specification into undefined behavior: essentially every special value (-0, NaN, +/-inf...) is assumed never to be received as input or emitted as output at run time. As a result, it becomes dangerously easy to write programs that invoke undefined behavior while apparently doing nothing more than basic FP arithmetic. This can, in turn, result in arbitrary badness, as all UB does.
  • fast-math makes the floating-point output of a program depend on the hardware, the compiler, and even the compiler version. While exact reproducibility is a very costly property to enforce on modern hardware, it also makes programs a lot easier to test. A program that uses fast-math in production should have tests that assert the correctness of results using more mathematically general, and thus more complex, properties.
  • While fast-math only costs a few bits of precision on small amounts of simple arithmetic, it can and will break fancier numerical algorithms, such as transcendental functions and statistics packages, not just in your code but also in the third-party libraries that you use. This matters because many manipulations that are harmless over the reals (e.g. polynomial evaluations, ratios of small numbers...) are numerically unstable over floating-point numbers, and getting them to produce even one bit of correctness there requires special precautions. These precautions are exactly the kind of seemingly unnecessary, complex code that fast-math optimizes out.

Because of these points, I find it safer to use fast-math only as a guide for manual optimization during development, rather than as a feature of production binaries, and I would encourage you to advise the same in your book:

  • Periodically turn on fast-math.
  • Locate any resulting program speedup using a profiler.
  • Study the assembler before and after to see what fast-math did.
  • Make sure this transformation is valid for your algorithm (ideally, you would have tests for that).
  • If so, apply the corresponding transformation to the C code so that you get the same benefit without fast-math.
  • Turn off fast-math once it does not make any meaningful difference anymore.