unit conversion emits a surprisingly high amount of instructions #323

Open
chiphogg opened this issue Nov 5, 2023 · 0 comments

Consider this round-trip unit conversion, using both the nholthaus library and Au.

double nholthaus_round_trip(double x) {
    return ::units::angle::degree_t{::units::angle::radian_t{::units::angle::degree_t{x}}}.value();
}

double au_round_trip(double x) {
    return au::degrees(x).as(au::radians).in(au::degrees);
}

I expected these to emit basically equivalent code, but was surprised to see a huge difference in the number of instructions. This translates into an actual runtime performance penalty, which is likely avoidable. (That said, I highly doubt that unit conversions should ever occur in the "hot loop" of a well-designed program, so this is probably not a meaningful performance penalty.)

Here's a godbolt link using clang 16.0.0.

For nholthaus, we see two things. First, it multiplies and divides by pi and 180 as separate steps, instead of combining them into a single factor pi / 180 at compile time. Second, it emits a surprisingly large number of instructions that I can't explain (I'm not well versed in assembly):

[Screenshot: clang 16.0.0 assembly output for nholthaus_round_trip]

For Au, we can see that the factors are combined into one (we see ~57.3, and its inverse). And we emit only two instructions:

[Screenshot: clang 16.0.0 assembly output for au_round_trip]

Here's the godbolt link for gcc 13.2. I didn't lead with this one because the clang output actually includes comments showing which values are being used, which is nice.

Anyway, we can see the nholthaus code looks much more reasonable for gcc than for clang, although it still emits more instructions than Au. Here it is:

[Screenshot: gcc 13.2 assembly output for nholthaus_round_trip]

And here's what we get for Au (still just two instructions):

[Screenshot: gcc 13.2 assembly output for au_round_trip]


What's the upshot? I guess it would be nice to consider combining the conversion factors into a single value, computed at compile time. This doc on applying conversion factors may be useful reading here.

I'm also curious why clang emits so much more code than gcc does, but I assume if we switched to a single conversion factor then this would all go away and the point would be moot. (Although it'd be interesting if we found that it didn't!)


Please include the following information in your issue:

  1. Which version of units you are using

The current master.

  2. Which compiler exhibited the problem (including compiler version)

clang 16.0.0 and gcc 13.2
