Scientific floats #398

Merged
merged 20 commits into from
Feb 11, 2025


Pat-Lafon
Contributor

Fixes #388

I've bundled in two additional fixes so far. The Rust bril2json implementation treated from as a keyword and did not also accept it as an identifier. I have also moved the random_walk benchmark to mixed so that brilift can ignore it, since it uses both the mem and char extensions.

My attempt to address #388 has all floating-point numbers with abs(exponent) >= 10 use exponential notation. This is the lowest threshold I would choose, because it avoids language differences over whether to require two digits after the e, as C's printf does (like 1e09).

Since Rust was the main outlier in not extending positive exponents with a +, it seems reasonable to also expect that for numbers printed as an exponential (like 1.1e+11).

Given the variation in how significant digits with an exponent are treated (and some odd floats when the number's exponent is smaller than the number of significant digits), it seems more uniform not to require a specific number of significant digits when using e (which ends up being equivalent to %g in C's printf).
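To make the threshold argument concrete, here is a small Python sketch. Python is used only because its `%` formatting follows C's printf conventions; the helper name `decimal_exponent` is hypothetical and not part of this PR. It shows that printf-style output signs the exponent and pads it to two digits, so once abs(exponent) >= 10 the padding question no longer arises.

```python
def decimal_exponent(x: float) -> int:
    """Base-10 exponent of x as it appears in printf-style scientific notation."""
    # Read the exponent from the formatted text instead of using log10, so that
    # rounding near powers of ten matches what is actually printed.
    return int(("%e" % x).split("e")[1])


# Small exponents are padded to two digits and always carry a sign.
print("%e" % 1e9)      # 1.000000e+09
# Exponents of 10 or more already have two digits, so no padding is involved.
print("%e" % 1.1e11)   # 1.100000e+11
print(decimal_exponent(1e9), decimal_exponent(1.1e11), decimal_exponent(2.5e-12))
```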

I also slightly touched up brili's handling of -0.0; of the languages I tried, JavaScript is the only one that wanted to print positive zero.
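One way to keep the sign of negative zero is to format the magnitude and restore the sign from the sign bit. The Python sketch below only illustrates that idea; `format_fixed` is a hypothetical helper, it is not the change made to brili, and it does not handle NaN or infinities.

```python
import math


def format_fixed(x: float, digits: int = 17) -> str:
    """Fixed-point rendering that keeps the sign of negative zero."""
    # Format the magnitude only, mimicking a formatter that drops the sign of -0.0.
    body = "%.*f" % (digits, abs(x))
    # copysign reads the sign bit, so it tells -0.0 apart from 0.0
    # even though the two compare equal.
    negative = math.copysign(1.0, x) < 0
    return "-" + body if negative else body


print(format_fixed(-0.0, 2))  # -0.00
print(format_fixed(0.0, 2))   # 0.00
```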

I would like to put up this PR for discussion, since there are different choices that can be made here. For example, maybe printing floats should just be %g-compatible and require that implementations pull in a C-printf-compatible formatter. The languages I've looked at are JavaScript, Rust, C, and OCaml.

I have not updated the documentation yet.

@Pat-Lafon
Contributor Author

So correction, seems like %g will sometimes cap the number of significant digits at 6... so the number of significant digits in the exponent case should probably be fixed instead of allowed to be variable.
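The six-digit cap is easy to see with printf-style formatting. This Python sketch relies on Python's `%g` following the C convention of a default precision of 6 significant digits; the sample value is arbitrary.

```python
x = 123456789.0

# Default %g keeps only 6 significant digits and switches to exponent form.
print("%g" % x)       # 1.23457e+08
# An explicit precision lifts the cap.
print("%.17g" % x)    # 123456789
# Trailing zeros are dropped, so short values look unaffected.
print("%g" % 1.1e11)  # 1.1e+11

# The capped output no longer parses back to the original value.
print(float("%g" % x) == x)  # False
```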

Owner

@sampsyo sampsyo left a comment

Awesome; thank you for getting started. Overall, I like your idea here: namely, pick a value threshold and switch between scientific notation and, uh, "non-scientific" notation. But in either case, keep the number of decimal digits the same.

You seem to be right that C's %g specifier is similar in spirit (try to switch between %f-like and %e-like formatting) but not exactly the same. So instead of using %g, maybe we should just use %e directly for numbers that exceed the threshold?
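A minimal sketch of that suggestion, in Python because its `%e` and `%f` follow C's printf: `format_float`, its default of 17 digits, and the treatment of NaN and infinities are all assumptions for illustration, and the merged implementations may differ in detail.

```python
import math


def format_float(x: float, digits: int = 17, threshold: int = 10) -> str:
    """Use %e when abs(decimal exponent) >= threshold, otherwise %f."""
    if not math.isfinite(x):
        # Assumption: non-finite values are passed through as printf prints them.
        return "%f" % x
    sci = "%.*e" % (digits, x)
    # Take the exponent from the rounded text so the two branches cannot
    # disagree near a power of ten.
    exponent = int(sci.split("e")[1])
    if abs(exponent) >= threshold:
        return sci
    return "%.*f" % (digits, x)


# Both branches print the same number of digits after the decimal point.
print(format_float(1.5, digits=3))          # 1.500
print(format_float(123456789.0, digits=3))  # 123456789.000
print(format_float(1.1e11, digits=3))       # 1.100e+11
print(format_float(2.5e-12, digits=3))      # 2.500e-12
```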

Owner

@sampsyo sampsyo left a comment

Awesome; thanks for adding to the docs and stuff. Maybe this is a silly question and I should already know the answer, but is there a straightforward reason to switch to 6 digits of precision instead of 17 in the "scientific notation range"? (Recall that 17 was chosen because that's enough significant digits to get an exact round-trip, and I think that's independent of the exponent.)
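The round-trip property can be checked directly. The Python sketch below uses `%.17g` (17 significant digits) on an arbitrary value scaled across many exponents; it is a spot check, not a proof.

```python
x = 0.1 + 0.2

short = "%.15g" % x  # 0.3
full = "%.17g" % x   # 0.30000000000000004

print(float(short) == x)  # False: 15 digits lose the last bit
print(float(full) == x)   # True: 17 digits recover the exact double

# The property does not depend on the exponent.
for e in range(-300, 300, 7):
    v = x * 10.0 ** e
    assert float("%.17g" % v) == v
```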

@Pat-Lafon
Contributor Author

> Awesome; thanks for adding to the docs and stuff. Maybe this is a silly question and I should already know the answer, but is there a straightforward reason to switch to 6 digits of precision instead of 17 in the "scientific notation range"? (Recall that 17 was chosen because that's enough significant digits to get an exact round-trip, and I think that's independent of the exponent.)

Ah, I did not remember that was the reason for choosing 17 significant digits. Given that, I was trying to avoid floating-point weirdness at larger digit counts, such as 1E-21 being printed as .99999999999999908e-22 instead of 1.000000e-21. I was also taking into account that %g seems to use up to 6 significant digits in some of these cases.

I've switched this back so there is a consistent printing of floats at 17 significant digits.
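The trade-off can be reproduced with a short Python sketch. It assumes 17 digits after the decimal point (`%.17e`) as the wide format; the wide output is printed rather than quoted here because its exact digits depend on the double nearest to 1e-21.

```python
x = 1e-21

six = "%.6e" % x    # 1.000000e-21
wide = "%.17e" % x  # exposes the digits of the nearest double

print(six)
print(wide)

# The wide form may not look like the literal that was typed,
# but it always parses back to exactly the same double.
print(float(wide) == x)  # True
```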

Owner

@sampsyo sampsyo left a comment

OK, perfect. And good point about round-tripping the decimal value 1e-12; that is annoying, but it seems like it may be worth compromising on to ensure that the underlying double values are unambiguously printed.

This all looks great!!! Thanks again! I made the docs a little bit simpler by combining the C printf commentary.

@sampsyo sampsyo merged commit d41519b into sampsyo:main Feb 11, 2025
19 checks passed
@Pat-Lafon Pat-Lafon deleted the scientific_floats branch February 11, 2025 22:53
Successfully merging this pull request may close these issues.

[brili/brilirs] Clarifying printing large floats
2 participants