Numbers #14
Comments
My one bikeshed hill I'll always die on is that comparisons with fixed epsilons really need to be done carefully and with a /lot/ more thought than I think we want to give it. We need to consider the number's size, so that we can understand the precision in the representation -- and even then, if the number was multiplied and divided by large numbers it may still have a loss of precision just due to that round-trip, even if it's in some fixed range. E.g., the following numbers have the same bitwise representation (sign, mantissa and exponent)
(or in Rust) And even when you need to turn it back into an
I'd much prefer (personally, due to scar tissue and heartache) to rely on some sort of floating point equality function that accepts an epsilon that the user has to provide with knowledge of the number's history/possible values. For the example above the correct eps is |
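To make the size-dependence concrete, here is a minimal Rust sketch (not the example elided from the comment above). The helper names `approx_eq`, `next_above`, and `ulp` are hypothetical, and the bit trick assumes a finite, positive input. It shows that the gap between adjacent f64 values grows with magnitude, so one fixed epsilon cannot suit every range.

```rust
/// Hypothetical helper: absolute-tolerance equality where the caller supplies `eps`.
fn approx_eq(a: f64, b: f64, eps: f64) -> bool {
    (a - b).abs() <= eps
}

/// The f64 immediately above `x`. Assumes `x` is finite and positive.
fn next_above(x: f64) -> f64 {
    f64::from_bits(x.to_bits() + 1)
}

/// Gap between `x` and the next representable f64 above it.
fn ulp(x: f64) -> f64 {
    next_above(x) - x
}
```

With these definitions, `ulp(1.0)` is `f64::EPSILON` while `ulp(1e16)` is `2.0`. A tolerance of `1e-9` accepts `0.1 + 0.2` as equal to `0.3`, but rejects `1e16` against its immediate neighbour, even though no f64 lies between them.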
I'm generally with you, @nrc 👍 I agree that for KCL's primary uses, decimals are the right choice. Obviously, there will always be exceptions. I think it would be okay if that were more verbose since it's a lot less common.
Integer Coercion
If I don't bring it up, someone will: Would we want to do automatic coercion to int for things that require it, like array indexing? A rough edge now is that it's unclear to users when exactly they need to use
Epsilons
The problem with epsilons is that even as a user who's aware of them, it's not always clear what a good epsilon value should be. If I'm writing a library that accepts input, a good epsilon might need to be computed from the input or even a parameter. To do it right, you can't just have a constant or several constants. Ideally, the tool/language would guide people to do the right thing. Could
But it seems unsatisfactory because it's error-prone and makes everything more verbose and annoying. I'm not sure what's better though. I imagine that the "right" way is similar to tracking significant figures or propagation of uncertainty. I'm not aware of a programming language that does this, but it sounds like yet another difficult problem that affects everything that we can't tackle in the short-term.
Long Term
Precision
You might want to check out @lf94's trick question in Slack:
I'm going to spoil this for you. The highest numeric you can represent and assign to a variable in KCL is the highest finite Rust @lf94's point was that KCL doesn't exist in a vacuum. If the engine has limits, maybe those limits should intentionally be part of KCL. Then again, it might be nice if you could do perfect arbitrary precision math all in KCL, and it only loses precision when you cross the boundary to the engine and get its output. But this difference will inevitably be exposed to users, which seems like a rough edge that will be a continual cause of questions. In the extreme, I could imagine someone wanting to re-implement complex math in KCL so that it doesn't lose precision. |
Lol, that I kept using delta instead of epsilon :-) Anyway, one thing I forgot to mention is that there is an absolute ton of prior art here. It's an area I'm not super familiar with though. I think that there are probably some good solutions we can pick, hopefully without much modification, rather than working things out from scratch.
@jon: re integer coercion: yep, this is a big component of my first point. I want
re epsilons: (and @paultag) I'd love to find out more about how this is handled in other systems and PLs. The only options I'm aware of are using fixed SF (sub-optimal), user-specified tolerance (far from ideal in terms of UX, but maybe more acceptable in a CAD lang than a general purpose PL), and using rational arithmetic so tolerances aren't needed (imperfect and hard to implement well). Possibly some combination will work or there are probably more sophisticated solutions out there.
re precision: I think this is not all long-term and we should actually think about the interaction with the engine up front (even if we don't implement anything, I think we must plan this a bit). My thoughts are that there is arithmetic which is purely in KCL and we need good answers there independent of the engine limitations, but we should make sure that that does align with what happens in the engine. I'm still getting my head around exactly what we should be doing locally vs in the engine... |
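As a rough illustration of the rational-arithmetic option, the sketch below uses a hypothetical `Ratio` type (not a KCL or library API) that keeps values as reduced integer fractions, so equality is exact and needs no tolerance. It assumes small inputs and ignores overflow. It also cannot represent values such as pi or the results of sin and cos, which is part of why the option is imperfect.

```rust
use std::ops::Add;

/// Hypothetical exact fraction, kept in lowest terms with a positive denominator.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Ratio {
    num: i64,
    den: i64,
}

/// Greatest common divisor by Euclid's algorithm (result is non-negative).
fn gcd(a: i64, b: i64) -> i64 {
    if b == 0 { a.abs() } else { gcd(b, a % b) }
}

impl Ratio {
    fn new(num: i64, den: i64) -> Ratio {
        assert!(den != 0, "denominator must be non-zero");
        let g = gcd(num, den);
        // Normalise the sign so that equal values have identical fields.
        let sign = if den < 0 { -1 } else { 1 };
        Ratio { num: sign * num / g, den: sign * den / g }
    }
}

impl Add for Ratio {
    type Output = Ratio;
    // Overflow is not handled in this sketch.
    fn add(self, rhs: Ratio) -> Ratio {
        Ratio::new(self.num * rhs.den + rhs.num * self.den, self.den * rhs.den)
    }
}
```

Here `Ratio::new(1, 10) + Ratio::new(2, 10)` equals `Ratio::new(3, 10)` exactly, whereas the f64 expression `0.1 + 0.2 == 0.3` is false.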
Reading the above comments, it's not clear to me when you are referring to "decimal" vs. "float". Just in case those are referring to the same thing, I would like to point out that some languages support "decimal float", which is represented by a base-10 mantissa instead of base-2. GNU C, for example, has some support for these types: _Decimal32, _Decimal64, and _Decimal128. These decimal types seem well suited for CAD use, but I don't know if Rust supports them. Some CPUs and GPUs do have support for these types. The downside is that ISO/IEC decimal types seem to have been in the draft stage for at least 10 years. For reference: https://gcc.gnu.org/onlinedocs/gcc/Decimal-Float.html |
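To illustrate what a base-10 mantissa buys, here is a toy Rust sketch. The `Dec` type is hypothetical and far simpler than the `_Decimal64` family: it has no rounding, no special values, and no overflow handling. It stores a value as `coeff * 10^exp`, so decimal literals such as 0.1 are held exactly.

```rust
/// Hypothetical toy decimal float: the value is `coeff * 10^exp`.
#[derive(Debug, Clone, Copy)]
struct Dec {
    coeff: i64,
    exp: i32,
}

impl Dec {
    /// Coefficient rewritten for a smaller-or-equal exponent `exp`.
    /// Overflow is not handled in this sketch.
    fn coeff_at(self, exp: i32) -> i64 {
        self.coeff * 10i64.pow((self.exp - exp) as u32)
    }

    /// Exact addition: align both operands to the smaller exponent first.
    fn add(self, other: Dec) -> Dec {
        let exp = self.exp.min(other.exp);
        Dec { coeff: self.coeff_at(exp) + other.coeff_at(exp), exp }
    }

    /// Value equality, so 0.30 and 0.3 compare equal.
    fn same_value(self, other: Dec) -> bool {
        let exp = self.exp.min(other.exp);
        self.coeff_at(exp) == other.coeff_at(exp)
    }
}
```

With this representation, 0.1 is `Dec { coeff: 1, exp: -1 }`, and adding 0.2 gives a value that compares equal to 0.3, which is not the case for the binary f64 sum `0.1 + 0.2`.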
I tried to be precise about using decimal to mean a number including a part which is smaller than one (in contrast to integer, but also in contrast to a rational where the 'smaller than one part' is implicit in the fraction representation) and float to mean the implementation mechanism of floating point numbers. I don't think it's necessary for Rust to support any approach we choose, we can always implement it in user space (though if we're using WASM, that limits what we can do efficiently). I believe that some 'big decimal'/arbitrary-precision arithmetic libs use decimal floating point and it's definitely an approach we should consider. |
I was assuming there would be performance issues if you rely on a userspace library for the arbitrary precision arithmetic, which might not be usable in a GPU, but I don't know much about the processing needed in the engine. Thanks for considering my comment. |
It's complicated because of the server/client split and because the KCL interpreter is in WASM, but in general for our work I think a userspace library should be fine; they can be well-optimised and it's unlikely to be the performance bottleneck. The hard bit is doing this in KCL and the engine and making sure we get the same behaviour |
TL;DR: keep f64 for numbers (no internal or user-facing int type) for now, implicit rounding to ints as required with 0.1 epsilon (not for comparisons), remove
I don't have an opinion about the long term yet, I see a few reasonable options for our number representation: floats, fixed point decimals, or arbitrary precision floating or fixed point numbers (possibly even rationals, though handling pi, sin, cos, etc. in that case doesn't work). I don't know about comparisons. We should look at what other CAD systems and other PLs do. I'm unsure about a user-facing
Proposal for launch
For now, my proposal is that:
Rationale
My guiding principle is that users should not have to worry about the implementation of numbers. The current situation which distinguishes between
Epsilon size for rounding
I want to make it as easy as possible to use numbers, including the output of calculations for array access, etc. However, I think it is worth requiring the user to be explicit about rounding where rounding is necessary. It seems to me that how numbers are rounded is an important consideration when doing CAD and that often if the maths does not result in a round number, that might indicate a bug in the program (e.g., if the user is computing how many screw holes to fit in a piece, they might want to always round up (for safety) or always round down (for aesthetics) or might expect the maths to always result in a round number). I think that choosing a fixed epsilon for rounding is much easier than picking one for comparisons (and there is no need for these epsilon values to be related in any way). I think 0.1 is a good compromise - it is large enough that it will address nearly any floating point errors (at least for reasonable-sized numbers likely to be used in KCL) and small enough that it will catch many bugs both in literals (why I chose < 0.1 rather than <= 0.1) and computation results.
Optimisation of integer literals
Where we write something like
Future compat
I'm pretty sure we don't want to force users to explicitly convert from numbers to ints. The possible future compat issues I can think of are:
I think that 1, 2, 5, and 6 will technically be breaking changes but they'll be small, technical ones which will have very little impact (unless we pick a very bad new value for epsilon). I think 3 will not have much effect if we allow implicit conversions from number to int and int to number (which I think we should, with some dynamic checks). I don't think 4 will cause any issues. I think if we support 7, then it will be in addition to implicit rounding with a fixed epsilon (which might be a different epsilon, but as for 1, I don't think it will cause real issues). 8 could cause back-compat issues, but it should only make runtime bugs into checked errors, which I think is good. |
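The implicit rounding rule proposed above could look roughly like this in Rust. It is only a sketch: the function name `number_to_int`, the error type, and the magnitude guard are assumptions, and only the strict `< 0.1` test comes from the proposal.

```rust
/// Rounding epsilon from the proposal; the comparison is strict (`<`, not `<=`).
const ROUNDING_EPSILON: f64 = 0.1;

/// Assumed guard (not from the proposal): reject magnitudes where f64
/// can no longer represent every whole number.
const MAX_WHOLE: f64 = 9.0e15;

/// Hypothetical conversion used where an integer is required, e.g. array indexing.
fn number_to_int(x: f64) -> Result<i64, String> {
    let nearest = x.round();
    // NaN and infinities fail the first comparison, so they are rejected too.
    if (x - nearest).abs() < ROUNDING_EPSILON && nearest.abs() < MAX_WHOLE {
        Ok(nearest as i64)
    } else {
        Err(format!("{x} is not close enough to a whole number"))
    }
}
```

Under this rule a computed value such as `2.0000000001` is accepted as `2`, while `0.3333333` and the literal `0.1` are rejected, which is the kind of bug the strict comparison is meant to catch.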
Seems good to me! I think implicit conversions make sense for KCL given that it's almost always dealing with fractional numbers, and that whole numbers are really just used as counts for functions (e.g. replicate this 4 times) or indexes into arrays. |
It sounds good to me overall. I had a few thoughts, but I'm willing to follow your lead, as I don't have any concrete suggestions.
Questions
|
Yeah, I think there is definitely a trade-off and there is a downside, but I think it is worth it. I'd be very open to using a different epsilon. I just pulled 0.1 out of thin air. We should do a little experiment to see whether 0.01 or 0.001 (or other numbers) would work and give better results.
Agree, I'm not worried about the current proposal - a few fp and int ops and an easily predictable branch should have very little impact on performance. I also hope that anywhere we do a lot of array accesses or similar, we can easily hoist the conversion outside of the loop in the Rust code.
We'd want to use 0.99 rather than 1 or something (lol, even the epsilon needs an epsilon). I don't think we need anything fancier than that. But generally the whole point of this is to shield users from the horrors of floating point.
I think we try to error out as early as possible, I'd really like to avoid surfacing this to the user if we can avoid it (including |
Numbers are currently just numbers as far as the user can tell (sort of, see below), which is nice. Internally, they are either integers or floats, and this leaks out a little bit. I think shielding users from this distinction is the right thing to do, and for most uses (i.e., in geometry) decimals are the right thing to have. There are some rough edges or things we should properly define though.
Here are questions I think we need to solve sooner rather than later (before 1.0 IMO):
int (but only sometimes) is bad. If we allow arbitrary conversion though, how do we handle rounding? If the number being rounded is 0.3333333, it is likely the user didn't intend for this number to be used as an integer, so maximum flexibility is perhaps not the best solution. Do we apply a delta as for comparison? Are there any times the user might want to specifically do integer arithmetic? How do we support that?
0.00000001 == 0 true
0.00000001 is that printed like that? As 0 or as 0.00 (or some other number of significant figures or decimals)? Whilst this is mostly a UI issue, we should ensure that how numbers are displayed corresponds with their semantics (e.g., if 0.00000001 is printed as 0, then 0.00000001 == 0 should be true).
And more long-term questions:
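One way to read the display/equality consistency requirement above is that two numbers compare equal exactly when they print the same. The Rust sketch below assumes a fixed number of displayed decimals. The constant `DISPLAY_DECIMALS` and both function names are hypothetical, and this is only one possible design, not a decision from the thread.

```rust
/// Assumed number of decimals shown to the user (hypothetical choice).
const DISPLAY_DECIMALS: usize = 6;

/// Format a number the way this sketch assumes the UI would print it.
fn display(x: f64) -> String {
    format!("{:.*}", DISPLAY_DECIMALS, x)
}

/// Equality defined by the printed form, so display and comparison cannot disagree.
fn display_eq(a: f64, b: f64) -> bool {
    display(a) == display(b)
}
```

With six decimals, `0.00000001` prints the same as `0` and compares equal to it, while `0.00001` does not. Sign handling for tiny negative values and values that straddle a rounding boundary would still need a decision.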