Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit float issue #64

Open
hax opened this issue Mar 19, 2023 · 61 comments
Open

Revisit float issue #64

hax opened this issue Mar 19, 2023 · 61 comments

Comments

@hax
Copy link
Member

hax commented Mar 19, 2023

Currently, the range proposal supports all floating-point numbers. I briefly mentioned the precision issue of floating-point numbers in #57:

it's rare and always have precision-related issue to iterating floats,
so many programming languages do not support iterating floats in the core lib

We discussed this issue again at last week's JSCIG meeting, and @Jack-Works suggested me to open a separate issue since he thinks it's worth further consideration.

In short, there is always a precision issue with floating-point numbers.

For example, range(0, 10, 3) produces 0, 3, 6, 9, while range(0, 1, 0.3) produces 0, 0.3, 0.6, 0.8999999999999999.

There are several problems:

First, precision loss can occur with both addition and multiplication algorithm (current draft use multiplication).

Second, due to precision loss, the last value may be slightly less than end and included in the result. For example, range(0, 0.9, 0.3) produces four numbers.

Third, precision issues are sporadic, meaning that developers may write a "workable" program but easily encounter unexpected results when the parameters are slightly modified. For example, range(100, x, 0.3) works fine for any integer x below 166, until 166, where it becomes problematic.

Note that I think we shouldn't simply regard the precision issue of floating-point numbers as a developer education issue. Instead, we should consider under what circumstances users need to use floating-point numbers. If there are no reasonable use cases for floating-point numbers in the range, it means that using floating-point numbers almost always produces unexpected results in practice. Ultimately, MDN have to educate developers not to use floating-point numbers in the range. On the other hand, current toolchains (like TS or ESLint) don't have sufficient capabilities to warn developers against improper use of floating-point numbers during development, meaning we cannot provide protection in engineering. Therefore, it's better not to support floating-point numbers from the beginning.

So do we have enough solid use cases for the range of floating-point numbers? Based on my personal experience, it seems dubious.

For example, in the example of range(0, 0.9, 0.3), I feel that most developers would expect to get the result 0, 0.3, 0.6, meaning that what developers actually want is not floating-point numbers, but fixed-point numbers or decimals. Before decimal is introduced into JS, the most reasonable way to handle such use cases is to write range(0, 9, 3).map(x => x / 10).

There may indeed be some cases where floating-point numbers are used and precision is not cared for. For example, when drawing charts, we sample several points in a certain interval on the x-axis, calculate the corresponding y values for these points, and draw them. However, in such cases, a more straightforward way is to specify the number of samples in the interval, rather than the step. Therefore, an API similar to the linspace function provided by NumPy would be more suitable. Of course, we can also achieve the goal through range(0, sampleCount).map(i => start + (end - start) * i / sampleCount), and in this way, range also does not need to support floating-point numbers. (see a much real chart example: https://jsfiddle.net/f5xsopmn/ )

And, if we do not support floating-point numbers, we can also avoid the special treatment in the current proposal for special value like infinity, as well as avoid peculiar behaviors outside the safeint range (such as iteration may producing repeated numbers). Note previously #7 discussed such behavior and one suggested solution is separate float support to range(start, end, {useFloat: true}).

Finally, assume we only support integers in range API (throw TypeError if param is not safeint or bigint), theoretically we could upgrade it to support all float in the future, if we really want.

--

Prior arts:

Python range() only support integer, there are some discussions in stackoverflow, see:

@Andrew-Cottrell
Copy link

Andrew-Cottrell commented Mar 19, 2023

Related: #58 (comment), which is also about limiting support to integers rather than all JavaScript number values.

Apparent 'precision issues' are due to the understandable but incorrect intuition that floating-point numbers behave like fixed-point decimal numbers. But this 'precision issue' has potential to occur in any JavaScript API that permits use of unrestricted number values, and also in many other languages that have float or double types. I don't think it's worthwhile to single out one JavaScript API (such as range) by limiting its utility for the purpose of 'protecting' users. Anyone using non-integer number values is going to potentially run across situations that require them to be aware that floating-point is different to fixed-point.

0.3 + 0.3 + 0.3 === 0.9; // false (tested in Microsoft Edge Version 106)

I think we could simply regard the precision issue of floating-point numbers as a developer education issue because floating-point is widely used in many programming languages, spreadsheets, data exchange formats, and data transfer protocols. At the very least, I don't think it is worthwhile for the range proposal to change the status quo in this regard.


Aside: Other approaches might help with the 'precision issues' more generally across all JavaScript operators and APIs.

For example, I use a roundToDecimal function that helps in some cases where fixed-point decimal numbers are more appropriate

roundToDecimal(0.3 + 0.3 + 0.3, -1) === 0.9; // true

Array.from(range(0, 1, 0.3)).map(n => roundToDecimal(n, -1)); // [0, 0.3, 0.6, 0.9]

@hax
Copy link
Member Author

hax commented Mar 20, 2023

@Andrew-Cottrell I think I already explained why we'd better not treat it as simple education issue.

But this 'precision issue' has potential to occur in any JavaScript API that permits use of unrestricted number values

What we are discussing is just whether making range permit unrestricted number values is a good idea. Note, some JS APIs already only allow integers. For example, new Array(len) will throw if len is non-integer number.

and also in many other languages that have float or double types.

Many other languages only support integer in their range or similar api even they have float type. See https://github.com/tc39/proposal-Number.range/blob/main/compare.md#support-decimal-steps-01-02-03-

Array.from(range(0, 1, 0.3)).map(n => roundToDecimal(n, -1)); // [0, 0.3, 0.6, 0.9]

roundToDecimal can't solve this issue:

Second, due to precision loss, the last value may be slightly less than end and included in the result. For example, range(0, 0.9, 0.3) produces four numbers.

@ljharb
Copy link
Member

ljharb commented Mar 20, 2023

However, newer APIs won't throw: new Float32Array(1.2), Array.from({ length: 1.2 }).length, etc.

@hax
Copy link
Member Author

hax commented Mar 20, 2023

off-topic: I'm curious why introduce inconsistency between new Array(len) and new TypedArray(len), is there any historical discussion I could read?

@ljharb
Copy link
Member

ljharb commented Mar 20, 2023

No idea - Typed Arrays weren't created by TC39, more absorbed. Array.from, however, was.

@bakkot
Copy link
Contributor

bakkot commented Mar 20, 2023

Hm, I'm torn on this.

On the one hand, floating point numbers make perfect sense here - it's not like new Array where a non-integer is just nonsense. (I think we should have been, and should be, much more aggressive about rejecting non-integer arguments to APIs which expect integers - but here a non-integer value works fine.)

On the other hand, Iterator.range(0, 0.9, 0.3).toArray() giving four values (0, 0.3, 0.6, 0.8999999999999999) is pretty bad.

@Andrew-Cottrell
Copy link

Andrew-Cottrell commented Mar 20, 2023

roundToDecimal can't solve this issue:

Agreed; it's helpful in only some cases.

On the other hand, Iterator.range(0, 0.9, 0.3).toArray() giving four values (0, 0.3, 0.6, 0.8999999999999999) is pretty bad.

In this particular case, the problem occurs when the (end - start) value is exactly or very close to a multiple of the step value. The problem can be avoided by either ensuring the end value is suitably distant from the critical value (e.g., subtract some appropriate epsilon) or by filtering the desired values (e.g., exclude the value that is too close to the end value). To me, it seems like the same sort of caution that is sometimes required when using floating-point algorithms that may accumulate error.

I think most (maybe 95-99%) use cases for Iterator.range would stick with integers, so perhaps most consumers would not encounter the accuracy problems. My thinking is that there are potential bad results with other operators and APIs when using floating-point without taking appropriate care, such as the Math.hypot implementations do. It seems, to me, that the choice is to either disallow the floating-point use cases or to allow them and leave people with those use cases to cope with the accuracy problems. But people already have to cope (one way or another) with 0.1 + 0.2 !== 0.3.

@Jack-Works
Copy link
Member

I'll bring this topic to the committee and see how others think of

@erights
Copy link

erights commented Mar 21, 2023

Allow only safe integers or bigints. With safe integers, one way to express what we assume the example above is trying to express is

Iterator.range(0, 9, 3).map(x => x/10).toArray()

This has none of the hazards, and obviously does not.

@tabatkins
Copy link

This is not a footgun we need or even want to protect people from. Ranges that include floating-point numbers are perfectly fine and reasonable, and people will want to do them - if we arbitrarily restrict it from this API people will just write their own and be justifiably angry at us for screwing up the API and forcing them to reinvent it themselves. (And they'll probably make the common mistakes we've avoided, like using addition rather than multiplication so they accumulate error.)

Yes, floating-point comparisons are, pardon my french, fucked. But (a) that's true regardless of what we do or don't do, (b) the pain of making the mistake is very limited (just one extra item getting produced), and (c) it's trivial to fix (slightly reduce the end point, by some arbitrary amount smaller than the step).

Sometimes users would be able to rewrite their range into an integer-using version, followed by a map that produces the actual numbers they want. Sometimes they won't be able to - if you want multiples of pi below 20, for example, you'd be SOL.

Forcing people to write their own function* goodRange(...) {...} that suffers from the exact same "problem" that our built-in would doesn't help anyone. They'd just hit the exact same error, and fix it the exact same way, except now they're mad.

@erights
Copy link

erights commented Mar 22, 2023

Me:

Allow only safe integers or bigints.

@tabatkins :

This is not a footgun we need or even want to protect people from.

I do not disagree. If everyone would rather not impose the integer only constraint, then I'm fine just doing the simple thing, without trying to otherwise kludge-fix the "anomaly". But in explanatory material, it would be good to point out the hazard, and to show how the integer-plus-map pattern, when applicable, fixes it.

@hax
Copy link
Member Author

hax commented Mar 22, 2023

Ranges that include floating-point numbers are perfectly fine and reasonable, and people will want to do them - if we arbitrarily restrict it from this API people will just write their own

And they will find why it's a footgun.

and be justifiably angry at us for screwing up the API and forcing them to reinvent it themselves.

I don't see "justifiably angry" in python community. On the contrary, the stackflow links i posted shows they see it as correct design to avoid footgun.

(And they'll probably make the common mistakes we've avoided, like using addition rather than multiplication so they accumulate error.)

As I understand, the reason current draft use multiplication is not to avoid accumulate error, it's for avoid dead loop.

@hax
Copy link
Member Author

hax commented Mar 22, 2023

Ranges that include floating-point numbers are perfectly fine and reasonable

I always think "Range" is a misleading word.

Range of float is perfectly fine, but iterable range of float is not "perfectly" fine. Note, Swift allow 0.0..9.0 to create float range but only allow iterate 0..9.

@Jack-Works
Copy link
Member

As I understand, the reason current draft use multiplication is not to avoid accumulate error, it's for avoid dead loop.

It's for accumulate errors. Dead loop is also comes from the accumulate error (MAX_SAFE_INT + 1)

@Andrew-Cottrell
Copy link

Andrew-Cottrell commented Mar 23, 2023

Regarding Iterator.range(0, 0.9, 0.3).toArray() giving four values (0, 0.3, 0.6, 0.8999999999999999)

This isn't an argument against or in favor of the float issue, but I thought it might be be worth noting here.

In testing my existing implementation of a range function, I found that I could not reproduce the problem above. The reason is that the implementation is a little different to that corresponding to this proposal's specification.

/**
 * Yields numbers in the half-open interval from <var>start</var> (inclusive)
 * to <var>end</var> (exclusive) using a <var>step</var> size.
 * @param {number} start - The first number to yield.
 * @param {number} end - The number that should not be hit or passed.
 * @param {number=} step - The step size, defaults to <code>Math.sign(end - start)</code>.
 * @return {!IteratorIterable<number>} An iterator iterable that yields the numbers.
 * @throws {!RangeError} When the parameters do not define a valid range.
 */
function* range(start, end, step) {
    const rangeSign = Math.sign(end - start);
    const computedStepSize = step || rangeSign;
    const totalStepCount = Math.ceil((end - start) / computedStepSize);
    let remainingStepCount = totalStepCount;
    if (
        !Number.isFinite(computedStepSize) ||
        (rangeSign && rangeSign !== Math.sign(computedStepSize))
    ) {
        throw new RangeError("Invalid arguments (" + start + ", " + end + ", " + step + ")");
    }
    while (remainingStepCount) {
        let value = start + computedStepSize * (totalStepCount - remainingStepCount);
        remainingStepCount -= 1;
        yield value;
    }
};

Array.from(range(0, 0.9, 0.3)); // [0, 0.3, 0.6]
Array.from(range(0, 0.000009, 0.000003)); // [0, 0.000003, 0.000006]

I expect there remain bad results for some inputs, but using and pre-computing totalStepCount to be the ceiling of the division appears to make the implementation a little more robust and predictable.

I'm not suggesting anything to be changed in the present proposal, but I thought someone might find this approach interesting.


Edit: 2023-04-22

A potentially more robust implementation of the above computes

const totalStepCount = computeTotalStepCount(start, end, computedStepSize);

where computeTotalStepCount is defined as

/**
 * Computes the total number of steps required to go from start to end in increments of step.
 * @param {number} start
 * @param {number} end
 * @param {number} computedStepSize
 * @returns {number} The total step count.
 */
function computeTotalStepCount(start, end, computedStepSize) {
    let count = (end - start) / computedStepSize;
    if ((count - Math.floor(count)) < (Math.abs(computedStepSize) / Math.pow(2, 16))) {
        return Math.floor(count);
    }
    return Math.ceil(count);
}

The choice of Math.pow(2, 16) is arbitrary but seems to work well in limited testing. Someone with good knowledge of floating-point may know or find a much better JavaScript implementation of computeTotalStepCount.

The implementation above fails the following test case at computeTotalStepCount(0, 9999999999999.9, 0.3), which returns 33333333333334 rather than 33333333333333.

function testNines() {
    for (let i = 0; i < 16; i += 1) {
        let end = Number("9".repeat(i) + ".9");
        let count = computeTotalStepCount(0, end, 0.3);
        if (!String(count).endsWith("3")) {
            console.log("computeTotalStepCount(0, " + end + ", 0.3) === " + count);
            return;
        }
    }
    console.log("all tests passed");
}

A BigDecimal, or similar, implementation of computeTotalStepCount might be able to do much better and compute the correct value in all cases. Based on this example of PI computation in JavaScript, I guess some people are clever enough to find a solution.


Edit: 2023-09-07

This comment was previously collapsed (using a details element) as I thought it might be off-topic, but the discussion currently involves some of the ideas mentioned above, so I have expanded this comment.

@tabatkins
Copy link

And they will find why it's a footgun.

Discovering that floats are problematic to work with does not negate the need to work with floats. This isn't a case where someone can make an unthinking mistake using the built-in, but they'd realize it and avoid the mistake writing it themselves. If you write your own range function and use a non-integer step, or even just do a for-loop and accumulate a list manually, you'll make the exact same mistake for the exact same reason, and you'll fix it in the exact same way.

This issue doesn't affect anyone using integers, or even using dyadic rationals (fractions with powers of 2 as their denominator), so banning this doesn't make the feature better for the remaining use-cases. It solely affects non-dyadic floats, which are problematic no matter how you deal with them. Again, we're simply not helping authors by preventing them from doing this.

I expect there remain bad results for some inputs, but using and pre-computing totalStepCount to be the ceiling of the division appears to make the implementation a little more robust and predictable.

There might be some FP-error-related reason why this is slightly more accurate, yielding the "expected" integer value some times rather than a slightly-off-from-integer number, but it's definitely not a reliable difference. (0, .9, .3) yields 3 iterations, but (0, 99.9, .3) yields 334, still suffering from the issue.

@Andrew-Cottrell
Copy link

Andrew-Cottrell commented Mar 25, 2023

Maybe there is a way — perhaps available only to native implementations and probably not using double-precision floating-point calculations — to pre-compute totalStepCount (or something similar) such that the one-extra-item issue is avoided in all cases. However, even if that is possible, I am not suggesting this particular approach or arguing that any attempt to mitigate the one-extra-item issue is desirable or worthwhile. I mention the idea only for consideration. I understand that, while this approach might make things more predictable for some who are less familiar with floating-point, it may make things less predictable for those with more floating-point experience; and there could be specification, implementation, or performance reasons why this approach is impractical, even if it were possible and might be desirable for some people. In addition, this approach may be undesirable if it is impractical to write polyfill implementations, or if the native implementation were too different to existing JavaScript library implementations and the native & library implementations of other programming languages.

@jridgewell
Copy link
Member

Would the count option help here? That would allow you to range(0, 99.9, { step: .3, count: Math.floor(99.9 / .3) }) and iterate only 333 items.

@hax
Copy link
Member Author

hax commented Apr 4, 2023

@jridgewell I think it help to solve the float issue, but to be honest, I feel in general floating use cases are different, a separate API like linspace seems much clear.

(Note my initial idea of count option is just to explore the possible options to make sure it's good to make the third param support option bag.)

@papb
Copy link

papb commented Jun 25, 2023

Regarding the idea of throwing an error and then adding support later, isn't there an unfortunate risk of people writing code that relies on the fact that it throws on floats, and then if in the future support is added, we would essentially be breaking those people's codes...

@ljharb
Copy link
Member

ljharb commented Jun 25, 2023

That’s generally been a concern we ignore - an error can’t be relied on to stay an error.

@lino-levan
Copy link

Perhaps we should consider integration with the BigDecimal proposal (https://github.com/tc39/proposal-decimal)?

@ogbotemi-2000
Copy link

About time this issue ain't closed at all.
Not to point out the obvious but it is an Iteration man @tabatkins
Floats do not correctly increment for numbers that aren't exactly a power of 2 for obvious reasons.

One has to wonder what likely use they may find in actual iterations and not in the code of some developer looking for holes in ECMAScript implementations.

@ogbotemi-2000
Copy link

ogbotemi-2000 commented Sep 5, 2023

If the trouble of converting floats to ints is making some devs post preposterous arguments supporting the use of floats in Number.range, the suggestion below may help bring logic into such sentiment

(5.99999|0) === 5 //true

Floats can be easily handled thereof

@Jack-Works
Copy link
Member

I found @Andrew-Cottrell 's suggestion very interesting. Precompute steps by division, maybe a way to solve the problem of range from 0 to 0.9 with step 0.3. It maybe a good way to go, we just need to discover the edge case of this idea. (maybe also learn from other libraries/languages)

@tabatkins
Copy link

As I mentioned in my comment, there is no way to precompute via division that will avoid this error in all cases. Dividing .9 by .3 yields 3, but dividing 99.9 by .3 yields 333.00000000000006, which'll ceiling up to 334.

The "problem" here is intrinsic to the nature of floating point. It cannot be avoided, by authors or by UAs; whether we allow fractional steps or force authors to do their math in integers and then convert to the fractional step at the end, they'll still run into the exact same problem. Nobody can be saved from floating point precision issues.

@Jack-Works
Copy link
Member

Dividing .9 by .3 yields 3, but dividing 99.9 by .3 yields 333.00000000000006, which'll ceiling up to 334.

So why ceiling up? We can drop the decimal part and that will be the correct result

@lino-levan
Copy link

or force authors to do their math in integers and then convert to the fractional step at the end, they'll still run into the exact same problem.

I don't see how this could possible be true. Could you give an example of this?

I'm pretty strongly of the opinion that the initial proposal should not support floating point numbers. If developers prove it necessary it could always be added after the fact.

@tabatkins
Copy link

I don't see how this could possible be true. Could you give an example of this?

"Do it in integers then map it" means you have to precompute how many iterations you want. That operation suffers the exact same FP error as anything else; the exact operation you perform will determine exactly which cases have an off-by-one error, but it is guaranteed that some cases will. This has already been demonstrated a few times in this thread; dividing the width of the range by the step size will sometimes, due to FP shenanigans, slightly overshoot an integer when an integer was intended.

@tabatkins
Copy link

Just to clarify, I was indicating that BigDecimal or BigFloat could be used only to precompute the number of steps (to solve only the one-extra-item issue). All other calculations would continue to use double-precision floating-point, with the expected ambiguities and inaccuracies.

I'm not certain this would work, either. The engine can't know ahead of time that a given value is going to be used as a range endpoint/step, so it'll still store the value as a traditional Number internally. You can convert an FP number back to a decimal number (that's how we print them, after all), but it's not guaranteed to be the same decimal you originally wrote. (I will admit, tho, that this case is probably far below the threshold of caring about; the numbers you have to write to trigger this are pretty ridiculous.)

When JS does grow a BigDecimal type (https://github.com/tc39/proposal-decimal), then a range that takes decimals will indeed solve this problem "correctly". But I don't think it's reasonable, at the moment, to gate us on this. After all, even in integers a very similar problem occurs, due to loss of precision in large numbers; that doesn't mean we restrict ranges to only being usable with BigInts.

@hax
Copy link
Member Author

hax commented Sep 8, 2023

"Do it in integers then map it" means you have to precompute how many iterations you want. That operation suffers the exact same FP error as anything else;

@tabatkins I may misunderstand, but why "do it in ints then map it" require precompute how many iterations?

For example, instead of trouble range(0, 0.9, 0.3), range(0, 9, 3).map(x => x/10) just work as expected.

@tabatkins
Copy link

tabatkins commented Sep 8, 2023

And how did you convert .9 and .3 into integers? You knew ahead of time that they were both specified in tenths, and so could do a manual conversion. You cannot do that in general, when the start/end/step might be from variables. Even when the arguments are literals, you still can't reasonably do so if the numbers aren't short, easy decimal rationals; trying to get multiples of pi between 0 and 20, for instance.

(Even when they're short decimal literals, it quickly becomes fiddly enough to invite errors - range(2, 10, .125) has to become range(2000, 10000, 125).map(x=>x/1000); that's a lot of 0s to have to track and if you mess it up it might not be immediately obvious. And that's a lucky case - counting up by 1/8ths is a nice finite decimal. Counting up by, say, 1/6ths again puts us into the "basically infeasible" area, or requires even less obvious and more error-prone conversions, like range(12, 60, 1).map(x=>x/6) for the same range.)

If they are coming from variables, the only way to convert them into ints is to precompute the iterations with Math.ceil( (end - start) / step ), then do range(0, iterations).map(x=>x*step + start), and that the exact same FP precision issues as just allowing floats in the first place. (Also you need to handle possible sign issues, but that's just some fiddliness in the code; it doesn't contribute meaningfully to the issues here.)

@hax
Copy link
Member Author

hax commented Sep 12, 2023

And how did you convert .9 and .3 into integers?

The point is , in the common use cases, people never really want floats, but decimals (or rationals as your 1/6th example). So yes, just as you say, "they know ahead of time that they were both specified in tenths, and so could do a manual conversion".

You cannot do that in general, when the start/end/step might be from variables

If we choose to only support integers, people of coz can only provide integers or decimals/rationals which can be converted ahead of time.

Actually variables are more risk to trigger floating footguns --

Third, precision issues are sporadic, meaning that developers may write a "workable" program but easily encounter unexpected results when the parameters are slightly modified. For example, range(100, x, 0.3) works fine for any integer x below 166, until 166, where it becomes problematic.


trying to get multiples of pi between 0 and 20, for instance.

For Iterator.range(0, 20, Math.PI) case, if we only allow integers, yes, Iterator.range(0, Math.ceil(20 / Math.PI)).map(x => x * Math.PI) is bad. For such case, people could write

// assume we will have `takeWhile` helper in the follow-on proposals of iterator helpers
range(0, Infinity).map(x => x * Math.PI).takeWhile(x => x < 20)

// if only support integers, the method could even have defaults so just write
integers().map(x => x * Math.PI).takeWhile(x => x < 20)

Not very concise, but still one-line and easy to read and understand.

range(2, 10, .125) has to become range(2000, 10000, 125).map(x=>x/1000); that's a lot of 0s to have to track and if you mess it up it might not be immediately obvious.

I don't think such inconvenience could compare to real footguns we discussed. And It's easy to use underscore to mitigate the issue:

range(2_000, 10_000, 125).map(x => x / 1_000)

About 1/6ths case, u could just write

range(2*6, 10*6).map(x => x/6)

Generally speaking, for all step is rational , we could just multiple the denominator:

range(start*denominator, end*denominator).map(x => x/denominator)

If they are coming from variables, the only way to convert them into ints is to precompute

If coming from variables, there is no general way to "convert them into ints", because you don't know the denominator (or how many digits of decimals).

the only way to convert them into ints is to precompute the iterations with Math.ceil( (end - start) / step ), then do range(0, iterations).map(x=>x*step + start), and that the exact same FP precision issues as just allowing floats in the first place.

So I agree such precomputing doesn't make sense and solve nothing. The simple solution is just write iterator manually, or use integer-only range + iterator helpers:

integers().map(x => start + x*step) // if infinite sequence
integers().map(x => start + x*step).takeWhile(x => x < end)

or use some other lib (like linspace which might be the better abstraction in many floating cases).

Note, such solutions might still face floating issues, but we separate them with the most common integer cases. And one thing I expect is when people learn integer-only range, they could learn the reason why it's integer only and aware the floating issues and take care when they do them manually.

@tabatkins
Copy link

tabatkins commented Sep 14, 2023

integers().map(x => start + x*step).takeWhile(x => x < end)

This still suffers from the exact same problem that's been brought up this entire thread! start + x*step can be unexpectedly an epsilon above or below the end value!

Like I've said over and over again, we're not helping authors by disallowing this. There is nothing they can do, in the general (and reasonably common) "range set up with variables" case, to avoid the issue. You're just suggesting that, if authors want fp ranges, they should manually implement the exact same behavior that already exists for integer ranges. How is it helpful to tell authors "do exactly what the spec already does, but which we're disallowing from working"?

Just because you find the fp issues distasteful doesn't mean we should disallow them. It can be reasonable to to avoid baking something into the spec if there are a few reasonable ways that authors can "correctly" handle things and there's no way we can decide ahead of time which to use, but that's not the case here. There are no correct ways to handle it. Every single suggestion for how authors to handle it suffers from the exact same issues that the spec would face if it adopted this behavior natively.

I'm not sure how to say this differently, but people keep suggesting the exact same broken code for authors to use and claiming that this means we don't need it in the spec.

@hax
Copy link
Member Author

hax commented Sep 19, 2023

we're not helping authors by disallowing this.

We are definitely helping authors by disallowing float.

there's no way we can decide ahead of time which to use, but that's not the case here.

Actually, it is indeed the case here.


We compel authors to reconsider their true needs -- whether they require fixed point numbers, decimals, or rationals, and encourage them to utilize the strategies we've discussed to evade potential pitfalls. In rare instances, if floating point numbers are required -- it implies that authors acknowledge the potential precision inaccuracies and are willing to overlook possible edge cases. Given that such cases are scarce compared to the usage of integers, it is acceptable to manually implement the behavior. It's important to note that many built-in APIs and platform APIs have numerous such non-generic methods.

A counterexample to consider is the introduction of iterable strings, which has led to various footguns. The committee made a misstep by assuming it would be beneficial to apply similar iteration behaviors to all indexed collections, a decision that many people have since regretted.

Indeed, just like many issues, this is a contentious issue, but let's try to empathize with the pain that inexperienced programmers might experience when they inadvertently misuse APIs and fall into traps. Furthermore, if we really want to allow programmers to use floating point numbers in ranges, it could be through explicit parameters, such as range(start, end, {allowFloat:true}).

Thank you.

@ogbotemi-2000
Copy link

ogbotemi-2000 commented Apr 9, 2024

Hello hax, here is a better reason to disallow any support of floats in Iterator.range from your post:
#64 (comment)

The result of Array.from(Iterator.range(start, end))) can be further .maped by a function that divides each element of the array by some number to get a float for each thereof.
Basically, a programmer trying to use an array of floats will have non-generic needs as regards the choice of the divisor in the fraction that represents each float hence why not supporting floats internally is the proper approach for a generic method such as this.

@tabatkins
Copy link

This exact comment has been posted multiple times in this thread. See my previous comment for a response.

@ogbotemi-2000
Copy link

Very well then. The OP may then close this issue...

@Jack-Works
Copy link
Member

I revisited the meeting notes, @erights also concerned about floating point number. This is also a blocking concern of this proposal. I wonder if we can disallow floating point number first and preserve the possibility to add it back in the future, and if disallowing it harms the developer experience.

@tabatkins
Copy link

I'll have to review the notes, but was @erights's feedback materially different from what was already presented above, or was it a repeat of the "the range can be unexpectedly off-by-one" complaint?

As I've argued repeatedly in this thread, there is literally nothing an author can do to avoid the FP issues inherent in fractional ranges. It doesn't matter whether we allow them naturally, or we only allow integer ranges and force authors to convert their fractional ranges into integers and then map the iterator back into fractions; if they do the latter, and the fraction isn't a statically known rational number, they'll suffer the exact same FP precision issue and footguns that are being complained about. (That is, if the step is irrational, or just variable at runtime, they'll hit the exact same FP problems, and sometimes be off-by-one.)

The only way to avoid FP issues with fractional ranges is to just never use fractional ranges at all, for any purpose. And that's not a realistic prohibition.

@Rudxain
Copy link
Contributor

Rudxain commented Oct 28, 2024

IDK if it has been mentioned yet, but I've been doing some mental-math to predict when an "overflow repeat" happens (I haven't tested the polyfill, yet. Only my own implementation), and I came to this conclusion:

// https://es.discourse.group/t/error-assert/356
assert(Number.isFinite(step) && step > 0)
// `step` can be non-`Int`

// this example could work with many `Int` `start` values,
// but this also depends on the value of `step`,
// so I used `0` for simplicity
const g = range(0, Infinity, step)

// predicated `drop`
while (g.next().value != g.next().value);

assert(g.next().value == Math.trunc(Number.MAX_SAFE_INTEGER * step) + 1)

Those expressions might seem "simple", but it would be confusing (at first) to get different "overflow repeats" depending on which step has been chosen. If start was an arbitrary number, it would be much more confusing.

I don't like the fact that range could act like a repeat generator. An infinite stream of x should only be generated by a repeat-like generator

@Rudxain
Copy link
Contributor

Rudxain commented Oct 28, 2024

BTW:

// this should `throw`
Iterator.range(0, Infinity, {inclusive: true})
// this should **definitely** `throw`
Iterator.range(0n, Infinity, {inclusive: true})

@tabatkins
Copy link

I don't like the fact that range could act like a repeat generator. An infinite stream of x should only be generated by a repeat-like generator

I'm not entirely sure what your example is showing there. At some point any range using Number will get repeats, depending on the size of the step, yes. It'll never infinitely repeat; when it finally hits Infinity the range will stop.

// this should `throw`
Iterator.range(0, Infinity, {inclusive: true})
// this should **definitely** `throw`
Iterator.range(0n, Infinity, {inclusive: true})

Could you explain why the first should throw? It looks like it'll stop and include the end point, exactly as requested.

The second will already throw, I think, because the first argument is a BigInt and the second is a Number. They have to be consistent.

@Rudxain
Copy link
Contributor

Rudxain commented Oct 29, 2024

It'll never infinitely repeat; when it finally hits Infinity the range will stop.

If it's a multiplicative implementation then that's true. But my implementation was additive, which was prone to getting stuck.

I've just read the draft spec and noticed it's multiplicative! I think it's fine if the finitely-repeated values "jump" to the next representable Float64, instead of getting "infinitely stuck".

why the first should throw?

I thought the spec was additive, but now that I know it's multiplicative it makes more sense!

Mathematically speaking, it's wrong (but not impossible) to include ∞ as if it were a Real-Number (small tangent: No, it's not the cheap argument of "Infinity is a concept". All numbers are concepts, but ∞ represents every Complex-Number instead of a concrete value).

However, for IEEE-754 (any radix), it's possible to "jump to Infinity", and Infinity is a valid number that represents "a value whose order-of-magnitude is so high that it cannot be represented by a sequence of n-bits" (where n is the fixed-precision of exponent and mantissa concatenated).

The second will already throw

It doesn't in the playground. Is it outdated WRT the spec?

They have to be consistent

I agree, but BigInts are so much more appropriate for unbounded ranges, so it's fine to allow Infinity (although this could be changed to some other "special" value, such as null or undefined), as long as the range is non-inclusive (for the mathematical reasons I've mentioned)

@Jack-Works
Copy link
Member

I revisited everyone's point and made a summary here:

  1. There are 3 kinds of foot guns FP brings.
    1. Inprecise number: Iterator.range(0, 1, 0.3).toArray() yields 0.8999999999999999, not 0.9
    2. One more item than expected: Iterator.range(0, 0.9, 0.3).toArray() yields 4 items instead of 3.
    3. Repeated items: Iterator.range(9, 9+2e-15, 1e-16) yields [ 9, 9, 9, 9, 9, 9, 9, 9, 9 ] (it's finite!).
  2. @hax (and @lino-levan) think there are no real use cases for using FP in range. @tabatkins against this. I'll do a corpus research to see if there is one.
  3. @hax and @erights believe banning FP helps developers avoid errors, @tabatkins said developers will be angry about this because they have to implement range again and be hit by the same foot gun. I'm convinced by @tabatkins on this point.
  4. @Andrew-Cottrell mentioned a new algorithm that pre-calculates total steps and tries to prevent the one-more-item problem (1.2), I haven't experimented with this yet.
  5. @Andrew-Cottrell also mentioned Epsilon. This is also useful to prevent issue 1.2 in most use cases I can think of, but I haven't experimented with this yet.

I don't know if we should try non-trivial solutions to solve those problems, e.g. Epsilon or rounding behind the scenes, but if we can, here is what we can try:

  1. Inprecise number: Rounding numbers with a reasonable default (which means one new config!) so that Iterator.range(0, 1, 0.3).toArray() yields [0, 0.3, 0.6, 0.9]
  2. One more item than expected: Introduce Epsilon or the algorithm @Andrew-Cottrell mentioned.
  3. Repeated items: A built-in deduplicate? That may sound too smart.

@Jack-Works
Copy link
Member

The second will already throw, I think, because the first argument is a BigInt and the second is a Number. They have to be consistent.

(replies to @tabatkins)

range(0n, Infinity) is a valid special case in a bigint range, so an endless range iterator of bigint is possible and you don't have to write range(0n, 99999999999999999999999999999999999999999999999999999999n).

@shaedrich
Copy link

  1. Inprecise number: Rounding numbers with a reasonable default (which means one new config!) so that Iterator.range(0, 1, 0.3).toArray() yields [0, 0.3, 0.6, 0.9]

Treating floats as if they were integers even though they are not does not sound like a very consistent idea 🤔

@tabatkins
Copy link

range(0n, Infinity) is a valid special case in a bigint range, so an endless range iterator of bigint is possible and you don't have to write range(0n, 99999999999999999999999999999999999999999999999999999999n).

Ah, lol, that makes sense. Thanks for the reminder; I hadn't read the spec in detail recently so didn't realize there was a special case for that.

@tabatkins
Copy link

So, with that correction in mind, we can return to the question of "why should it [Iterator.range(0n, Infinity, {inclusive:true})] throw?"

Yes, it is the case that we know, for a fact, it can never include the endpoint, since it has to produce BigInts and Infinity is a Number, instead. And more directly, BigInt is arbitrary-size; it doesn't have the concept of infinity in its data type.

But also, any loop that is sufficiently large is, in practice, an infinite loop. Even tho Iterator.range(0, Infinity, {inclusive: true}) will theoretically eventually hit Infinity and return that as its final value, in practice that will take longer than the current age of the universe to occur. (Infinity, in IEEE-754 64-bit, is approximately 10^308. If you could process a billion entries from the iterator per second, that would still take 10^299 seconds; the universe has existed for less than 10^18 seconds.) Even the comparatively tiny Iterator.range(0, Number.MAX_SAFE_INTEGER, {inclusive: true}) won't complete in practice; under the same "billion operations per second" metric it would take weeks to finish.

So, detecting an "actual" infinite loop and throwing, while allowing all the "practical" infinite loops, doesn't seem to actually provide any value.

And more specifically, for this exact case, as we see the spec in fact special-cases it to work on purpose, so that you can indicate an infinite BigInt range without having to faff about with manually writing a sufficiently large number. It's a nice simple flag that clearly indicates the intent - this range will have values pulled off of it piece-by-piece, rather than being iterated to completion.

@Rudxain
Copy link
Contributor

Rudxain commented Dec 5, 2024

Infinity is a Number, instead. And more directly, BigInt is arbitrary-size; it doesn't have the concept of infinity in its data type.

Also:

Number.isInteger(Infinity) //false

If infinities aren't ints in the context of Float*/Number, then they certainly aren't valid BigInts.

any loop that is sufficiently large is, in practice, an infinite loop.

True!

So, detecting an "actual" infinite loop and throwing, while allowing all the "practical" infinite loops, doesn't seem to actually provide any value.

I agree. But my point isn't about loops, it's about correctness.

And more specifically, for this exact case

To be clear, I didn't meant to special-case range(0n, Infinity, {inclusive:true}) I meant range(n, inf, {inclusive:true}) where n is a BigInt and inf is any sign of Infinity.

so that you can indicate an infinite BigInt range without having to faff about with manually writing a sufficiently large number

But then what's the point? Why not just range(0n, Infinity)? It's simpler and more correct than range(0n, Infinity, {inclusive:true})

@ljharb
Copy link
Member

ljharb commented Dec 5, 2024

@Rudxain that's exactly what tab is suggesting - that you write range(0n, Infinity), and that it not throw.

@tabatkins
Copy link

Yeah, you don't need {inclusive:true}, of course. But throwing when it's included (because it won't include the ending value, by definition) doesn't seem to have a use-case.

@Rudxain
Copy link
Contributor

Rudxain commented Dec 6, 2024

I now understand why it feels like I'm proposing a special case for infinities, as the problem is way more general. Consider the following:

range(0, 3, {step: 2, inclusive: true})

Should it throw? If it doesn't then 3 won't be included, as the last value would be 4 (overshoot, not spec-compliant). So now the question isn't about "Should impossible infinite ranges be forbidden?" and is now "Should all impossible inclusive ranges be forbidden?" 🤔

@ljharb
Copy link
Member

ljharb commented Dec 6, 2024

no, of course it shouldn't. it's not a range that includes "3", it's a range that won't ever exceed 3.

@Rudxain
Copy link
Contributor

Rudxain commented Dec 6, 2024

I forgot the spec doesn't have overshooting behavior (AKA off-by-one) 😅

@Jack-Works
Copy link
Member

no, of course it shouldn't. it's not a range that includes "3", it's a range that won't ever exceed 3.

so do we need to change the name for this? inclusive might infer includes?

@ljharb
Copy link
Member

ljharb commented Dec 7, 2024

I don't think it's a likely source of confusion at all, but if someone has a good alternative option name, it's worth considering.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests