Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ispositive, etc. #53677

Open
wants to merge 45 commits into
base: master
Choose a base branch
from
Open

Add ispositive, etc. #53677

wants to merge 45 commits into from

Conversation

putianyi889
Copy link
Contributor

@putianyi889 putianyi889 commented Mar 9, 2024

It's faster for some complicated Real types.

julia> f(x) = x<=zero(typeof(x))
f (generic function with 1 method)

julia> g(x) = signbit(x)||iszero(x)
g (generic function with 1 method)

julia> @btime f(x) setup=(x=rand(BigFloat))
  99.055 ns (4 allocations: 144 bytes)
true

julia> @btime g(x) setup=(x=rand(BigFloat))
  10.700 ns (0 allocations: 0 bytes)
false

If you think this is a good idea then I'll add docstrings and tests.

Edit: FAQs according to discussions below

TODO:

@goerz
Copy link
Contributor

goerz commented Mar 9, 2024

For additional context: JuliaRegistries/General#102561

base/number.jl Outdated Show resolved Hide resolved
@mikmoore
Copy link
Contributor

mikmoore commented Mar 11, 2024

I'll just remark on ispositive, but the same goes for isnegative.

Needs NaN handling. For example,

julia> ispositive(NaN)  ispositive(-NaN)
true

A possible fix is in the flavor of ispositive(x) = !isnan(x) && !signbit(x) && !iszero(x). The default for non-float Number types is isnan(x) = false so this should be compiled away when irrelevant.

Also needs proper Missing behavior, ispositive(::Missing) = missing.

Now for your benchmarks (which appear to implement a "isnonpositive" function that does not appear in this PR, but I'll roll with it for now):

julia> using BenchmarkTools

julia> f(x) = x<=zero(typeof(x));

julia> g(x) = signbit(x)||iszero(x);

julia> g_fixed(x) = !isnan(x)&&(signbit(x)||iszero(x));

julia> h(x) = x <= false;

julia> h_int(x) = x <= 0;

julia> vi32 = rand(Int32,1000); vf64 = randn(Float64,1000); vbf = randn(BigFloat,1000);

julia> @btime count(f,$vi32); @btime count(g,$vi32); @btime count(g_fixed,$vi32); @btime count(h,$vi32); @btime count(h_int,$vi32)
  84.528 ns (0 allocations: 0 bytes)
  84.112 ns (0 allocations: 0 bytes)
  84.495 ns (0 allocations: 0 bytes)
  84.079 ns (0 allocations: 0 bytes)
  82.622 ns (0 allocations: 0 bytes)
517

julia> @btime count(f,$vf64); @btime count(g,$vf64); @btime count(g_fixed,$vf64); @btime count(h,$vf64); @btime count(h_int,$vf64)
  61.914 ns (0 allocations: 0 bytes)
  90.900 ns (0 allocations: 0 bytes)
  134.514 ns (0 allocations: 0 bytes)
  59.470 ns (0 allocations: 0 bytes)
  58.163 ns (0 allocations: 0 bytes)
489

julia> @btime count(f,$vbf);  @btime count(g,$vbf);  @btime count(g_fixed,$vbf);  @btime count(h,$vbf);  @btime count(h_int,$vbf)
  25.100 μs (2000 allocations: 101.56 KiB)
  12.900 μs (0 allocations: 0 bytes)
  14.000 μs (0 allocations: 0 bytes)
  54.900 μs (2000 allocations: 39.06 KiB)
  14.200 μs (0 allocations: 0 bytes)
498

Don't get hung up on differences of a few ns, which are probably benchmarking artifacts. But notice that this function is notably slower for native floats (and much slower when correctly handling NaN). And it doesn't out-do simple comparisons for BigFloat (that avoid creating a new BigFloat).

My preferred implementation would be h, but that implementation is so simple that I wouldn't introduce a function for it. But my benchmarks do show that we're missing good inequality handling for _comparison_(BigFloat, Bool), in that there's no reason h_int should be better than h in that case.

@giordano giordano added needs tests Unit tests are required for this change needs docs Documentation for this change is required labels Mar 11, 2024
@putianyi889
Copy link
Contributor Author

putianyi889 commented Mar 11, 2024

I'll just remark on ispositive, but the same goes for isnegative.

Thanks for the investigation! Looks like the problem is more complex than what I've imagined. Are you on the dev version of Julia? My benchmark results are a little different. count(h_int,$vbf) also has 2000 allocations. I'm on v1.10.2.

Anyway, the methods need different implementations for different types. I'm not sure which the fallback should be.

test/numbers.jl Outdated Show resolved Hide resolved
test/numbers.jl Show resolved Hide resolved
test/numbers.jl Outdated Show resolved Hide resolved
test/numbers.jl Outdated Show resolved Hide resolved
test/numbers.jl Outdated Show resolved Hide resolved
@simonbyrne
Copy link
Contributor

One thought: rather than defining these, would it be better to define a function which returns a Real or Number, say value?

@putianyi889
Copy link
Contributor Author

One thought: rather than defining these, would it be better to define a function which returns a Real or Number, say value?

How does that help?

@barucden
Copy link
Contributor

Don't forget to add a compatibility note for the added functions (they are likely to be a part of Julia 1.12.0).

test/numbers.jl Outdated Show resolved Hide resolved
@mikmoore
Copy link
Contributor

Would it make more sense for x > 0 (etc) to be the default implementation and the !isnan & !iszero & !signbit to be the specialization for BigFloat and maybe ::BigInt (basically, to swap the IEEEFloat specialization and the ::Any fallback?) As far as I can tell, BigFloat (and maybe ::BigInt) is the only Base type that will be handled better by this function.

I still think that it might be worth exploring faster inequality checks for BigFloat mixed with other types (at least Bool should be easy). It's a little more complicated to implement, but it could make these new functions unnecessary. I don't see these new functions being used especially often. They add a faster alternative to a specific BigFloat operation, but otherwise are equivalent to x -> x>0 etc. BigFloat is already very slow so I don't see this making a big performance difference in real code.

base/number.jl Outdated Show resolved Hide resolved
Co-authored-by: Mosè Giordano <[email protected]>
base/floatfuncs.jl Outdated Show resolved Hide resolved
@putianyi889
Copy link
Contributor Author

As far as I can tell, BigFloat (and maybe ::BigInt) is the only Base type that will be handled better by this function.

I can think of AbstractIrrational (although signbit has not yet been specialized) and Unsigned as well, so basically IEEEFloat is the only group where isnegative(x)=x<0 is strictly better (and SignedBitInteger is on par) and it's because of hardware support.

base/mpfr.jl Show resolved Hide resolved
Copy link
Contributor

@mikmoore mikmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on consolidating ispos/isneg from BigInt.
The docstrings needed an update.
Consider adding a notice to news.md.

base/number.jl Outdated Show resolved Hide resolved
base/number.jl Outdated Show resolved Hide resolved
test/numbers.jl Outdated Show resolved Hide resolved
base/int.jl Show resolved Hide resolved
base/bool.jl Outdated
@@ -154,6 +154,7 @@ abs(x::Bool) = x
abs2(x::Bool) = x
iszero(x::Bool) = !x
isone(x::Bool) = x
ispositive(x::Bool) = x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ispositive(x::Bool) = x
ispositive(x::Bool) = isone(x)

I can't imagine that this ever in the future of forever makes any difference. But I think this is arguably cleaner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's simpler and clear to see that it's the identity function for booleans, but I guess the definition of isone is right there. You can also probably just let the compiler optimize the fallback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also probably just let the compiler optimize the fallback.

The compiler doesn't do that at present.

julia> code_llvm(x->x>0, (Bool,))
;  @ REPL[6]:1 within `#5`
; Function Attrs: uwtable
define i8 @"julia_#5_5646"(i8 zeroext %0) #0 {
top:
; ┌ @ operators.jl:378 within `>`
; │┌ @ promotion.jl:462 within `<`
; ││┌ @ promotion.jl:393 within `promote`
; │││┌ @ promotion.jl:370 within `_promote`
; ││││┌ @ number.jl:7 within `convert`
; │││││┌ @ boot.jl:784 within `Int64`
; ││││││┌ @ boot.jl:711 within `toInt64`
         %1 = and i8 %0, 1
; ││└└└└└
; ││ @ promotion.jl:462 within `<` @ int.jl:83
    ret i8 %1
; └└
}

julia> code_llvm(x->x, (Bool,))
;  @ REPL[7]:1 within `#7`
; Function Attrs: uwtable
define i8 @"julia_#7_5660"(i8 zeroext %0) #0 {
top:
  ret i8 %0
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the compiler is being a bit silly here

julia> code_llvm(x -> x && true, (Bool,))
;  @ REPL[4]:1 within `#5`
define i8 @"julia_#5_249"(i8 zeroext %0) #0 {
top:
  %1 = and i8 %0, 1
  ret i8 %1
}

it doesn't realise could simply return the input argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's #21712

Copy link
Contributor

@mikmoore mikmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending the compat version changes I suggested, this looks fine to me. Could use a news.md announcement.

base/number.jl Outdated Show resolved Hide resolved
base/number.jl Outdated Show resolved Hide resolved
putianyi889 and others added 2 commits March 21, 2024 17:16
Co-authored-by: mikmoore <[email protected]>
Co-authored-by: mikmoore <[email protected]>
base/number.jl Outdated
Comment on lines 159 to 160
ispositive(x) = x > zero(x) # generic fallback
ispositive(x::Real) = x > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably missed it in the discussions above, but what would be the difference between x > 0 and x > zero(x)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not related to the discussions. I just realized

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the lowest real type on the promote totem pole is Bool, so I would suggest x > false as the generic case (although for most types, no promotion actually occurs). I agree that x > zero(x) doesn't add much and requires a type to explicitly define zero (which virtually all will, but it seems like an unnecessary dependency). For types requiring heap allocation (like BigInt, although it is covered by a separate specialization), zero will necessarily require an allocation whereas 0 or false might avoid it when mixed-type comparisons are defined.

In other words, BigInt is an example indicating that x > 0/x > false is likely a more performant fallback than x > zero(x).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more wondering why having the two different methods to start with, sounds like we should a single good definition, since Real isn't much more specific than Any.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as BigInt and BigFloat have specialized methods anyway, I guess using zero(x) universally is fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that having a ::Any and ::Real definition is overkill. I'd drop the ::Real version and keep the ::Any, because some types that can behave like reals are not actually <: Real. DualNumbers.Dual <: Number being such an example.

I still like comparison with 0 or false just because it avoids cases where zero requires allocation or has side effects. But I don't see this causing issues very often so any of these would be fine.

Copy link
Contributor

@nsajko nsajko Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some types that can behave like reals are not actually <: Real. DualNumbers.Dual <: Number being such an example.

I don't think the dual numbers are ordered though? So I guess we don't want ispositive or isnegative defined for dual numbers. EDIT: I was wrong, they seem to be ordered.

@nsajko
Copy link
Contributor

nsajko commented Oct 21, 2024

Tests are failing:

numbers                                          (16) |         failed at 2024-10-21T07:52:16.381
Test Failed at /cache/build/tester-demeter6-13/julialang/julia-master/julia-761ed61ee4/share/julia/test/numbers.jl:844
  Expression: ispositive(0) === 0 > 0
   Evaluated: false === 0 > 0
Test Failed at /cache/build/tester-demeter6-13/julialang/julia-master/julia-761ed61ee4/share/julia/test/numbers.jl:845
  Expression: ispositive(-0) === -0 > 0
   Evaluated: false === 0 > 0
[...]

base/number.jl Outdated
false
```
"""
ispositive(x) = x > zero(x) # generic fallback
Copy link
Contributor

@nsajko nsajko Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👎

Why is there an Any method? The definition mostly makes sense for Real.

EDIT: it's for stuff like dates and unitful

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure there should be a method for Number, let alone Any.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because of #35067

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case it should not belong to number.jl. If it has to be added, where should it go?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because of #35067

I don't see the connection. That issue doesn't seem to discuss that part of the design. It barely discusses design at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That issue discussed reasons of having the general interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @nsajko. It's always possible to add more generic functionality later, but it's impossible to go backwards. The most generic zero gets is ::Number. If we have a function in base, then non-numeric types (like Dates) can of course choose to extend it if appropriate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I get it, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs docs Documentation for this change is required needs tests Unit tests are required for this change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants