Inexplicable arithmetic overflow ?? #15341

jzakiya · 2025-01-13T19:06:22Z

I'm encountering a seemingly random arithmetic overflow error message in code that shouldn't overflow.

Below is the code.

def prime_pairs(n, flag = false)
  return puts "Input not even n > 2" unless n.even? && n > 2
  return (pp [n, 1]; (pp [n//2, n//2] if flag)) if n <= 6

  # generate the low-half-residues (lhr) r < n/2
  lhr = 3u64.step(to: n//2, by: 2).select { |r| r if r.gcd(n) == 1 }.to_a
  ndiv2, rhi = n//2, n-2           # set process constants
  lhr_mults = [] of typeof(n)      # for lhr values not part of a pcp

  # store all the powers of the lhr members < n-2
  lhr.each do |r|                  # step thru the lhr members
    r_pwr = r                      # set initial power of r
    break if r > rhi // r_pwr      # exit if r^2 > n-2, as all others are too
    while    r < rhi // r_pwr      # while r^e < n-2
      lhr_mults << (r_pwr *= r)    # store its current power of r
    end
  end

  # store all the cross-products of the lhr members < n-2
  lhr_dup = lhr.dup                # make copy of the lhr members list
  while (r = lhr_dup.shift) && !lhr_dup.empty? # do mults of 1st list r w/others
    ri_max = rhi // r              # ri can't multiply r with values > this
    break if lhr_dup[0] > ri_max   # exit if product of consecutive r’s > n-2
    lhr_dup.each do |ri|           # for each residue in reduced list
      puts "r = #{r}, ri = #{ri}, ri_max = #{ri_max}"
      break if ri > ri_max         # exit for r if cross-product with ri > n-2
      lhr_mults << r * ri          # store value if < n-2
    end                            # check cross-products of next lhr  member
  end

  # convert lhr_mults vals > n/2 to their lhr complements n-r,
  # store them, those < n/2, in lhr_del; it now holds non-pcp lhr vals
  lhr_del = lhr_mults.map { |r_del| r_del > ndiv2 ? n - r_del : r_del }

  pp [n, (lhr.size - lhr_del.uniq.size)]                      # show pcp count
  (print (lhr -= lhr_del).map { |p| [p, n-p] }; puts) if flag # show pcp values
end

def tm; t = Time.monotonic; yield; Time.monotonic - t end     # time code execution

def gen_pcp
  n = (ARGV[0].to_u64 underscore: true)
  flag = ARGV.size > 1 ? true : false
  puts tm { prime_pairs(n, flag)}
end

gen_pcp

I compile it like this.

$ crystal build --release --mcpu native prime_pairs.cr

Run it like this,

$  ./prime_pairs 123_456

The overflow message occurs in this part.

  # store all the cross-products of the lhr members < n-2
  lhr_dup = lhr.dup                # make copy of the lhr members list
  while (r = lhr_dup.shift) && !lhr_dup.empty? # do mults of 1st list r w/others
    ri_max = rhi // r              # ri can't multiply r with values > this
    break if lhr_dup[0] > ri_max   # exit if product of consecutive r’s > n-2
    lhr_dup.each do |ri|           # for each residue in reduced list
      puts "r = #{r}, ri = #{ri}, ri_max = #{ri_max}"
      break if ri > ri_max         # exit for r if cross-product with ri > n-2
      lhr_mults << r * ri          # store value if < n-2
    end                            # check cross-products of next lhr  member
  end

At no time does the math create values > n-2.

Here is truncated output with the print statement for n = 150_000_006 that doesn't overflow.

r = 12241, ri = 12247, ri_max = 12253
r = 12241, ri = 12251, ri_max = 12253
r = 12241, ri = 12253, ri_max = 12253
r = 12241, ri = 12257, ri_max = 12253
r = 12245, ri = 12247, ri_max = 12249
r = 12245, ri = 12251, ri_max = 12249
[150000006, 708844]
00:03:16.280705215

This executes correctly for n = 150_000_006, when ri > ri_max at the end.
You see, ri_max is continually getting smaller.

This overflows for a smaller value of n = 148_000_006, when ri > ri_max at the end.
All I can think of is > is somehow the culprit?

r = 12161, ri = 12167, ri_max = 12170
r = 12161, ri = 12169, ri_max = 12170
r = 12161, ri = 12171, ri_max = 12170
r = 12163, ri = 12165, ri_max = 12168
r = 12163, ri = 12167, ri_max = 12168
r = 12163, ri = 12169, ri_max = 12168
Unhandled exception: Arithmetic overflow (OverflowError)
  from ./prime_pairs1 in '??'
  from ./prime_pairs1 in '??'
  from ./prime_pairs1 in '??'
  from ./prime_pairs1 in '__crystal_main'
  from ./prime_pairs1 in '??'
  from ./prime_pairs1 in 'main'
  from /lib64/libc.so.6 in '??'
  from /lib64/libc.so.6 in '__libc_start_main'
  from ./prime_pairs1 in '_start'
  from ???
➜  ~

For some reason, it bombs for n= 148_000_006 when ri = 12169 > ri_max = 12168

Both these values are way, way less than 148_000_006, so how can there be an overflow?
And it occurs with apparently random values of different sizes (larger|smaller).

The problem occurs using Crystal versions 1.15.0, 1.14.0, 1.13.2 and 1.12.2.

The Ruby version, which is almost code identical, works for all values with no problems.

The text was updated successfully, but these errors were encountered:

straight-shoota · 2025-01-13T19:21:11Z

Thanks for this bug report. Unfortunately, it's not clear how to reproduce this issue.
Could you please provide exact instructions with code, build command and runtime arguments for how to reproduce the error.

Also it would help if you could try to reduce the code (i.e. throw away stuff that isn't necessary) and add some instrumentation in to figure out where exactly the exception is thrown. Enabling --debug information could also help for pinpointing the origin.

Blacksmoke16 · 2025-01-13T19:32:39Z

It's not super clear, but you can reproduce this by:

crystal build --release --mcpu native prime_pairs.cr
./prime_pairs 148_000_006

Which results in the error within a few minutes. I ran it with a locally built compiler, which resulted in this trace:

Unhandled exception: Arithmetic overflow (OverflowError)
  from /Users/george/crystal/src/hash.cr:833:10 in 'indices_malloc_size'
  from /Users/george/crystal/src/hash.cr:828:5 in 'malloc_indices'
  from /Users/george/crystal/src/hash.cr:252:9 in 'initialize:initial_capacity'
  from /Users/george/crystal/src/hash.cr:225:3 in 'new:initial_capacity'
  from /Users/george/crystal/src/set.cr:37:5 in 'initialize'
  from /Users/george/crystal/src/set.cr:36:3 in 'new'
  from /Users/george/crystal/src/set.cr:45:5 in 'new'
  from /Users/george/crystal/src/set.cr:509:5 in 'to_set'
  from /Users/george/crystal/src/array.cr:1908:5 in 'uniq'
  from prime_pairs.cr:35:22 in 'prime_pairs'
  from prime_pairs.cr:40:30 in 'gen_pcp'
  from prime_pairs.cr:49:1 in '__crystal_main'
  from /Users/george/crystal/src/crystal/main.cr:118:5 in 'main_user_code'
  from /Users/george/crystal/src/crystal/main.cr:104:7 in 'main'
  from /Users/george/crystal/src/crystal/system/unix/main.cr:9:3 in 'main'

So seems to actually be a limitation of Hash?

jzakiya · 2025-01-13T19:47:15Z

Here is a shortened version that runs just the problem section

def test_code(n)
  # generate the low-half-residues (lhr) r < n/2
  lhr = 3u64.step(to: n//2, by: 2).select { |r| r if r.gcd(n) == 1 }.to_a

  # store all the cross-products of the lhr members < n-2
  lhr_dup = lhr.dup                # make copy of the lhr members list
  while (r = lhr_dup.shift) && !lhr_dup.empty? # do mults of 1st list r w/others
    ri_max = (n-2) // r            # ri can't multiply r with values > this
    break if lhr_dup[0] > ri_max   # exit if product of consecutive r’s > n-2
    lhr_dup.each do |ri|           # for each residue in reduced list
      puts "r = #{r}, ri = #{ri}, ri_max = #{ri_max}"
      break if ri > ri_max         # exit for r if cross-product with ri > n-2
    end                            # check cross-products of next lhr  member
  end
end

def tm; t = Time.monotonic; yield; Time.monotonic - t end     # time code execution

def gen_test
  n = (ARGV[0].to_u64 underscore: true)
  puts tm { test_code(n) }
end

gen_test

Sample output.

➜  ~ ./test_code 100                                      
r = 3, ri = 7, ri_max = 32
r = 3, ri = 9, ri_max = 32
r = 3, ri = 11, ri_max = 32
r = 3, ri = 13, ri_max = 32
r = 3, ri = 17, ri_max = 32
r = 3, ri = 19, ri_max = 32
r = 3, ri = 21, ri_max = 32
r = 3, ri = 23, ri_max = 32
r = 3, ri = 27, ri_max = 32
r = 3, ri = 29, ri_max = 32
r = 3, ri = 31, ri_max = 32
r = 3, ri = 33, ri_max = 32
r = 7, ri = 9, ri_max = 14
r = 7, ri = 11, ri_max = 14
r = 7, ri = 13, ri_max = 14
r = 7, ri = 17, ri_max = 14
00:00:00.000062056

straight-shoota · 2025-01-13T20:01:24Z

So looks like lhr_del is so big that allocating a Hash to hold its items fails. I'm sure the error message should be improved to clearly point out the issue.

jzakiya · 2025-01-13T21:00:00Z

Yes, it wasn't an arithmetic overflow, which makes me feel sane again.

I ran test_code with n = 148_000_006 and it ran and completed (like it should have).

Here's truncated test data for it to show it completed without a problem.

r = 12161, ri = 12165, ri_max = 12170
r = 12161, ri = 12167, ri_max = 12170
r = 12161, ri = 12169, ri_max = 12170
r = 12161, ri = 12171, ri_max = 12170
r = 12163, ri = 12165, ri_max = 12168
r = 12163, ri = 12167, ri_max = 12168
r = 12163, ri = 12169, ri_max = 12168
00:07:27.839411344

jzakiya · 2025-01-14T00:08:17Z

FYI
My system is Lenovo Legion 5 Slim laptop, 16GB memory, running Linux.

I'm able to run the Ruby version for n = 148_000_006 for Ruby 3.4.1|3.3.6, JRuby 9.4.9.0 (Truffleruby 24.1.1 gives out of mem ).

Ruby 3.4.1 (Dec 25, 2025 release)

➜  ~ ruby prime_pairs.rb 148_000_006

[148000006, 436612]
42.623468019


Ruby 3.3.6

➜  ~ ruby prime_pairs.rb 148_000_006 

[148000006, 436612]
45.15575637

JRuby 9.4.9.0

➜  ~ ruby -J-Xmx14000M prime_pairs.rb 148_000_006 

[148000006, 436612]
108.151445

So 16GB was enough for them to run this in.

Here's the almost identical Ruby code.

def prime_pairs(n, flag = false)
  return puts "Input not even n > 2" unless n.even? && n > 2
  return (pp [n, 1]; (pp [n/2, n/2] if flag)) if n <= 6

  # generate the low-half-residues (lhr) r < n/2
  lhr = 3.step(n/2, 2).select { |r| r if r.gcd(n) == 1 }
  ndiv2, rhi = n/2, n-2            # set process constants
  lhr_mults = []                   # for lhr values not part of a pcp

  # store all the powers of the lhr members < n-2
  lhr.each do |r|                  # step thru the lhr members
    r_pwr = r                      # set to first power of r
    break if r > rhi / r_pwr       # exit if r^2 > n-2, as all others are too
    while    r < rhi / r_pwr       # while r^e < n-2
      lhr_mults << (r_pwr *= r)    # store its current power of r
    end
  end

  # store all the cross-products of the lhr members < n-2
  lhr_dup = lhr.dup                # make copy of the lhr members list
  while (r = lhr_dup.shift) && !lhr_dup.empty? # do mults of 1st list r w/others
    ri_max = rhi / r               # ri can't multiply r with values > this
    break if lhr_dup[0] > ri_max   # exit if product of consecutive r’s > n-2
    lhr_dup.each do |ri|           # for each residue in reduced list
      break if ri > ri_max         # exit for r if cross-product with ri > n-2
      lhr_mults << r * ri          # store value if < n-2
    end                            # check cross-products of next lhr member
  end

  # convert lhr_mults vals > n/2 to their lhr complements n-r,
  # store them, those < n/2, in lhr_del; it now holds non-pcp lhr vals
  lhr_del = lhr_mults.map { |r_del| r_del > ndiv2 ? n - r_del : r_del }

  pp [n, (lhr.size - lhr_del.uniq.size)]                      # show pcp count
  (print (lhr -= lhr_del).map { |p| [p, n-p] }; puts) if flag # show pcp valuess
end

def tm; t = Time.now; yield; Time.now - t end                 # time code execution

n = ARGV[0].to_i
flag = ARGV.size > 1 ? true : false
puts tm { prime_pairs(n, flag) }   # show execution runtime as last output

I would think if there's enough memory for Ruby to run and completer then so should Crystal.

straight-shoota · 2025-01-14T09:46:13Z

The exception raises on the multiplication 536870912 * 4, which would be 2147483648. But that's Int32::MAX + 1, hence it doesn't fit into the default number type.
So the underlying issue is that collection types in Crystal are limited to the size of Int32 (cf. #4011).

This works fine in Ruby because Ruby's Integer type uses arbitrary precision, similar to BigInt in Crystal (and potentially, the Hash implementation works differently).

ysbaddaden · 2025-01-14T10:05:12Z

More specifically: you can only address as many as Int32::MAX entries in a stdlib collection in Crystal (be it Slice, Array, Hash, ...).

You can allocate more than Int32::MAX bytes in a collection but the maximum contiguous memory will always be Int32::MAX * sizeof(T).

To overcome this limitation, you need specific types (or wrapper types).

straight-shoota · 2025-01-14T10:44:55Z

Hm, actually it seems we can enable this by performing the multiplication in indices_malloc_size in a larger type:

@@ -831,5 +831,5 @@ class Hash(K, V)
   # The actual number of bytes needed to allocate `@indices`.
   private def indices_malloc_size(size)
-    size * @indices_bytesize
+    size.to_u32 * @indices_bytesize
    end

This works because @indices is not a collection type. It's just a pointer and uses pointer arithmetics directly. The actual number of collection entries (536870912) is way below Int32::MAX.

ysbaddaden · 2025-01-14T10:48:44Z

Oh right. I guess we could even go for size.to_u64 since it's the byte allocation size 🤔

crysbot · 2025-01-14T18:29:46Z

This issue has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/goldbachs-conjecture-and-crystal-code/7579/18

fixes crystal-lang#15341 Co-authored-by: Johannes Müller <[email protected]>

jzakiya added the kind:bug A bug in the code. Does not apply to documentation, specs, etc. label Jan 13, 2025

straight-shoota added the status:needs-more-info label Jan 13, 2025

Blacksmoke16 added topic:stdlib:collection and removed status:needs-more-info labels Jan 13, 2025

straight-shoota added kind:feature and removed kind:bug A bug in the code. Does not apply to documentation, specs, etc. labels Jan 13, 2025

ysbaddaden added a commit to ysbaddaden/crystal that referenced this issue Jan 15, 2025

Fix: Hash @indices can grow larger than Int32::MAX bytes

998ba8a

fixes crystal-lang#15341 Co-authored-by: Johannes Müller <[email protected]>

ysbaddaden linked a pull request Jan 15, 2025 that will close this issue

Fix: Hash @indices can grow larger than Int32::MAX bytes #15347

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inexplicable arithmetic overflow ?? #15341

Inexplicable arithmetic overflow ?? #15341

jzakiya commented Jan 13, 2025 •

edited

Loading

straight-shoota commented Jan 13, 2025

Blacksmoke16 commented Jan 13, 2025

jzakiya commented Jan 13, 2025

straight-shoota commented Jan 13, 2025

jzakiya commented Jan 13, 2025

jzakiya commented Jan 14, 2025

straight-shoota commented Jan 14, 2025 •

edited

Loading

ysbaddaden commented Jan 14, 2025

straight-shoota commented Jan 14, 2025 •

edited

Loading

ysbaddaden commented Jan 14, 2025

crysbot commented Jan 14, 2025

Inexplicable arithmetic overflow ?? #15341

Inexplicable arithmetic overflow ?? #15341

Comments

jzakiya commented Jan 13, 2025 • edited Loading

straight-shoota commented Jan 13, 2025

Blacksmoke16 commented Jan 13, 2025

jzakiya commented Jan 13, 2025

straight-shoota commented Jan 13, 2025

jzakiya commented Jan 13, 2025

jzakiya commented Jan 14, 2025

straight-shoota commented Jan 14, 2025 • edited Loading

ysbaddaden commented Jan 14, 2025

straight-shoota commented Jan 14, 2025 • edited Loading

ysbaddaden commented Jan 14, 2025

crysbot commented Jan 14, 2025

jzakiya commented Jan 13, 2025 •

edited

Loading

straight-shoota commented Jan 14, 2025 •

edited

Loading

straight-shoota commented Jan 14, 2025 •

edited

Loading