Investigate (& report?) performances on JIT runtimes #207

Open
masklinn opened this issue Mar 27, 2024 · 1 comment
masklinn commented Oct 14, 2024

All benching done with samples/useragents.txt (75158 lines, 20322 unique).

NOTE: "legacy" has a clearing cache of size 200.
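
For context, a clearing cache simply empties itself once it is full, whereas an LRU evicts only the least recently used entry. Below is a minimal sketch of both, not the project's actual implementation: the class names, the `get(key, compute)` signature and the `compute` callback are hypothetical and only serve the illustration.

```python
from collections import OrderedDict


class ClearingCache:
    """Hypothetical sketch: drop every entry once the cache is full."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = {}

    def get(self, key, compute):
        if key in self.data:
            return self.data[key]
        if len(self.data) >= self.maxsize:
            # Full: throw everything away and start over.
            self.data.clear()
        value = self.data[key] = compute(key)
        return value


class LRUCache:
    """Hypothetical sketch: evict only the least recently used entry."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def get(self, key, compute):
        if key in self.data:
            # A hit makes the entry the most recently used one.
            self.data.move_to_end(key)
            return self.data[key]
        value = self.data[key] = compute(key)
        if len(self.data) > self.maxsize:
            # Over capacity: drop the oldest entry only.
            self.data.popitem(last=False)
        return value
```

With a working set larger than the cache, the clearing variant periodically loses every hot entry at once, while the LRU keeps the recently used ones at the cost of extra bookkeeping on each hit.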

| parser | cache (n=200) | cpython 3.12 | pypy 7.3.17 | graalpy 24.1.0 |
|--------|---------------|--------------|-------------|----------------|
| legacy | clearing | 29.00s (386us/line) | 156.56s (2083us/line) | 106.12s (1412us/line) |
| basic | | 33.53s (446us/line) | 221.14s (2942us/line) | 117.09s (1558us/line) |
| basic | lru | 29.19s (388us/line) | 220.40s (2933us/line) | 72.96s (971us/line) |
| basic | s3fifo | 24.18s (322us/line) | 146.93s (1955us/line) | 55.65s (740us/line) |
| basic | sieve | 24.37s (324us/line) | 127.61s (1698us/line) | 55.43s (737us/line) |
| regex | | 1.31s (17us/line) | 1.47s (20us/line) | 7.15s (95us/line) |

A few observations:

  • the regex engines of pypy and graal are dreary, but graal's showing is much better than expected [1]
  • pypy really doesn't like the new impl: on cpython the LRU is sufficient to catch up with legacy [2], and on Graal it gains a 30% edge, but on pypy the new impl is still 40% behind
  • pypy also much prefers sieve to s3fifo, while graal and cpython essentially don't care
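
The percentages above can be checked against the table with a few lines of arithmetic. This is only a sketch: the timings are copied from the table, while the dictionary layout and the `relative_change` helper are made up for the illustration.

```python
# Total run times in seconds, copied from the results table.
timings = {
    "cpython": {"legacy": 29.00, "lru": 29.19, "s3fifo": 24.18, "sieve": 24.37},
    "pypy": {"legacy": 156.56, "lru": 220.40, "s3fifo": 146.93, "sieve": 127.61},
    "graalpy": {"legacy": 106.12, "lru": 72.96, "s3fifo": 55.65, "sieve": 55.43},
}


def relative_change(new, old):
    """Percentage change of `new` relative to `old` (negative means faster)."""
    return (new - old) / old * 100


# basic + lru compared with legacy (clearing cache).
lru_vs_legacy = {
    runtime: relative_change(t["lru"], t["legacy"]) for runtime, t in timings.items()
}
# sieve compared with s3fifo.
sieve_vs_s3fifo = {
    runtime: relative_change(t["sieve"], t["s3fifo"]) for runtime, t in timings.items()
}

for runtime in timings:
    print(
        f"{runtime}: lru vs legacy {lru_vs_legacy[runtime]:+.1f}%, "
        f"sieve vs s3fifo {sieve_vs_s3fifo[runtime]:+.1f}%"
    )
```

On these numbers the LRU is within 1% of legacy on cpython, about 31% faster on graalpy and about 41% slower on pypy, and sieve is roughly 13% faster than s3fifo on pypy but within 1% on the other two.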

Footnotes

  1. Important caveat: Graal somehow manages to run with a lot of concurrency. I assume the GC is concurrent, but I don't understand what else it's doing. On my machine, running bench --bases legacy basic regex --cachesizes 200 --caches none lru s3fifo sieve, GraalPy times at 1175.69s user 8.45s system 283% cpu 6:57.58 total, while pypy times at 925.28s user 3.32s system 100% cpu 15:27.14 total. So graal uses nearly 30% more CPU in total, but it nearly fully loads 3 cores and ends up executing the suite a bit more than twice as fast. Some configurations go even higher, e.g. basic with no cache basically runs at 400%.

  2. LRU should be better than the clearing cache at 200 entries, but the layered approach of the new API likely adds some overhead, so coming out to a wash on cpython makes sense.
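
The CPU figures in footnote 1 can be rederived from the two `time` outputs. A rough sketch follows, with the numbers copied from the footnote; the `wall_seconds` helper and the dictionary layout are assumptions made for the example.

```python
def wall_seconds(text):
    """Convert a `time`-style 'M:SS.ss' wall-clock string to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


# user / system CPU seconds and wall-clock totals, copied from footnote 1.
runs = {
    "graalpy": {"user": 1175.69, "system": 8.45, "wall": wall_seconds("6:57.58")},
    "pypy": {"user": 925.28, "system": 3.32, "wall": wall_seconds("15:27.14")},
}

# Total CPU time consumed by each run.
cpu = {name: r["user"] + r["system"] for name, r in runs.items()}

# How much more CPU time graalpy burns than pypy.
cpu_ratio = cpu["graalpy"] / cpu["pypy"]
# How much sooner graalpy finishes in wall-clock terms.
speedup = runs["pypy"]["wall"] / runs["graalpy"]["wall"]
# Average core utilisation, the figure `time` reports as "% cpu".
utilisation = {name: cpu[name] / runs[name]["wall"] * 100 for name in runs}

print(f"cpu ratio {cpu_ratio:.2f}, wall-clock speedup {speedup:.2f}")
print({name: f"{pct:.0f}%" for name, pct in utilisation.items()})
```

This gives roughly 1.28x the CPU time for a 2.2x shorter wall-clock run, which matches the reported 283% and 100% utilisation.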
