Determine cause of hypothesis slow example generation in newer versions #32693

sshane · 2024-06-10T23:15:21Z

Related to #32536. This bounty is for determining why newer versions of hypothesis (6.103.1 vs our pinned 6.47) are so much slower at example generation for test_car_interfaces.py.

Bounty

Bounty is awarded for determining the exact cause and a solution for optimizing/fixing either the hypothesis library itself, or openpilot to improve the example generation times.

On AMD Ryzen Threadripper PRO 3955WX 16-Cores on hypothesis==6.47.0:

batman@workstation-shane:~/openpilot/selfdrive/car/tests$ pytest -n8 test_car_interfaces.py
/home/batman/.pyenv/versions/3.11.4/lib/python3.11/site-packages/pytest_benchmark/logger.py:46: PytestBenchmarkWarning: Benchmarks are automatically disabled because xdist plugin is active.Benchmarks cannot be performed reliably in a parallelized environment.
  warner(PytestBenchmarkWarning(text))
Test session starts (platform: linux, Python 3.11.4, pytest 8.2.1, pytest-sugar 1.0.0)
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
Using --randomly-seed=4244670443
rootdir: /home/batman/openpilot
configfile: pyproject.toml
plugins: timeout-2.3.1, xdist-3.6.1, cpp-2.5.0, cov-5.0.0, mock-3.14.0, forked-1.6.0, hypothesis-6.47.0, dash-2.11.1, benchmark-4.0.0, sugar-1.0.0, randomly-3.15.0, subtests-0.12.1, flaky-3.8.1, asyncio-0.23.7, anyio-4.4.0, nbmake-1.5.3, repeat-0.9.3
asyncio: mode=Mode.STRICT
8 workers [209 items]   collecting ... 

 selfdrive/car/tests/test_car_interfaces.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓ 56% █████▋    
                                            ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓                       100% ██████████
============================================================================ slowest 10 durations =============================================================================
1.16s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_028_FORD_F_150_LIGHTNING_MK1
1.08s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_030_FORD_MAVERICK_MK1
1.06s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_024_FORD_BRONCO_SPORT_MK1
1.03s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_025_FORD_ESCAPE_MK4
1.03s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_031_FORD_MUSTANG_MACH_E_MK1
1.02s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_029_FORD_F_150_MK14
0.99s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_026_FORD_EXPLORER_MK6
0.98s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_074_HYUNDAI_IONIQ_HEV_2022
0.96s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_027_FORD_FOCUS_MK4
0.95s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_151_SKODA_OCTAVIA_MK3

Results (22.66s):
     209 passed

and on latest hypothesis==6.103.1

batman@workstation-shane:~/openpilot/selfdrive/car/tests$ pytest -n8 test_car_interfaces.py
/home/batman/.pyenv/versions/3.11.4/lib/python3.11/site-packages/pytest_benchmark/logger.py:46: PytestBenchmarkWarning: Benchmarks are automatically disabled because xdist plugin is active.Benchmarks cannot be performed reliably in a parallelized environment.
  warner(PytestBenchmarkWarning(text))
Test session starts (platform: linux, Python 3.11.4, pytest 8.2.1, pytest-sugar 1.0.0)
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
Using --randomly-seed=264861645
rootdir: /home/batman/openpilot
configfile: pyproject.toml
plugins: timeout-2.3.1, xdist-3.6.1, cpp-2.5.0, cov-5.0.0, mock-3.14.0, forked-1.6.0, dash-2.11.1, benchmark-4.0.0, hypothesis-6.103.1, sugar-1.0.0, randomly-3.15.0, subtests-0.12.1, flaky-3.8.1, asyncio-0.23.7, anyio-4.4.0, nbmake-1.5.3, repeat-0.9.3
asyncio: mode=Mode.STRICT
8 workers [209 items]   collecting ... 

 selfdrive/car/tests/test_car_interfaces.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓ 56% █████▋    
                                            ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓                       100% ██████████
============================================================================ slowest 10 durations =============================================================================
1.70s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_029_FORD_F_150_MK14
1.65s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_027_FORD_FOCUS_MK4
1.62s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_025_FORD_ESCAPE_MK4
1.61s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_026_FORD_EXPLORER_MK6
1.56s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_028_FORD_F_150_LIGHTNING_MK1
1.56s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_108_KIA_NIRO_PHEV
1.51s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_030_FORD_MAVERICK_MK1
1.51s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_151_SKODA_OCTAVIA_MK3
1.50s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_150_SKODA_KODIAQ_MK1
1.49s call     selfdrive/car/tests/test_car_interfaces.py::TestCarInterfaces::test_car_interfaces_149_SKODA_KAROQ_MK1

Results (36.67s):
     209 passed

The text was updated successfully, but these errors were encountered:

sshane · 2024-06-10T23:16:09Z

@deanlee @BBBmau in case you guys are interested!

bongbui321 · 2024-07-29T23:14:44Z

Narrowed this down to a specific commit that slows down the example generation by 30% in my test (comment).

Figured out a few other slowdowns

  # first bad commit: 5de1fe84252051594fdc6879d4920c357a6d1368 - more likely to generate boundary values
  # from 3.5 - 3.8s:  e66c88d99d61c0eca0d8aed59543d35e462fef89
  #     - better after revert: 6e2f394a253761677cdcc0990a32df54a62f079a
  # from 4s - >5s: 1e76ce2e52e450d54470ed09b9c65fb1b598fb5c - trackIRTree in ConjectureData

Most of the slowdowns are for better example generation (higher chance of generating boundary values, better repro of failing tests, faster shrinker,...).To resolve this, probably need some restructuring in how hypothesis track generated cases.

sshane added good first issue Feasible for new contributers CI / testing bounty labels Jun 10, 2024

github-project-automation bot added this to openpilot bounties Jun 10, 2024

github-project-automation bot moved this to Open in openpilot bounties Jun 10, 2024

adeebshihadeh changed the title ~~[$100 bounty] Determine cause of hypothesis slow example generation in newer versions~~ Determine cause of hypothesis slow example generation in newer versions Jul 7, 2024

adeebshihadeh mentioned this issue Jul 17, 2024

test_models: add tx fuzzy test #32897

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Determine cause of hypothesis slow example generation in newer versions #32693

Determine cause of hypothesis slow example generation in newer versions #32693

sshane commented Jun 10, 2024

sshane commented Jun 10, 2024 •

edited

Loading

bongbui321 commented Jul 29, 2024 •

edited

Loading

Determine cause of hypothesis slow example generation in newer versions #32693

Determine cause of hypothesis slow example generation in newer versions #32693

Comments

sshane commented Jun 10, 2024

Bounty

sshane commented Jun 10, 2024 • edited Loading

bongbui321 commented Jul 29, 2024 • edited Loading

sshane commented Jun 10, 2024 •

edited

Loading

bongbui321 commented Jul 29, 2024 •

edited

Loading