Optimized fast_array_util.py through vectorization #2911
The optimized version vectorizes the cumulative trapezoidal integration over blocks, reducing nested loops and handling size normalization for each block efficiently, ensuring the output matches the expected shape.
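To make the description above concrete, here is a minimal NumPy sketch of a block-wise, normalized cumulative trapezoidal integration that is vectorized across columns. It is not the code from this PR: the function name `cumulative_trapezoid_by_blocks`, the argument layout (`f` as a 2-D array with one column per quantity, `x` as the shared 1-D grid, `block_references` as the block boundaries), and the choice to leave the first row of each block at zero are all assumptions made for illustration.

```python
import numpy as np


def cumulative_trapezoid_by_blocks(f, x, block_references):
    """Hypothetical helper: normalized cumulative trapezoid integral per block.

    f                : 2-D array, shape (n_points, n_columns)   (assumed layout)
    x                : 1-D array, shape (n_points,)
    block_references : 1-D integer array of block boundaries, e.g. [0, 4, 9]
    """
    f = np.asarray(f, dtype=float)
    x = np.asarray(x, dtype=float)
    integrated = np.zeros_like(f)  # output has the same shape as f

    # One loop over blocks remains; all columns are handled at once.
    for start, stop in zip(block_references[:-1], block_references[1:]):
        if stop - start < 2:
            continue  # a block with fewer than two points has no interval
        dx = np.diff(x[start:stop])[:, None]  # shape (n - 1, 1), broadcasts over columns
        areas = dx * (f[start + 1:stop] + f[start:stop - 1]) / 2.0
        cumulative = np.cumsum(areas, axis=0)
        # Normalize each column by its block total (assumed to be nonzero).
        integrated[start + 1:stop] = cumulative / cumulative[-1]
    return integrated


# Small usage example on a constant integrand.
f = np.ones((4, 2))
x = np.arange(4.0)
result = cumulative_trapezoid_by_blocks(f, x, np.array([0, 4]))
print(result[:, 0])  # values are 0, 1/3, 2/3, 1
```

Broadcasting `dx` against every column removes the inner per-column loop, while the slice assignment keeps the output the same shape as the input. Whether this is faster than a Numba-compiled nested loop depends on the array sizes, so it should be benchmarked rather than assumed.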
*beep* *bop* Hi, human. I'm the @tardis-bot and couldn't find your records in my database. I think we don't know each other, or you changed your credentials recently. Please add your name and email to In case you need to map an existing alias, follow this example.
*beep* *bop*
5 E999 [ ] SyntaxError: Expected ',', found ':'
1 F401 [*] `numba.prange` imported but unused

Complete output (might be large):
.mailmap:1:38: E999 SyntaxError: Expected an expression
CHANGELOG.md:4:15: E999 SyntaxError: Expected ',', found ':'
CITATION.cff:3:1: E999 SyntaxError: Invalid annotated assignment target
README.rst:1:1: E999 SyntaxError: Expected a statement
docs/resources/credits.rst:1:1: E999 SyntaxError: Expected a statement
tardis/plasma/properties/continuum_processes/fast_array_util.py:4:25: F401 [*] `numba.prange` imported but unused
Found 6 errors.
[*] 1 fixable with the `--fix` option.
added my details
*beep* *bop*

Significantly changed benchmarks:

All benchmarks:

Benchmarks that have stayed the same:

| Change | Before [d73192c5] <master> | After [a2e3967f] | Ratio | Benchmark (Parameter) |
|----------|------------------------------|---------------------|---------|-------------------------------------------------------------------------------------------------------------------------------------|
| | 6.13±0.8μs | 6.85±0.7μs | ~1.12 | transport_montecarlo_vpacket.BenchmarkMontecarloMontecarloNumbaVpacket.time_trace_vpacket |
| | 1.97±0.01ms | 1.75±0.02ms | ~0.89 | transport_montecarlo_main_loop.BenchmarkTransportMontecarloMontecarloMainLoop.time_montecarlo_main_loop |
| | 55.8±30μs | 46.3±30μs | ~0.83 | transport_montecarlo_interaction.BenchmarkTransportMontecarloInteraction.time_line_scatter |
| | 762±400ns | 531±100ns | ~0.70 | opacities_opacity.BenchmarkMontecarloMontecarloNumbaOpacities.time_compton_opacity_calculation |
| | 38.9±10μs | 24.8±7μs | ~0.64 | transport_montecarlo_packet_trackers.BenchmarkTransportMontecarloPacketTrackers.time_generate_rpacket_last_interaction_tracker_list |
| | 571±200ns | 622±100ns | 1.09 | opacities_opacity.BenchmarkMontecarloMontecarloNumbaOpacities.time_photoabsorption_opacity_calculation |
| | 7.27±2μs | 7.96±2μs | 1.09 | transport_montecarlo_vpacket.BenchmarkMontecarloMontecarloNumbaVpacket.time_trace_vpacket_volley |
| | 3.02±0.01ms | 3.18±0ms | 1.05 | opacities_opacity_state.BenchmarkOpacitiesOpacityState.time_opacity_state_initialize('scatter') |
| | 38.9±0.2s | 39.2±0.04s | 1.01 | run_tardis.BenchmarkRunTardis.time_run_tardis |
| | 39.0±0.02μs | 39.3±0.01μs | 1.01 | transport_montecarlo_packet_trackers.BenchmarkTransportMontecarloPacketTrackers.time_generate_rpacket_tracker_list |
| | 1.04±0m | 1.04±0m | 1.00 | run_tardis.BenchmarkRunTardis.time_run_tardis_rpacket_tracking |
| | 2.09±0m | 2.09±0m | 1.00 | spectrum_formal_integral.BenchmarkTransportMontecarloFormalIntegral.time_FormalIntegrator_functions |
| | 1.18±0μs | 1.16±0μs | 0.99 | transport_geometry_calculate_distances.BenchmarkTransportGeometryCalculateDistances.time_calculate_distance_boundary |
| | 2.41±2μs | 2.38±2μs | 0.99 | transport_montecarlo_estimators_radfield_estimator_calcs.BenchmarkMontecarloMontecarloNumbaPacket.time_update_line_estimators |
| | 51.4±20μs | 51.0±30μs | 0.99 | transport_montecarlo_interaction.BenchmarkTransportMontecarloInteraction.time_line_emission |
| | 2.72±0.4ms | 2.68±0.4ms | 0.99 | transport_montecarlo_single_packet_loop.BenchmarkTransportMontecarloSinglePacketLoop.time_single_packet_loop |
| | 3.17±0.6μs | 3.14±0.7μs | 0.99 | transport_montecarlo_vpacket.BenchmarkMontecarloMontecarloNumbaVpacket.time_trace_bad_vpacket |
| | 738±40ns | 725±2ns | 0.98 | transport_montecarlo_interaction.BenchmarkTransportMontecarloInteraction.time_thomson_scatter |
| | 4.06±0.02ms | 3.92±0.03ms | 0.97 | opacities_opacity_state.BenchmarkOpacitiesOpacityState.time_opacity_state_initialize('macroatom') |
| | 212±0.02ns | 206±0.1ns | 0.97 | spectrum_formal_integral.BenchmarkTransportMontecarloFormalIntegral.time_intensity_black_body |
| | 3.51±0.3μs | 3.37±0.4μs | 0.96 | transport_montecarlo_vpacket.BenchmarkMontecarloMontecarloNumbaVpacket.time_trace_vpacket_within_shell |
| | 571±100ns | 541±100ns | 0.95 | opacities_opacity.BenchmarkMontecarloMontecarloNumbaOpacities.time_pair_creation_opacity_calculation |
| | 1.62±0.3μs | 1.52±0.4μs | 0.94 | transport_geometry_calculate_distances.BenchmarkTransportGeometryCalculateDistances.time_calculate_distance_line |
| | 70.2±3ms | 64.5±0.09ms | 0.92 | transport_montecarlo_packet_trackers.BenchmarkTransportMontecarloPacketTrackers.time_rpacket_trackers_to_dataframe |
If you want to see the graph of the results, you can check it here
fixed the typos flagged by codespell
Thanks for your contribution. We will consider this solution as an option. For now, we will close this PR.
@DarkMatterCompiler could you please add your notebook to the original issue? That is quite useful!
closes issue #2910
🚦 Testing
How did you test these changes?
This notebook contains the tests I performed:
https://colab.research.google.com/drive/1HJ_4_YrRZ57ikMqC_oOsGq2dnn5LfuC_?usp=sharing