Optimized fast_array_util.py through vectorization #2911

Closed

DarkMatterCompiler

This PR vectorizes the cumulative trapezoidal integration over blocks in fast_array_util.py. It reduces the nested loops, handles the normalization of each block, and keeps the output in the expected shape.
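
To illustrate the idea, here is a minimal NumPy sketch of one way to drop the per-column loop from a block-wise cumulative trapezoidal integration. It is not the code from this PR or from TARDIS: the function names `integrate_blocks_loop` and `integrate_blocks_vectorized` are hypothetical, and the sketch assumes `f` is a 2-D array of shape `(n_points, n_columns)`, `x` is a 1-D grid of length `n_points`, `block_references` holds the block boundaries, and each block is normalized by its final cumulative value.

```python
import numpy as np


def integrate_blocks_loop(f, x, block_references):
    """Reference version: explicit loops over columns and blocks."""
    integrated = np.zeros_like(f, dtype=float)
    for col in range(f.shape[1]):
        for j in range(len(block_references) - 1):
            start, stop = block_references[j], block_references[j + 1]
            if stop - start < 2:
                continue  # a single point has no interval to integrate
            dx = np.diff(x[start:stop])
            mean_f = (f[start + 1 : stop, col] + f[start : stop - 1, col]) / 2.0
            integ = np.cumsum(dx * mean_f)
            # Assumed normalization: each block's cumulative integral ends at 1.
            integrated[start + 1 : stop, col] = integ / integ[-1]
    return integrated


def integrate_blocks_vectorized(f, x, block_references):
    """Same result, with the column loop replaced by broadcasting."""
    integrated = np.zeros_like(f, dtype=float)
    for j in range(len(block_references) - 1):
        start, stop = block_references[j], block_references[j + 1]
        if stop - start < 2:
            continue
        dx = np.diff(x[start:stop])[:, np.newaxis]  # shape (n - 1, 1)
        mean_f = (f[start + 1 : stop] + f[start : stop - 1]) / 2.0
        integ = np.cumsum(dx * mean_f, axis=0)  # all columns at once
        integrated[start + 1 : stop] = integ / integ[-1]
    return integrated


# Small made-up input: 7 grid points, 3 columns, two blocks (rows 0-2 and 3-6).
rng = np.random.default_rng(0)
x = np.linspace(1.0, 2.0, 7)
f = rng.random((7, 3)) + 0.1  # positive values avoid a zero normalization
block_references = np.array([0, 3, 7])

loop_result = integrate_blocks_loop(f, x, block_references)
vectorized_result = integrate_blocks_vectorized(f, x, block_references)
```

Comparing `loop_result` and `vectorized_result` with `np.allclose` is a quick way to check that the two versions agree. The sketch still loops over blocks, because blocks can differ in length; only the inner per-column work is expressed as a single array operation.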

Closes issue #2910

🚦 Testing

How did you test these changes?
This notebook contains the tests I performed:
https://colab.research.google.com/drive/1HJ_4_YrRZ57ikMqC_oOsGq2dnn5LfuC_?usp=sharing

@tardis-bot
Contributor

*beep* *bop*

Hi, human.

I'm the @tardis-bot and couldn't find your records in my database. I think we don't know each other, or you changed your credentials recently.

Please add your name and email to .mailmap in your current branch and push the changes to this pull request.

In case you need to map an existing alias, follow this example.

@tardis-bot
Contributor

tardis-bot commented Dec 22, 2024

*beep* *bop*
Hi human,
I ran ruff on the latest commit (a2e3967).
Here are the outputs produced.
Results can also be downloaded as artifacts here.
Summarised output:

5	E999	[ ] SyntaxError: Expected ',', found ':'
1	F401	[*] `numba.prange` imported but unused

Complete output (might be large):

.mailmap:1:38: E999 SyntaxError: Expected an expression
CHANGELOG.md:4:15: E999 SyntaxError: Expected ',', found ':'
CITATION.cff:3:1: E999 SyntaxError: Invalid annotated assignment target
README.rst:1:1: E999 SyntaxError: Expected a statement
docs/resources/credits.rst:1:1: E999 SyntaxError: Expected a statement
tardis/plasma/properties/continuum_processes/fast_array_util.py:4:25: F401 [*] `numba.prange` imported but unused
Found 6 errors.
[*] 1 fixable with the `--fix` option.

added my details
@DarkMatterCompiler changed the title from "Optimized fast_array_util.py through vectorization" to "Optimized fast_array_util.py through vectorization and added details to mailmap" on Dec 22, 2024
@DarkMatterCompiler changed the title from "Optimized fast_array_util.py through vectorization and added details to mailmap" to "Optimized fast_array_util.py through vectorization" on Dec 22, 2024
@tardis-bot
Contributor

tardis-bot commented Dec 22, 2024

*beep* *bop*
Hi human,
I ran benchmarks as you asked comparing master (87e4ae1) and the latest commit (a2e3967).
Here are the logs produced by ASV.
Results can also be downloaded as artifacts here.

Significantly changed benchmarks:

All benchmarks:

Benchmarks that have stayed the same:

| Change   | Before [d73192c5] <master>   | After [a2e3967f]    | Ratio   | Benchmark (Parameter)                                                                                                               |
|----------|------------------------------|---------------------|---------|-------------------------------------------------------------------------------------------------------------------------------------|
|          | 6.13±0.8μs                   | 6.85±0.7μs          | ~1.12   | transport_montecarlo_vpacket.BenchmarkMontecarloMontecarloNumbaVpacket.time_trace_vpacket                                           |
|          | 1.97±0.01ms                  | 1.75±0.02ms         | ~0.89   | transport_montecarlo_main_loop.BenchmarkTransportMontecarloMontecarloMainLoop.time_montecarlo_main_loop                             |
|          | 55.8±30μs                    | 46.3±30μs           | ~0.83   | transport_montecarlo_interaction.BenchmarkTransportMontecarloInteraction.time_line_scatter                                          |
|          | 762±400ns                    | 531±100ns           | ~0.70   | opacities_opacity.BenchmarkMontecarloMontecarloNumbaOpacities.time_compton_opacity_calculation                                      |
|          | 38.9±10μs                    | 24.8±7μs            | ~0.64   | transport_montecarlo_packet_trackers.BenchmarkTransportMontecarloPacketTrackers.time_generate_rpacket_last_interaction_tracker_list |
|          | 571±200ns                    | 622±100ns           | 1.09    | opacities_opacity.BenchmarkMontecarloMontecarloNumbaOpacities.time_photoabsorption_opacity_calculation                              |
|          | 7.27±2μs                     | 7.96±2μs            | 1.09    | transport_montecarlo_vpacket.BenchmarkMontecarloMontecarloNumbaVpacket.time_trace_vpacket_volley                                    |
|          | 3.02±0.01ms                  | 3.18±0ms            | 1.05    | opacities_opacity_state.BenchmarkOpacitiesOpacityState.time_opacity_state_initialize('scatter')                                     |
|          | 38.9±0.2s                    | 39.2±0.04s          | 1.01    | run_tardis.BenchmarkRunTardis.time_run_tardis                                                                                       |
|          | 39.0±0.02μs                  | 39.3±0.01μs         | 1.01    | transport_montecarlo_packet_trackers.BenchmarkTransportMontecarloPacketTrackers.time_generate_rpacket_tracker_list                  |
|          | 1.04±0m                      | 1.04±0m             | 1.00    | run_tardis.BenchmarkRunTardis.time_run_tardis_rpacket_tracking                                                                      |
|          | 2.09±0m                      | 2.09±0m             | 1.00    | spectrum_formal_integral.BenchmarkTransportMontecarloFormalIntegral.time_FormalIntegrator_functions                                 |
|          | 1.18±0μs                     | 1.16±0μs            | 0.99    | transport_geometry_calculate_distances.BenchmarkTransportGeometryCalculateDistances.time_calculate_distance_boundary                |
|          | 2.41±2μs                     | 2.38±2μs            | 0.99    | transport_montecarlo_estimators_radfield_estimator_calcs.BenchmarkMontecarloMontecarloNumbaPacket.time_update_line_estimators       |
|          | 51.4±20μs                    | 51.0±30μs           | 0.99    | transport_montecarlo_interaction.BenchmarkTransportMontecarloInteraction.time_line_emission                                         |
|          | 2.72±0.4ms                   | 2.68±0.4ms          | 0.99    | transport_montecarlo_single_packet_loop.BenchmarkTransportMontecarloSinglePacketLoop.time_single_packet_loop                        |
|          | 3.17±0.6μs                   | 3.14±0.7μs          | 0.99    | transport_montecarlo_vpacket.BenchmarkMontecarloMontecarloNumbaVpacket.time_trace_bad_vpacket                                       |
|          | 738±40ns                     | 725±2ns             | 0.98    | transport_montecarlo_interaction.BenchmarkTransportMontecarloInteraction.time_thomson_scatter                                       |
|          | 4.06±0.02ms                  | 3.92±0.03ms         | 0.97    | opacities_opacity_state.BenchmarkOpacitiesOpacityState.time_opacity_state_initialize('macroatom')                                   |
|          | 212±0.02ns                   | 206±0.1ns           | 0.97    | spectrum_formal_integral.BenchmarkTransportMontecarloFormalIntegral.time_intensity_black_body                                       |
|          | 3.51±0.3μs                   | 3.37±0.4μs          | 0.96    | transport_montecarlo_vpacket.BenchmarkMontecarloMontecarloNumbaVpacket.time_trace_vpacket_within_shell                              |
|          | 571±100ns                    | 541±100ns           | 0.95    | opacities_opacity.BenchmarkMontecarloMontecarloNumbaOpacities.time_pair_creation_opacity_calculation                                |
|          | 1.62±0.3μs                   | 1.52±0.4μs          | 0.94    | transport_geometry_calculate_distances.BenchmarkTransportGeometryCalculateDistances.time_calculate_distance_line                    |
|          | 70.2±3ms                     | 64.5±0.09ms         | 0.92    | transport_montecarlo_packet_trackers.BenchmarkTransportMontecarloPacketTrackers.time_rpacket_trackers_to_dataframe                  |

If you want to see the graph of the results, you can check it here

@DarkMatterCompiler
Author

DarkMatterCompiler commented Dec 22, 2024

Optimized fast_array_util.py
the optimized version vectorizes the cumulative trapezoidal integration over blocks #2911 #2757

fixed the typos flagged by codespell
@andrewfullard
Contributor

Thanks for your contribution. We will consider this solution as an option, but for now we will close this PR.

@wkerzendorf
Member

@DarkMatterCompiler could you please add your notebook to the original issue? That is quite useful!
