[Refactor] composite_lp_aggregate to handle log-probs aggregates globally #1181

Open: wants to merge 6 commits into base: main


@vmoens (Contributor) commented Jan 13, 2025

I propose a global flag, composite_lp_aggregate, to control how log-probs are aggregated.

So far, we have dealt with this using kwargs everywhere (in ProbabilisticTDModule, ProbabilisticTDSequential, CompositeDistribution and subclasses). Given the hierarchy of these classes, deciding what to do when arguments conflict isn't easy. It's also confusing for users, who I suspect will usually want to work with either collapsed or non-collapsed log-probs, not a mix of both.

A global flag (set to True for now and False in the future) will make things easier to handle.
Overall, the v0.6.2 behaviour will not change. Users who rely on it will be informed about the upcoming change through a warning that tells them to set the global flag to False to accommodate it. If they set it to True, nothing will change for them, but that also means bugs in that code path will not be fixed (we won't maintain the True behaviour).

When composite_lp_aggregate() == True, we'll have aggregate_probabilities=True, include_sum=True and inplace=True by default everywhere. When composite_lp_aggregate() == False, all of these will be set to False, meaning that any call to whatever.log_prob(tensordict) will return another tensordict containing the leaf log-probs.
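
To make the intended resolution order concrete, here is a minimal, library-free sketch. It is not the tensordict implementation: `set_lp_aggregate`, the module-level `_LP_AGGREGATE` variable and the standalone `log_prob` function are hypothetical stand-ins, a plain `dict` plays the role of a tensordict, and floats play the role of leaf log-prob tensors. It only illustrates the idea that an explicit kwarg wins, the global flag supplies the default, and an unset flag falls back to True with a warning.

```python
import warnings
from typing import Dict, Optional, Union

# Hypothetical module-level state; None means the user has not chosen yet.
_LP_AGGREGATE: Optional[bool] = None


def set_lp_aggregate(mode: Optional[bool]) -> None:
    """Hypothetical setter for the global flag (None restores the unset state)."""
    global _LP_AGGREGATE
    _LP_AGGREGATE = mode


def composite_lp_aggregate() -> bool:
    """Return the global flag; warn and fall back to True when it is unset."""
    if _LP_AGGREGATE is None:
        warnings.warn(
            "The log-prob aggregation flag is unset: defaulting to True for now. "
            "Set it to False to adopt the upcoming behaviour.",
            FutureWarning,
            stacklevel=2,
        )
        return True
    return _LP_AGGREGATE


def log_prob(
    leaf_log_probs: Dict[str, float],
    aggregate: Optional[bool] = None,
) -> Union[float, Dict[str, float]]:
    """Toy log_prob: an explicit kwarg wins, otherwise the global flag decides."""
    if aggregate is None:
        aggregate = composite_lp_aggregate()
    if aggregate:
        # Collapsed behaviour: a single summed log-prob.
        return sum(leaf_log_probs.values())
    # Non-collapsed behaviour: a new container holding the leaf log-probs.
    return dict(leaf_log_probs)


leaves = {"action_a": -1.0, "action_b": -2.5}

set_lp_aggregate(True)
print(log_prob(leaves))   # summed log-prob

set_lp_aggregate(False)
print(log_prob(leaves))   # per-leaf log-probs
```

With a single global default, the per-class kwargs only matter when a caller deliberately overrides them, which is what removes the conflict-resolution problem described above.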

@facebook-github-bot added the CLA Signed label Jan 13, 2025
@vmoens added the Refactor and Deprecation labels Jan 13, 2025

github-actions bot commented Jan 14, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}27$.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 49.8230μs 21.2783μs 46.9963 KOps/s 51.9469 KOps/s $\textbf{\color{#d91a1a}-9.53\%}$
test_plain_set_stack_nested 55.7540μs 21.5773μs 46.3451 KOps/s 52.2627 KOps/s $\textbf{\color{#d91a1a}-11.32\%}$
test_plain_set_nested_inplace 59.9320μs 22.8812μs 43.7040 KOps/s 47.3951 KOps/s $\textbf{\color{#d91a1a}-7.79\%}$
test_plain_set_stack_nested_inplace 72.8360μs 22.9579μs 43.5579 KOps/s 47.8776 KOps/s $\textbf{\color{#d91a1a}-9.02\%}$
test_items 27.1710μs 4.2007μs 238.0575 KOps/s 237.3068 KOps/s $\color{#35bf28}+0.32\%$
test_items_nested 0.7366ms 0.4132ms 2.4202 KOps/s 2.5275 KOps/s $\color{#d91a1a}-4.25\%$
test_items_nested_locked 0.7156ms 0.4093ms 2.4431 KOps/s 2.4716 KOps/s $\color{#d91a1a}-1.15\%$
test_items_nested_leaf 0.1516ms 78.7568μs 12.6973 KOps/s 12.8632 KOps/s $\color{#d91a1a}-1.29\%$
test_items_stack_nested 0.5882ms 0.4100ms 2.4391 KOps/s 2.5033 KOps/s $\color{#d91a1a}-2.57\%$
test_items_stack_nested_leaf 0.1521ms 78.8587μs 12.6809 KOps/s 12.7487 KOps/s $\color{#d91a1a}-0.53\%$
test_items_stack_nested_locked 0.6345ms 0.4116ms 2.4294 KOps/s 2.4874 KOps/s $\color{#d91a1a}-2.33\%$
test_keys 22.6020μs 3.4875μs 286.7374 KOps/s 288.6082 KOps/s $\color{#d91a1a}-0.65\%$
test_keys_nested 0.3129ms 0.1625ms 6.1550 KOps/s 6.0063 KOps/s $\color{#35bf28}+2.48\%$
test_keys_nested_locked 0.6861ms 0.1682ms 5.9470 KOps/s 5.8588 KOps/s $\color{#35bf28}+1.51\%$
test_keys_nested_leaf 0.2305ms 0.1423ms 7.0288 KOps/s 6.9533 KOps/s $\color{#35bf28}+1.09\%$
test_keys_stack_nested 0.2879ms 0.1634ms 6.1214 KOps/s 6.0092 KOps/s $\color{#35bf28}+1.87\%$
test_keys_stack_nested_leaf 0.2216ms 0.1418ms 7.0526 KOps/s 6.9686 KOps/s $\color{#35bf28}+1.21\%$
test_keys_stack_nested_locked 0.3064ms 0.1681ms 5.9490 KOps/s 5.8710 KOps/s $\color{#35bf28}+1.33\%$
test_values 5.5084μs 1.0291μs 971.7639 KOps/s 965.5300 KOps/s $\color{#35bf28}+0.65\%$
test_values_nested 0.1142ms 61.5524μs 16.2463 KOps/s 15.9085 KOps/s $\color{#35bf28}+2.12\%$
test_values_nested_locked 0.1125ms 61.5663μs 16.2426 KOps/s 15.9646 KOps/s $\color{#35bf28}+1.74\%$
test_values_nested_leaf 0.1242ms 70.3801μs 14.2086 KOps/s 12.8642 KOps/s $\textbf{\color{#35bf28}+10.45\%}$
test_values_stack_nested 0.1481ms 61.2655μs 16.3224 KOps/s 15.6874 KOps/s $\color{#35bf28}+4.05\%$
test_values_stack_nested_leaf 0.1023ms 69.9535μs 14.2952 KOps/s 14.0240 KOps/s $\color{#35bf28}+1.93\%$
test_values_stack_nested_locked 0.1317ms 61.4307μs 16.2785 KOps/s 15.8429 KOps/s $\color{#35bf28}+2.75\%$
test_membership 3.4636μs 0.7184μs 1.3920 MOps/s 1.3906 MOps/s $\color{#35bf28}+0.10\%$
test_membership_nested 23.6140μs 2.9320μs 341.0641 KOps/s 342.5849 KOps/s $\color{#d91a1a}-0.44\%$
test_membership_nested_leaf 21.5500μs 2.9650μs 337.2717 KOps/s 339.7894 KOps/s $\color{#d91a1a}-0.74\%$
test_membership_stacked_nested 21.1900μs 2.9376μs 340.4130 KOps/s 345.8762 KOps/s $\color{#d91a1a}-1.58\%$
test_membership_stacked_nested_leaf 29.1240μs 2.9708μs 336.6122 KOps/s 344.7126 KOps/s $\color{#d91a1a}-2.35\%$
test_membership_nested_last 23.7850μs 4.4172μs 226.3861 KOps/s 229.2153 KOps/s $\color{#d91a1a}-1.23\%$
test_membership_nested_leaf_last 27.5810μs 4.4555μs 224.4439 KOps/s 227.9879 KOps/s $\color{#d91a1a}-1.55\%$
test_membership_stacked_nested_last 27.7320μs 4.4207μs 226.2066 KOps/s 228.3774 KOps/s $\color{#d91a1a}-0.95\%$
test_membership_stacked_nested_leaf_last 33.0810μs 4.4126μs 226.6244 KOps/s 227.2498 KOps/s $\color{#d91a1a}-0.28\%$
test_nested_getleaf 0.1096ms 10.7878μs 92.6970 KOps/s 94.1166 KOps/s $\color{#d91a1a}-1.51\%$
test_nested_get 32.1900μs 10.1279μs 98.7368 KOps/s 99.0653 KOps/s $\color{#d91a1a}-0.33\%$
test_stacked_getleaf 37.0790μs 10.5626μs 94.6740 KOps/s 94.2444 KOps/s $\color{#35bf28}+0.46\%$
test_stacked_get 33.0520μs 10.0679μs 99.3254 KOps/s 99.4643 KOps/s $\color{#d91a1a}-0.14\%$
test_nested_getitemleaf 38.2720μs 11.1284μs 89.8604 KOps/s 89.7692 KOps/s $\color{#35bf28}+0.10\%$
test_nested_getitem 35.7070μs 10.5291μs 94.9744 KOps/s 94.8203 KOps/s $\color{#35bf28}+0.16\%$
test_stacked_getitemleaf 97.0810μs 11.0149μs 90.7865 KOps/s 89.9089 KOps/s $\color{#35bf28}+0.98\%$
test_stacked_getitem 34.1230μs 10.4366μs 95.8165 KOps/s 94.6037 KOps/s $\color{#35bf28}+1.28\%$
test_lock_nested 8.0101ms 0.4624ms 2.1626 KOps/s 1.7969 KOps/s $\textbf{\color{#35bf28}+20.35\%}$
test_lock_stack_nested 0.7875ms 0.4316ms 2.3170 KOps/s 2.3658 KOps/s $\color{#d91a1a}-2.06\%$
test_unlock_nested 0.9373ms 0.3775ms 2.6493 KOps/s 2.6828 KOps/s $\color{#d91a1a}-1.25\%$
test_unlock_stack_nested 0.5989ms 0.3484ms 2.8702 KOps/s 2.9258 KOps/s $\color{#d91a1a}-1.90\%$
test_flatten_speed 0.4773ms 0.1038ms 9.6312 KOps/s 10.1345 KOps/s $\color{#d91a1a}-4.97\%$
test_unflatten_speed 0.6273ms 0.5329ms 1.8764 KOps/s 1.9348 KOps/s $\color{#d91a1a}-3.02\%$
test_common_ops 1.9100ms 0.8077ms 1.2381 KOps/s 1.3264 KOps/s $\textbf{\color{#d91a1a}-6.66\%}$
test_creation 44.6940μs 2.5720μs 388.8034 KOps/s 404.7572 KOps/s $\color{#d91a1a}-3.94\%$
test_creation_empty 36.8180μs 13.4418μs 74.3948 KOps/s 105.3372 KOps/s $\textbf{\color{#d91a1a}-29.37\%}$
test_creation_nested_1 49.2720μs 16.5613μs 60.3816 KOps/s 81.7070 KOps/s $\textbf{\color{#d91a1a}-26.10\%}$
test_creation_nested_2 74.5690μs 20.8806μs 47.8913 KOps/s 58.5738 KOps/s $\textbf{\color{#d91a1a}-18.24\%}$
test_clone 54.9830μs 13.8258μs 72.3288 KOps/s 73.1026 KOps/s $\color{#d91a1a}-1.06\%$
test_getitem[int] 1.2290ms 13.1004μs 76.3338 KOps/s 77.3476 KOps/s $\color{#d91a1a}-1.31\%$
test_getitem[slice_int] 0.1392ms 25.0700μs 39.8882 KOps/s 39.9212 KOps/s $\color{#d91a1a}-0.08\%$
test_getitem[range] 0.1657ms 48.2914μs 20.7076 KOps/s 19.8904 KOps/s $\color{#35bf28}+4.11\%$
test_getitem[tuple] 0.1325ms 20.6038μs 48.5347 KOps/s 49.0389 KOps/s $\color{#d91a1a}-1.03\%$
test_getitem[list] 0.1693ms 44.0739μs 22.6892 KOps/s 21.9134 KOps/s $\color{#35bf28}+3.54\%$
test_setitem_dim[int] 56.2450μs 25.5227μs 39.1808 KOps/s 37.6849 KOps/s $\color{#35bf28}+3.97\%$
test_setitem_dim[slice_int] 92.1010μs 52.1629μs 19.1707 KOps/s 19.0070 KOps/s $\color{#35bf28}+0.86\%$
test_setitem_dim[range] 0.1159ms 72.7516μs 13.7454 KOps/s 13.3502 KOps/s $\color{#35bf28}+2.96\%$
test_setitem_dim[tuple] 0.1134ms 40.9603μs 24.4139 KOps/s 23.8646 KOps/s $\color{#35bf28}+2.30\%$
test_setitem 61.7450μs 21.9697μs 45.5173 KOps/s 51.1170 KOps/s $\textbf{\color{#d91a1a}-10.95\%}$
test_set 65.1910μs 21.3380μs 46.8648 KOps/s 52.6455 KOps/s $\textbf{\color{#d91a1a}-10.98\%}$
test_set_shared 4.2512ms 0.1711ms 5.8444 KOps/s 5.8271 KOps/s $\color{#35bf28}+0.30\%$
test_update 0.3706ms 24.4082μs 40.9698 KOps/s 48.8780 KOps/s $\textbf{\color{#d91a1a}-16.18\%}$
test_update_nested 80.8500μs 35.1263μs 28.4687 KOps/s 33.0236 KOps/s $\textbf{\color{#d91a1a}-13.79\%}$
test_update__nested 1.0082ms 34.2226μs 29.2204 KOps/s 29.7034 KOps/s $\color{#d91a1a}-1.63\%$
test_set_nested 78.7870μs 23.1802μs 43.1402 KOps/s 48.0020 KOps/s $\textbf{\color{#d91a1a}-10.13\%}$
test_set_nested_new 85.4990μs 27.9400μs 35.7910 KOps/s 39.1881 KOps/s $\textbf{\color{#d91a1a}-8.67\%}$
test_select 0.1176ms 44.5983μs 22.4224 KOps/s 24.4016 KOps/s $\textbf{\color{#d91a1a}-8.11\%}$
test_select_nested 0.1224ms 64.6981μs 15.4564 KOps/s 16.0853 KOps/s $\color{#d91a1a}-3.91\%$
test_exclude_nested 0.1605ms 84.0326μs 11.9001 KOps/s 12.4220 KOps/s $\color{#d91a1a}-4.20\%$
test_empty[True] 0.5622ms 0.4087ms 2.4467 KOps/s 2.4764 KOps/s $\color{#d91a1a}-1.20\%$
test_empty[False] 34.3040μs 1.5038μs 664.9603 KOps/s 726.7559 KOps/s $\textbf{\color{#d91a1a}-8.50\%}$
test_unbind_speed 0.4290ms 0.2755ms 3.6295 KOps/s 3.7090 KOps/s $\color{#d91a1a}-2.14\%$
test_unbind_speed_stack0 0.4082ms 0.2716ms 3.6821 KOps/s 3.7823 KOps/s $\color{#d91a1a}-2.65\%$
test_unbind_speed_stack1 0.1129s 0.8294ms 1.2057 KOps/s 1.3565 KOps/s $\textbf{\color{#d91a1a}-11.12\%}$
test_split 0.1044s 1.7957ms 556.9003 Ops/s 562.5401 Ops/s $\color{#d91a1a}-1.00\%$
test_chunk 3.0621ms 1.6285ms 614.0454 Ops/s 554.7686 Ops/s $\textbf{\color{#35bf28}+10.68\%}$
test_consolidate_njt[False-None] 8.3733ms 8.0165ms 124.7428 Ops/s 118.6417 Ops/s $\textbf{\color{#35bf28}+5.14\%}$
test_creation[device0] 4.2875ms 92.7868μs 10.7774 KOps/s 10.6282 KOps/s $\color{#35bf28}+1.40\%$
test_creation_from_tensor 0.2721ms 95.8106μs 10.4373 KOps/s 10.3061 KOps/s $\color{#35bf28}+1.27\%$
test_add_one[memmap_tensor0] 0.1440ms 5.0565μs 197.7633 KOps/s 201.7513 KOps/s $\color{#d91a1a}-1.98\%$
test_contiguous[memmap_tensor0] 20.3980μs 0.5183μs 1.9295 MOps/s 1.9465 MOps/s $\color{#d91a1a}-0.87\%$
test_stack[memmap_tensor0] 53.8410μs 3.4221μs 292.2158 KOps/s 293.3690 KOps/s $\color{#d91a1a}-0.39\%$
test_memmaptd_index 0.9493ms 0.2317ms 4.3166 KOps/s 4.1133 KOps/s $\color{#35bf28}+4.94\%$
test_memmaptd_index_astensor 0.5901ms 0.3169ms 3.1559 KOps/s 3.0102 KOps/s $\color{#35bf28}+4.84\%$
test_memmaptd_index_op 0.9711ms 0.6113ms 1.6359 KOps/s 1.7127 KOps/s $\color{#d91a1a}-4.48\%$
test_serialize_model 0.1256s 0.1162s 8.6022 Ops/s 8.3469 Ops/s $\color{#35bf28}+3.06\%$
test_serialize_model_pickle 0.4937s 0.4047s 2.4707 Ops/s 2.5328 Ops/s $\color{#d91a1a}-2.45\%$
test_serialize_weights 0.1291s 0.1139s 8.7765 Ops/s 8.3739 Ops/s $\color{#35bf28}+4.81\%$
test_serialize_weights_returnearly 0.2663s 0.1768s 5.6558 Ops/s 6.3680 Ops/s $\textbf{\color{#d91a1a}-11.18\%}$
test_serialize_weights_pickle 0.5564s 0.4561s 2.1927 Ops/s 2.4815 Ops/s $\textbf{\color{#d91a1a}-11.64\%}$
test_serialize_weights_filesystem 0.1508s 0.1444s 6.9245 Ops/s 6.8490 Ops/s $\color{#35bf28}+1.10\%$
test_serialize_model_filesystem 0.1532s 0.1451s 6.8921 Ops/s 6.0346 Ops/s $\textbf{\color{#35bf28}+14.21\%}$
test_reshape_pytree 82.1730μs 26.4879μs 37.7531 KOps/s 38.0296 KOps/s $\color{#d91a1a}-0.73\%$
test_reshape_td 98.5540μs 32.6995μs 30.5815 KOps/s 29.3637 KOps/s $\color{#35bf28}+4.15\%$
test_view_pytree 85.5680μs 26.6994μs 37.4541 KOps/s 37.7465 KOps/s $\color{#d91a1a}-0.77\%$
test_view_td 0.1108ms 38.2904μs 26.1162 KOps/s 25.1195 KOps/s $\color{#35bf28}+3.97\%$
test_unbind_pytree 99.2760μs 29.4235μs 33.9865 KOps/s 33.8925 KOps/s $\color{#35bf28}+0.28\%$
test_unbind_td 0.2903ms 39.9702μs 25.0186 KOps/s 25.1026 KOps/s $\color{#d91a1a}-0.33\%$
test_split_pytree 62.9170μs 29.0833μs 34.3840 KOps/s 34.1638 KOps/s $\color{#35bf28}+0.64\%$
test_split_td 0.4574ms 46.5805μs 21.4682 KOps/s 22.1353 KOps/s $\color{#d91a1a}-3.01\%$
test_add_pytree 71.1730μs 35.6355μs 28.0619 KOps/s 27.6990 KOps/s $\color{#35bf28}+1.31\%$
test_add_td 0.1336ms 59.3027μs 16.8626 KOps/s 18.8713 KOps/s $\textbf{\color{#d91a1a}-10.64\%}$
test_compile_add_one_nested[tensordict-compile] 0.2356ms 64.0096μs 15.6227 KOps/s 15.9695 KOps/s $\color{#d91a1a}-2.17\%$
test_compile_add_one_nested[tensordict-eager] 1.2393ms 0.1746ms 5.7268 KOps/s 5.7161 KOps/s $\color{#35bf28}+0.19\%$
test_compile_add_one_nested[pytree-compile] 0.1740ms 45.9232μs 21.7755 KOps/s 21.9187 KOps/s $\color{#d91a1a}-0.65\%$
test_compile_add_one_nested[pytree-eager] 0.2848ms 0.1189ms 8.4083 KOps/s 8.3939 KOps/s $\color{#35bf28}+0.17\%$
test_compile_copy_nested[tensordict-compile] 80.7210μs 26.6745μs 37.4890 KOps/s 38.3240 KOps/s $\color{#d91a1a}-2.18\%$
test_compile_copy_nested[tensordict-eager] 0.1730ms 59.0825μs 16.9255 KOps/s 17.1276 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_copy_nested[pytree-compile] 0.3633ms 77.6924μs 12.8713 KOps/s 12.7929 KOps/s $\color{#35bf28}+0.61\%$
test_compile_copy_nested[pytree-eager] 0.1266ms 66.6663μs 15.0001 KOps/s 14.9432 KOps/s $\color{#35bf28}+0.38\%$
test_compile_add_one_flat[tensordict-compile] 0.2584ms 0.1048ms 9.5408 KOps/s 9.4630 KOps/s $\color{#35bf28}+0.82\%$
test_compile_add_one_flat[tensordict-eager] 0.3932ms 0.2115ms 4.7280 KOps/s 4.6274 KOps/s $\color{#35bf28}+2.17\%$
test_compile_add_one_flat[tensorclass-compile] 0.2390ms 45.4254μs 22.0141 KOps/s 21.4143 KOps/s $\color{#35bf28}+2.80\%$
test_compile_add_one_flat[tensorclass-eager] 0.5023ms 66.4343μs 15.0525 KOps/s 14.6494 KOps/s $\color{#35bf28}+2.75\%$
test_compile_add_one_flat[pytree-compile] 0.2503ms 0.1025ms 9.7571 KOps/s 9.6369 KOps/s $\color{#35bf28}+1.25\%$
test_compile_add_one_flat[pytree-eager] 0.4621ms 0.1977ms 5.0582 KOps/s 4.9444 KOps/s $\color{#35bf28}+2.30\%$
test_compile_add_self_flat[tensordict-eager] 0.5003ms 0.2278ms 4.3890 KOps/s 4.2975 KOps/s $\color{#35bf28}+2.13\%$
test_compile_add_self_flat[tensordict-compile] 0.2106ms 0.1032ms 9.6932 KOps/s 9.0870 KOps/s $\textbf{\color{#35bf28}+6.67\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1423ms 62.1489μs 16.0904 KOps/s 15.7295 KOps/s $\color{#35bf28}+2.29\%$
test_compile_add_self_flat[tensorclass-compile] 0.1409ms 46.7859μs 21.3740 KOps/s 21.2906 KOps/s $\color{#35bf28}+0.39\%$
test_compile_add_self_flat[pytree-eager] 0.1943ms 0.1567ms 6.3834 KOps/s 6.3975 KOps/s $\color{#d91a1a}-0.22\%$
test_compile_add_self_flat[pytree-compile] 0.2185ms 0.1016ms 9.8383 KOps/s 9.6278 KOps/s $\color{#35bf28}+2.19\%$
test_compile_copy_flat[tensordict-compile] 77.4750μs 21.3447μs 46.8501 KOps/s 46.6089 KOps/s $\color{#35bf28}+0.52\%$
test_compile_copy_flat[tensordict-eager] 0.1268ms 65.7317μs 15.2134 KOps/s 14.7783 KOps/s $\color{#35bf28}+2.94\%$
test_compile_copy_flat[pytree-compile] 0.1516ms 77.8115μs 12.8516 KOps/s 12.4167 KOps/s $\color{#35bf28}+3.50\%$
test_compile_copy_flat[pytree-eager] 0.1358ms 66.8626μs 14.9560 KOps/s 14.8524 KOps/s $\color{#35bf28}+0.70\%$
test_compile_assign_and_add[tensordict-compile] 0.5603ms 0.2077ms 4.8140 KOps/s 4.8130 KOps/s $\color{#35bf28}+0.02\%$
test_compile_assign_and_add[tensordict-eager] 2.3384ms 1.3218ms 756.5484 Ops/s 764.1802 Ops/s $\color{#d91a1a}-1.00\%$
test_compile_assign_and_add[pytree-compile] 0.7701ms 0.2062ms 4.8499 KOps/s 4.8343 KOps/s $\color{#35bf28}+0.32\%$
test_compile_assign_and_add[pytree-eager] 0.9175ms 0.7703ms 1.2981 KOps/s 1.2174 KOps/s $\textbf{\color{#35bf28}+6.64\%}$
test_compile_assign_and_add_stack[compile] 0.8202ms 0.4550ms 2.1976 KOps/s 2.1689 KOps/s $\color{#35bf28}+1.32\%$
test_compile_assign_and_add_stack[eager] 3.0584ms 2.7401ms 364.9447 Ops/s 393.7284 Ops/s $\textbf{\color{#d91a1a}-7.31\%}$
test_compile_indexing[tensor-tensordict-compile] 0.2345ms 37.7343μs 26.5011 KOps/s 27.6066 KOps/s $\color{#d91a1a}-4.00\%$
test_compile_indexing[tensor-tensordict-eager] 0.6436ms 33.6577μs 29.7109 KOps/s 30.1967 KOps/s $\color{#d91a1a}-1.61\%$
test_compile_indexing[tensor-tensorclass-compile] 98.3730μs 29.9385μs 33.4018 KOps/s 34.2140 KOps/s $\color{#d91a1a}-2.37\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2096ms 23.2320μs 43.0440 KOps/s 43.3744 KOps/s $\color{#d91a1a}-0.76\%$
test_compile_indexing[tensor-pytree-compile] 96.0100μs 30.8566μs 32.4080 KOps/s 33.6282 KOps/s $\color{#d91a1a}-3.63\%$
test_compile_indexing[tensor-pytree-eager] 0.1091ms 23.0572μs 43.3703 KOps/s 43.8603 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_indexing[slice-tensordict-compile] 0.1392ms 52.3701μs 19.0949 KOps/s 19.0326 KOps/s $\color{#35bf28}+0.33\%$
test_compile_indexing[slice-tensordict-eager] 0.6733ms 20.4640μs 48.8664 KOps/s 48.5273 KOps/s $\color{#35bf28}+0.70\%$
test_compile_indexing[slice-tensorclass-compile] 0.1653ms 44.7785μs 22.3322 KOps/s 22.7484 KOps/s $\color{#d91a1a}-1.83\%$
test_compile_indexing[slice-tensorclass-eager] 0.1077ms 18.8925μs 52.9311 KOps/s 54.2203 KOps/s $\color{#d91a1a}-2.38\%$
test_compile_indexing[slice-pytree-compile] 0.1129ms 45.0179μs 22.2134 KOps/s 22.3369 KOps/s $\color{#d91a1a}-0.55\%$
test_compile_indexing[slice-pytree-eager] 68.1770μs 18.7976μs 53.1983 KOps/s 54.7352 KOps/s $\color{#d91a1a}-2.81\%$
test_compile_indexing[int-tensordict-compile] 0.2699ms 53.0842μs 18.8380 KOps/s 18.2959 KOps/s $\color{#35bf28}+2.96\%$
test_compile_indexing[int-tensordict-eager] 1.0086ms 20.2937μs 49.2764 KOps/s 50.3167 KOps/s $\color{#d91a1a}-2.07\%$
test_compile_indexing[int-tensorclass-compile] 0.1168ms 45.2870μs 22.0814 KOps/s 22.1497 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_indexing[int-tensorclass-eager] 83.8600μs 18.7685μs 53.2809 KOps/s 54.1608 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_indexing[int-pytree-compile] 0.1391ms 45.5874μs 21.9359 KOps/s 22.2767 KOps/s $\color{#d91a1a}-1.53\%$
test_compile_indexing[int-pytree-eager] 88.1240μs 19.0935μs 52.3739 KOps/s 54.2129 KOps/s $\color{#d91a1a}-3.39\%$
test_mod_add[eager] 0.1007ms 35.0369μs 28.5413 KOps/s 30.2198 KOps/s $\textbf{\color{#d91a1a}-5.55\%}$
test_mod_add[compile] 0.1736ms 49.4849μs 20.2082 KOps/s 20.5096 KOps/s $\color{#d91a1a}-1.47\%$
test_mod_add[compile-overhead] 0.1560ms 48.9675μs 20.4217 KOps/s 20.3653 KOps/s $\color{#35bf28}+0.28\%$
test_mod_wrap[eager] 0.5347ms 0.2195ms 4.5561 KOps/s 4.4726 KOps/s $\color{#35bf28}+1.87\%$
test_mod_wrap[compile] 0.5324ms 0.2082ms 4.8029 KOps/s 4.7935 KOps/s $\color{#35bf28}+0.20\%$
test_mod_wrap[compile-overhead] 0.4206ms 0.2049ms 4.8802 KOps/s 4.8319 KOps/s $\color{#35bf28}+1.00\%$
test_mod_wrap_and_backward[eager] 12.3741ms 10.9780ms 91.0914 Ops/s 87.3708 Ops/s $\color{#35bf28}+4.26\%$
test_mod_wrap_and_backward[compile] 12.3040ms 10.9432ms 91.3808 Ops/s 72.8930 Ops/s $\textbf{\color{#35bf28}+25.36\%}$
test_mod_wrap_and_backward[compile-overhead] 12.7825ms 11.0645ms 90.3789 Ops/s 77.2105 Ops/s $\textbf{\color{#35bf28}+17.06\%}$
test_seq_add[eager] 0.4073ms 0.1200ms 8.3305 KOps/s 8.6779 KOps/s $\color{#d91a1a}-4.00\%$
test_seq_add[compile] 0.1437ms 65.6847μs 15.2242 KOps/s 16.0620 KOps/s $\textbf{\color{#d91a1a}-5.22\%}$
test_seq_add[compile-overhead] 0.1954ms 63.8782μs 15.6548 KOps/s 16.0536 KOps/s $\color{#d91a1a}-2.48\%$
test_seq_wrap[eager] 0.6229ms 0.4460ms 2.2421 KOps/s 2.2555 KOps/s $\color{#d91a1a}-0.60\%$
test_seq_wrap[compile] 0.4305ms 0.2338ms 4.2779 KOps/s 4.3815 KOps/s $\color{#d91a1a}-2.36\%$
test_seq_wrap[compile-overhead] 0.4514ms 0.2298ms 4.3520 KOps/s 4.4140 KOps/s $\color{#d91a1a}-1.41\%$
test_func_call_runtime[False-eager] 0.7527ms 0.5366ms 1.8636 KOps/s 1.8333 KOps/s $\color{#35bf28}+1.65\%$
test_func_call_runtime[False-compile] 0.5471ms 0.4287ms 2.3327 KOps/s 2.3219 KOps/s $\color{#35bf28}+0.47\%$
test_func_call_runtime[False-compile-overhead] 0.5294ms 0.4272ms 2.3411 KOps/s 2.3354 KOps/s $\color{#35bf28}+0.24\%$
test_func_call_runtime[True-eager] 1.0641ms 0.7428ms 1.3462 KOps/s 1.3357 KOps/s $\color{#35bf28}+0.79\%$
test_func_call_runtime[True-compile] 0.9286ms 0.4720ms 2.1184 KOps/s 2.1266 KOps/s $\color{#d91a1a}-0.38\%$
test_func_call_runtime[True-compile-overhead] 0.6170ms 0.4729ms 2.1145 KOps/s 2.1454 KOps/s $\color{#d91a1a}-1.44\%$
test_func_call_cm_runtime[False-eager] 0.9017ms 0.5363ms 1.8647 KOps/s 1.8590 KOps/s $\color{#35bf28}+0.30\%$
test_func_call_cm_runtime[False-compile] 0.7954ms 0.4316ms 2.3168 KOps/s 2.3620 KOps/s $\color{#d91a1a}-1.91\%$
test_func_call_cm_runtime[False-compile-overhead] 0.9542ms 0.4297ms 2.3273 KOps/s 2.3343 KOps/s $\color{#d91a1a}-0.30\%$
test_func_call_cm_runtime[True-eager] 1.2443ms 0.8888ms 1.1251 KOps/s 1.1110 KOps/s $\color{#35bf28}+1.26\%$
test_func_call_cm_runtime[True-compile] 0.8576ms 0.4955ms 2.0183 KOps/s 2.0236 KOps/s $\color{#d91a1a}-0.26\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8277ms 0.4981ms 2.0078 KOps/s 2.0226 KOps/s $\color{#d91a1a}-0.73\%$
test_vmap_func_call_cm_runtime[eager] 2.6871ms 1.9067ms 524.4755 Ops/s 521.4270 Ops/s $\color{#35bf28}+0.58\%$
test_vmap_func_call_cm_runtime[compile] 0.7842ms 0.5170ms 1.9344 KOps/s 1.9137 KOps/s $\color{#35bf28}+1.08\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.9201ms 0.5216ms 1.9171 KOps/s 1.9022 KOps/s $\color{#35bf28}+0.79\%$
test_distributed 0.2653ms 0.1259ms 7.9407 KOps/s 7.5138 KOps/s $\textbf{\color{#35bf28}+5.68\%}$
test_tdmodule 85.9200μs 26.4485μs 37.8093 KOps/s 39.5222 KOps/s $\color{#d91a1a}-4.33\%$
test_tdmodule_dispatch 92.7530μs 50.2864μs 19.8861 KOps/s 22.1510 KOps/s $\textbf{\color{#d91a1a}-10.22\%}$
test_tdseq 59.5510μs 30.5913μs 32.6891 KOps/s 37.0589 KOps/s $\textbf{\color{#d91a1a}-11.79\%}$
test_tdseq_dispatch 88.8460μs 56.8898μs 17.5778 KOps/s 19.7436 KOps/s $\textbf{\color{#d91a1a}-10.97\%}$
test_instantiation_functorch 2.1888ms 1.5079ms 663.1709 Ops/s 647.3114 Ops/s $\color{#35bf28}+2.45\%$
test_exec_functorch 0.3175ms 0.1816ms 5.5053 KOps/s 5.5630 KOps/s $\color{#d91a1a}-1.04\%$
test_exec_functional_call 0.3180ms 0.1723ms 5.8034 KOps/s 5.7565 KOps/s $\color{#35bf28}+0.81\%$
test_exec_td_decorator 0.4848ms 0.2322ms 4.3062 KOps/s 4.2154 KOps/s $\color{#35bf28}+2.15\%$
test_vmap_mlp_speed_decorator[True-True] 0.8368ms 0.6453ms 1.5496 KOps/s 1.5512 KOps/s $\color{#d91a1a}-0.10\%$
test_vmap_mlp_speed_decorator[True-False] 0.9173ms 0.6416ms 1.5585 KOps/s 1.5488 KOps/s $\color{#35bf28}+0.63\%$
test_vmap_mlp_speed_decorator[False-True] 0.7814ms 0.5217ms 1.9167 KOps/s 1.8756 KOps/s $\color{#35bf28}+2.19\%$
test_vmap_mlp_speed_decorator[False-False] 0.9348ms 0.5253ms 1.9038 KOps/s 1.8893 KOps/s $\color{#35bf28}+0.77\%$
test_to_module_speed[True] 2.2186ms 1.3209ms 757.0715 Ops/s 743.4904 Ops/s $\color{#35bf28}+1.83\%$
test_to_module_speed[False] 1.9408ms 1.2860ms 777.5877 Ops/s 769.1419 Ops/s $\color{#35bf28}+1.10\%$
test_tc_init 89.0460μs 47.4220μs 21.0872 KOps/s 21.8572 KOps/s $\color{#d91a1a}-3.52\%$
test_tc_init_nested 0.2160ms 95.6714μs 10.4524 KOps/s 10.8073 KOps/s $\color{#d91a1a}-3.28\%$
test_tc_first_layer_tensor 17.1020μs 1.5335μs 652.1198 KOps/s 643.0244 KOps/s $\color{#35bf28}+1.41\%$
test_tc_first_layer_nontensor 23.7950μs 4.6349μs 215.7541 KOps/s 212.6060 KOps/s $\color{#35bf28}+1.48\%$
test_tc_second_layer_tensor 41.2370μs 2.8009μs 357.0219 KOps/s 348.7563 KOps/s $\color{#35bf28}+2.37\%$
test_tc_second_layer_nontensor 54.4620μs 5.9851μs 167.0813 KOps/s 164.5807 KOps/s $\color{#35bf28}+1.52\%$
test_unbind 0.2401s 14.1453ms 70.6950 Ops/s 77.6404 Ops/s $\textbf{\color{#d91a1a}-8.95\%}$
test_full_like 12.8849ms 9.1157ms 109.7006 Ops/s 80.5217 Ops/s $\textbf{\color{#35bf28}+36.24\%}$
test_zeros_like 3.6105ms 3.0884ms 323.7962 Ops/s 127.2016 Ops/s $\textbf{\color{#35bf28}+154.55\%}$
test_ones_like 4.0997ms 3.4906ms 286.4838 Ops/s 124.5687 Ops/s $\textbf{\color{#35bf28}+129.98\%}$
test_clone 6.1469ms 5.4104ms 184.8289 Ops/s 105.2500 Ops/s $\textbf{\color{#35bf28}+75.61\%}$
test_squeeze 65.3420μs 12.1372μs 82.3914 KOps/s 82.3183 KOps/s $\color{#35bf28}+0.09\%$
test_unsqueeze 0.1714ms 90.3071μs 11.0733 KOps/s 11.0308 KOps/s $\color{#35bf28}+0.39\%$
test_split 0.4723ms 0.1947ms 5.1362 KOps/s 5.1149 KOps/s $\color{#35bf28}+0.42\%$
test_permute 0.5393ms 0.1944ms 5.1430 KOps/s 4.8504 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_stack 31.4204ms 25.9278ms 38.5686 Ops/s 39.6751 Ops/s $\color{#d91a1a}-2.79\%$
test_cat 31.6721ms 25.6890ms 38.9271 Ops/s 39.5518 Ops/s $\color{#d91a1a}-1.58\%$
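
A note on reading the table: the Change column appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD. The bot's own code is not shown in this thread, so the helper below is a hypothetical sketch; it reproduces the reported values for two rows taken from the table above.

```python
def relative_change_percent(ops: float, ops_on_head: float) -> float:
    """Relative throughput change of this PR versus repo HEAD, in percent."""
    return (ops / ops_on_head - 1.0) * 100.0


# test_plain_set_nested: 46.9963 KOps/s vs 51.9469 KOps/s on HEAD
print(round(relative_change_percent(46.9963, 51.9469), 2))    # -9.53

# test_zeros_like: 323.7962 Ops/s vs 127.2016 Ops/s on HEAD
print(round(relative_change_percent(323.7962, 127.2016), 2))  # 154.55
```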


github-actions bot commented Jan 14, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}17$. Worsened: $\large\color{#d91a1a}10$.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 28.8010μs 12.6360μs 79.1390 KOps/s 79.4643 KOps/s $\color{#d91a1a}-0.41\%$
test_plain_set_stack_nested 40.5000μs 12.8580μs 77.7728 KOps/s 78.6394 KOps/s $\color{#d91a1a}-1.10\%$
test_plain_set_nested_inplace 0.4025ms 13.7000μs 72.9926 KOps/s 73.2416 KOps/s $\color{#d91a1a}-0.34\%$
test_plain_set_stack_nested_inplace 0.2016ms 13.7940μs 72.4955 KOps/s 73.1424 KOps/s $\color{#d91a1a}-0.88\%$
test_items 45.5410μs 2.9214μs 342.3045 KOps/s 339.9020 KOps/s $\color{#35bf28}+0.71\%$
test_items_nested 0.7671ms 0.3610ms 2.7700 KOps/s 2.7465 KOps/s $\color{#35bf28}+0.86\%$
test_items_nested_locked 0.7598ms 0.3644ms 2.7439 KOps/s 2.7350 KOps/s $\color{#35bf28}+0.33\%$
test_items_nested_leaf 0.4767ms 58.3399μs 17.1409 KOps/s 17.2987 KOps/s $\color{#d91a1a}-0.91\%$
test_items_stack_nested 0.7748ms 0.3655ms 2.7358 KOps/s 2.7238 KOps/s $\color{#35bf28}+0.44\%$
test_items_stack_nested_leaf 0.1031ms 60.1812μs 16.6165 KOps/s 16.7964 KOps/s $\color{#d91a1a}-1.07\%$
test_items_stack_nested_locked 0.7714ms 0.3657ms 2.7341 KOps/s 2.7283 KOps/s $\color{#35bf28}+0.21\%$
test_keys 0.4084ms 3.4814μs 287.2423 KOps/s 289.3488 KOps/s $\color{#d91a1a}-0.73\%$
test_keys_nested 0.4827ms 87.7129μs 11.4008 KOps/s 11.3857 KOps/s $\color{#35bf28}+0.13\%$
test_keys_nested_locked 0.7921ms 93.7056μs 10.6717 KOps/s 10.6989 KOps/s $\color{#d91a1a}-0.25\%$
test_keys_nested_leaf 0.4738ms 79.2086μs 12.6249 KOps/s 12.9069 KOps/s $\color{#d91a1a}-2.19\%$
test_keys_stack_nested 0.1221ms 89.1559μs 11.2163 KOps/s 11.2851 KOps/s $\color{#d91a1a}-0.61\%$
test_keys_stack_nested_leaf 0.4974ms 80.6762μs 12.3952 KOps/s 12.6455 KOps/s $\color{#d91a1a}-1.98\%$
test_keys_stack_nested_locked 0.5134ms 95.4641μs 10.4751 KOps/s 10.6555 KOps/s $\color{#d91a1a}-1.69\%$
test_values 67.3895μs 0.8615μs 1.1607 MOps/s 1.1658 MOps/s $\color{#d91a1a}-0.43\%$
test_values_nested 0.4364ms 37.6510μs 26.5597 KOps/s 26.9254 KOps/s $\color{#d91a1a}-1.36\%$
test_values_nested_locked 0.4364ms 39.1383μs 25.5504 KOps/s 25.9511 KOps/s $\color{#d91a1a}-1.54\%$
test_values_nested_leaf 0.4380ms 41.9690μs 23.8271 KOps/s 24.0856 KOps/s $\color{#d91a1a}-1.07\%$
test_values_stack_nested 0.4498ms 38.2031μs 26.1759 KOps/s 26.2754 KOps/s $\color{#d91a1a}-0.38\%$
test_values_stack_nested_leaf 81.9120μs 42.7058μs 23.4160 KOps/s 23.7881 KOps/s $\color{#d91a1a}-1.56\%$
test_values_stack_nested_locked 0.4378ms 39.9701μs 25.0187 KOps/s 25.3387 KOps/s $\color{#d91a1a}-1.26\%$
test_membership 20.7304μs 0.5117μs 1.9542 MOps/s 1.9668 MOps/s $\color{#d91a1a}-0.64\%$
test_membership_nested 0.2119ms 1.9704μs 507.5000 KOps/s 492.8007 KOps/s $\color{#35bf28}+2.98\%$
test_membership_nested_leaf 0.2154ms 1.9876μs 503.1275 KOps/s 509.0535 KOps/s $\color{#d91a1a}-1.16\%$
test_membership_stacked_nested 20.9100μs 2.0272μs 493.2829 KOps/s 486.6987 KOps/s $\color{#35bf28}+1.35\%$
test_membership_stacked_nested_leaf 0.4079ms 2.0196μs 495.1362 KOps/s 491.5500 KOps/s $\color{#35bf28}+0.73\%$
test_membership_nested_last 41.5600μs 3.0468μs 328.2143 KOps/s 323.9584 KOps/s $\color{#35bf28}+1.31\%$
test_membership_nested_leaf_last 0.3996ms 3.0843μs 324.2175 KOps/s 327.2971 KOps/s $\color{#d91a1a}-0.94\%$
test_membership_stacked_nested_last 37.4110μs 3.7814μs 264.4519 KOps/s 260.5690 KOps/s $\color{#35bf28}+1.49\%$
test_membership_stacked_nested_leaf_last 44.8510μs 3.8310μs 261.0315 KOps/s 265.5028 KOps/s $\color{#d91a1a}-1.68\%$
test_nested_getleaf 0.4277ms 6.1580μs 162.3916 KOps/s 162.8336 KOps/s $\color{#d91a1a}-0.27\%$
test_nested_get 0.4083ms 5.8630μs 170.5600 KOps/s 171.6812 KOps/s $\color{#d91a1a}-0.65\%$
test_stacked_getleaf 49.1610μs 6.1631μs 162.2555 KOps/s 163.8031 KOps/s $\color{#d91a1a}-0.94\%$
test_stacked_get 33.0210μs 5.8258μs 171.6494 KOps/s 173.6668 KOps/s $\color{#d91a1a}-1.16\%$
test_nested_getitemleaf 53.5010μs 6.3853μs 156.6103 KOps/s 156.9453 KOps/s $\color{#d91a1a}-0.21\%$
test_nested_getitem 0.4268ms 6.0931μs 164.1214 KOps/s 164.9916 KOps/s $\color{#d91a1a}-0.53\%$
test_stacked_getitemleaf 33.6510μs 6.4240μs 155.6661 KOps/s 156.7016 KOps/s $\color{#d91a1a}-0.66\%$
test_stacked_getitem 0.4305ms 6.1217μs 163.3527 KOps/s 163.1128 KOps/s $\color{#35bf28}+0.15\%$
test_lock_nested 0.7787ms 0.3695ms 2.7061 KOps/s 2.6529 KOps/s $\color{#35bf28}+2.01\%$
test_lock_stack_nested 0.4156ms 0.3387ms 2.9528 KOps/s 2.9711 KOps/s $\color{#d91a1a}-0.62\%$
test_unlock_nested 0.7001ms 0.3120ms 3.2054 KOps/s 3.2222 KOps/s $\color{#d91a1a}-0.52\%$
test_unlock_stack_nested 0.3961ms 0.2791ms 3.5826 KOps/s 3.6163 KOps/s $\color{#d91a1a}-0.93\%$
test_flatten_speed 0.4957ms 74.8266μs 13.3642 KOps/s 13.4923 KOps/s $\color{#d91a1a}-0.95\%$
test_unflatten_speed 0.7295ms 0.3171ms 3.1535 KOps/s 3.1560 KOps/s $\color{#d91a1a}-0.08\%$
test_common_ops 1.6485ms 0.6364ms 1.5713 KOps/s 1.6045 KOps/s $\color{#d91a1a}-2.07\%$
test_creation 0.1054ms 1.7302μs 577.9818 KOps/s 578.7302 KOps/s $\color{#d91a1a}-0.13\%$
test_creation_empty 43.9700μs 9.0000μs 111.1116 KOps/s 111.7880 KOps/s $\color{#d91a1a}-0.61\%$
test_creation_nested_1 35.9500μs 10.6815μs 93.6201 KOps/s 93.7699 KOps/s $\color{#d91a1a}-0.16\%$
test_creation_nested_2 55.6410μs 13.4337μs 74.4396 KOps/s 74.7594 KOps/s $\color{#d91a1a}-0.43\%$
test_clone 95.7110μs 10.8261μs 92.3696 KOps/s 93.6198 KOps/s $\color{#d91a1a}-1.34\%$
test_getitem[int] 1.4978ms 10.5823μs 94.4972 KOps/s 94.3335 KOps/s $\color{#35bf28}+0.17\%$
test_getitem[slice_int] 0.1058ms 20.8162μs 48.0394 KOps/s 48.8339 KOps/s $\color{#d91a1a}-1.63\%$
test_getitem[range] 0.1526ms 37.6808μs 26.5387 KOps/s 27.1785 KOps/s $\color{#d91a1a}-2.35\%$
test_getitem[tuple] 0.1120ms 18.0116μs 55.5197 KOps/s 55.7979 KOps/s $\color{#d91a1a}-0.50\%$
test_getitem[list] 0.2467ms 34.0303μs 29.3856 KOps/s 30.6461 KOps/s $\color{#d91a1a}-4.11\%$
test_setitem_dim[int] 39.0510μs 19.8906μs 50.2749 KOps/s 52.9328 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_setitem_dim[slice_int] 86.1810μs 38.9423μs 25.6790 KOps/s 26.3875 KOps/s $\color{#d91a1a}-2.69\%$
test_setitem_dim[range] 79.8720μs 53.2165μs 18.7912 KOps/s 19.0993 KOps/s $\color{#d91a1a}-1.61\%$
test_setitem_dim[tuple] 54.7810μs 32.6934μs 30.5872 KOps/s 31.3968 KOps/s $\color{#d91a1a}-2.58\%$
test_setitem 0.2072ms 15.6575μs 63.8673 KOps/s 63.9478 KOps/s $\color{#d91a1a}-0.13\%$
test_set 0.1008ms 15.3735μs 65.0470 KOps/s 65.7743 KOps/s $\color{#d91a1a}-1.11\%$
test_set_shared 1.5221ms 0.1515ms 6.6026 KOps/s 6.6049 KOps/s $\color{#d91a1a}-0.04\%$
test_update 1.0424ms 18.4463μs 54.2114 KOps/s 50.9721 KOps/s $\textbf{\color{#35bf28}+6.35\%}$
test_update_nested 96.9620μs 23.8896μs 41.8592 KOps/s 41.2168 KOps/s $\color{#35bf28}+1.56\%$
test_update__nested 0.1296ms 25.7101μs 38.8951 KOps/s 39.5396 KOps/s $\color{#d91a1a}-1.63\%$
test_set_nested 97.6210μs 16.0920μs 62.1427 KOps/s 60.5397 KOps/s $\color{#35bf28}+2.65\%$
test_set_nested_new 93.9920μs 19.1217μs 52.2966 KOps/s 52.5143 KOps/s $\color{#d91a1a}-0.41\%$
test_select 0.1183ms 31.0543μs 32.2016 KOps/s 32.3276 KOps/s $\color{#d91a1a}-0.39\%$
test_select_nested 88.4320μs 43.6341μs 22.9178 KOps/s 22.9291 KOps/s $\color{#d91a1a}-0.05\%$
test_exclude_nested 0.1065ms 63.0735μs 15.8545 KOps/s 16.3064 KOps/s $\color{#d91a1a}-2.77\%$
test_empty[True] 0.4329ms 0.3006ms 3.3265 KOps/s 3.3690 KOps/s $\color{#d91a1a}-1.26\%$
test_empty[False] 39.9987μs 0.8108μs 1.2334 MOps/s 1.2122 MOps/s $\color{#35bf28}+1.75\%$
test_to 86.4720μs 54.6660μs 18.2929 KOps/s 16.9875 KOps/s $\textbf{\color{#35bf28}+7.68\%}$
test_to_nonblocking 0.1991ms 47.3760μs 21.1077 KOps/s 20.0496 KOps/s $\textbf{\color{#35bf28}+5.28\%}$
test_unbind_speed 1.7436ms 0.2325ms 4.3004 KOps/s 4.3092 KOps/s $\color{#d91a1a}-0.20\%$
test_unbind_speed_stack0 0.3706ms 0.2338ms 4.2779 KOps/s 4.3236 KOps/s $\color{#d91a1a}-1.06\%$
test_unbind_speed_stack1 93.1921ms 0.6576ms 1.5207 KOps/s 1.5175 KOps/s $\color{#35bf28}+0.21\%$
test_split 95.6654ms 1.5887ms 629.4518 Ops/s 583.1948 Ops/s $\textbf{\color{#35bf28}+7.93\%}$
test_chunk 95.2452ms 1.5826ms 631.8824 Ops/s 695.6165 Ops/s $\textbf{\color{#d91a1a}-9.16\%}$
test_consolidate[False-None] 97.2070ms 2.9156ms 342.9864 Ops/s 340.8040 Ops/s $\color{#35bf28}+0.64\%$
test_consolidate[default-None] 1.8280ms 1.6674ms 599.7328 Ops/s 601.2249 Ops/s $\color{#d91a1a}-0.25\%$
test_consolidate[reduce-overhead-None] 1.8853ms 1.7064ms 586.0240 Ops/s 588.6689 Ops/s $\color{#d91a1a}-0.45\%$
test_consolidate_njt[False-None] 6.5827ms 6.4777ms 154.3758 Ops/s 154.0551 Ops/s $\color{#35bf28}+0.21\%$
test_to[False-False-None] 1.8404ms 1.6891ms 592.0373 Ops/s 588.5170 Ops/s $\color{#35bf28}+0.60\%$
test_to[True-False-None] 1.7702ms 1.3313ms 751.1506 Ops/s 772.8905 Ops/s $\color{#d91a1a}-2.81\%$
test_to[within-False-None] 4.2128ms 4.0807ms 245.0550 Ops/s 243.3478 Ops/s $\color{#35bf28}+0.70\%$
test_to[True-default-None] 5.4571ms 5.3122ms 188.2474 Ops/s 189.7531 Ops/s $\color{#d91a1a}-0.79\%$
test_to_njt[False-False-None] 6.9885ms 6.8366ms 146.2717 Ops/s 144.2820 Ops/s $\color{#35bf28}+1.38\%$
test_to_njt[True-False-None] 5.6663ms 5.4499ms 183.4908 Ops/s 184.7188 Ops/s $\color{#d91a1a}-0.66\%$
test_to_njt[within-False-None] 12.2051ms 12.0071ms 83.2844 Ops/s 82.8790 Ops/s $\color{#35bf28}+0.49\%$
test_creation[device0] 0.3745ms 79.7022μs 12.5467 KOps/s 11.9475 KOps/s $\textbf{\color{#35bf28}+5.02\%}$
test_creation_from_tensor 0.5421ms 83.4786μs 11.9791 KOps/s 11.5037 KOps/s $\color{#35bf28}+4.13\%$
test_add_one[memmap_tensor0] 0.4343ms 6.8663μs 145.6382 KOps/s 148.7211 KOps/s $\color{#d91a1a}-2.07\%$
test_contiguous[memmap_tensor0] 1.7840μs 0.4043μs 2.4733 MOps/s 2.4972 MOps/s $\color{#d91a1a}-0.96\%$
test_stack[memmap_tensor0] 37.0510μs 4.3499μs 229.8908 KOps/s 237.8936 KOps/s $\color{#d91a1a}-3.36\%$
test_memmaptd_index 1.6517ms 0.2542ms 3.9341 KOps/s 4.1236 KOps/s $\color{#d91a1a}-4.60\%$
test_memmaptd_index_astensor 0.9365ms 0.3157ms 3.1676 KOps/s 3.2712 KOps/s $\color{#d91a1a}-3.17\%$
test_memmaptd_index_op 1.0024ms 0.6031ms 1.6580 KOps/s 1.7271 KOps/s $\color{#d91a1a}-4.00\%$
test_serialize_model 0.1312s 0.1302s 7.6816 Ops/s 7.6507 Ops/s $\color{#35bf28}+0.40\%$
test_serialize_model_pickle 1.3504s 1.1924s 0.8386 Ops/s 0.8225 Ops/s $\color{#35bf28}+1.97\%$
test_serialize_weights 0.1322s 0.1303s 7.6763 Ops/s 7.6865 Ops/s $\color{#d91a1a}-0.13\%$
test_serialize_weights_returnearly 0.5040s 71.9870ms 13.8914 Ops/s 14.6701 Ops/s $\textbf{\color{#d91a1a}-5.31\%}$
test_serialize_weights_pickle 1.3771s 1.1946s 0.8371 Ops/s 0.8199 Ops/s $\color{#35bf28}+2.09\%$
test_reshape_pytree 0.1127ms 21.8656μs 45.7339 KOps/s 45.1356 KOps/s $\color{#35bf28}+1.33\%$
test_reshape_td 48.5900μs 25.8960μs 38.6160 KOps/s 36.4544 KOps/s $\textbf{\color{#35bf28}+5.93\%}$
test_view_pytree 0.1719ms 21.8560μs 45.7540 KOps/s 46.2352 KOps/s $\color{#d91a1a}-1.04\%$
test_view_td 0.1576ms 30.2989μs 33.0045 KOps/s 30.9353 KOps/s $\textbf{\color{#35bf28}+6.69\%}$
test_unbind_pytree 0.1339ms 27.6445μs 36.1736 KOps/s 36.4197 KOps/s $\color{#d91a1a}-0.68\%$
test_unbind_td 0.7664ms 36.1885μs 27.6331 KOps/s 27.6710 KOps/s $\color{#d91a1a}-0.14\%$
test_split_pytree 0.1495ms 29.7568μs 33.6058 KOps/s 34.0359 KOps/s $\color{#d91a1a}-1.26\%$
test_split_td 0.9804ms 38.1604μs 26.2052 KOps/s 25.7727 KOps/s $\color{#35bf28}+1.68\%$
test_add_pytree 0.1079ms 34.6243μs 28.8815 KOps/s 28.9885 KOps/s $\color{#d91a1a}-0.37\%$
test_add_td 88.2720μs 49.9785μs 20.0086 KOps/s 19.8795 KOps/s $\color{#35bf28}+0.65\%$
test_compile_add_one_nested[tensordict-compile] 0.1705ms 0.1193ms 8.3839 KOps/s 8.0859 KOps/s $\color{#35bf28}+3.69\%$
test_compile_add_one_nested[tensordict-eager] 0.2739ms 0.1312ms 7.6206 KOps/s 7.5067 KOps/s $\color{#35bf28}+1.52\%$
test_compile_add_one_nested[pytree-compile] 0.2377ms 94.9048μs 10.5369 KOps/s 10.4138 KOps/s $\color{#35bf28}+1.18\%$
test_compile_add_one_nested[pytree-eager] 0.3348ms 0.1474ms 6.7853 KOps/s 6.6847 KOps/s $\color{#35bf28}+1.50\%$
test_compile_copy_nested[tensordict-compile] 0.1859ms 24.8034μs 40.3170 KOps/s 41.0853 KOps/s $\color{#d91a1a}-1.87\%$
test_compile_copy_nested[tensordict-eager] 65.9910μs 29.1124μs 34.3496 KOps/s 34.0429 KOps/s $\color{#35bf28}+0.90\%$
test_compile_copy_nested[pytree-compile] 0.4405ms 63.6862μs 15.7020 KOps/s 15.3392 KOps/s $\color{#35bf28}+2.37\%$
test_compile_copy_nested[pytree-eager] 87.1720μs 48.7590μs 20.5090 KOps/s 20.1237 KOps/s $\color{#35bf28}+1.91\%$
test_compile_add_one_flat[tensordict-compile] 0.1806ms 0.1402ms 7.1333 KOps/s 6.9994 KOps/s $\color{#35bf28}+1.91\%$
test_compile_add_one_flat[tensordict-eager] 0.3496ms 0.2160ms 4.6289 KOps/s 4.6042 KOps/s $\color{#35bf28}+0.54\%$
test_compile_add_one_flat[tensorclass-compile] 0.2413ms 96.9231μs 10.3175 KOps/s 10.3134 KOps/s $\color{#35bf28}+0.04\%$
test_compile_add_one_flat[tensorclass-eager] 0.1973ms 54.5533μs 18.3307 KOps/s 18.3919 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_add_one_flat[pytree-compile] 0.1737ms 0.1344ms 7.4402 KOps/s 7.4327 KOps/s $\color{#35bf28}+0.10\%$
test_compile_add_one_flat[pytree-eager] 0.6278ms 0.4731ms 2.1137 KOps/s 2.0922 KOps/s $\color{#35bf28}+1.03\%$
test_compile_add_self_flat[tensordict-eager] 0.4086ms 0.2603ms 3.8421 KOps/s 3.8423 KOps/s $-0.00\%$
test_compile_add_self_flat[tensordict-compile] 0.2586ms 0.1413ms 7.0788 KOps/s 7.0932 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_add_self_flat[tensorclass-eager] 0.2249ms 66.2841μs 15.0866 KOps/s 14.9813 KOps/s $\color{#35bf28}+0.70\%$
test_compile_add_self_flat[tensorclass-compile] 0.1520ms 99.7629μs 10.0238 KOps/s 10.1902 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_add_self_flat[pytree-eager] 0.5449ms 0.4038ms 2.4762 KOps/s 2.4854 KOps/s $\color{#d91a1a}-0.37\%$
test_compile_add_self_flat[pytree-compile] 0.1733ms 0.1345ms 7.4327 KOps/s 7.5231 KOps/s $\color{#d91a1a}-1.20\%$
test_compile_copy_flat[tensordict-compile] 87.5010μs 19.4273μs 51.4740 KOps/s 54.8891 KOps/s $\textbf{\color{#d91a1a}-6.22\%}$
test_compile_copy_flat[tensordict-eager] 60.4210μs 31.2913μs 31.9578 KOps/s 31.4302 KOps/s $\color{#35bf28}+1.68\%$
test_compile_copy_flat[pytree-compile] 0.1071ms 70.0123μs 14.2832 KOps/s 14.3504 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_copy_flat[pytree-eager] 0.1551ms 50.5605μs 19.7783 KOps/s 19.4746 KOps/s $\color{#35bf28}+1.56\%$
test_compile_assign_and_add[tensordict-compile] 1.5853ms 0.3896ms 2.5670 KOps/s 2.2511 KOps/s $\textbf{\color{#35bf28}+14.03\%}$
test_compile_assign_and_add[tensordict-eager] 2.7948ms 2.5729ms 388.6630 Ops/s 382.9095 Ops/s $\color{#35bf28}+1.50\%$
test_compile_assign_and_add[pytree-compile] 1.5779ms 0.4376ms 2.2851 KOps/s 2.2970 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_assign_and_add[pytree-eager] 2.8266ms 2.5815ms 387.3773 Ops/s 386.6520 Ops/s $\color{#35bf28}+0.19\%$
test_compile_indexing[tensor-tensordict-compile] 0.4861ms 0.1179ms 8.4805 KOps/s 8.7881 KOps/s $\color{#d91a1a}-3.50\%$
test_compile_indexing[tensor-tensordict-eager] 0.6014ms 80.7422μs 12.3851 KOps/s 12.4136 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5126ms 0.1102ms 9.0715 KOps/s 9.6022 KOps/s $\textbf{\color{#d91a1a}-5.53\%}$
test_compile_indexing[tensor-tensorclass-eager] 2.7094ms 71.8635μs 13.9153 KOps/s 15.0180 KOps/s $\textbf{\color{#d91a1a}-7.34\%}$
test_compile_indexing[tensor-pytree-compile] 0.1597ms 0.1101ms 9.0827 KOps/s 9.5506 KOps/s $\color{#d91a1a}-4.90\%$
test_compile_indexing[tensor-pytree-eager] 0.2410ms 71.7396μs 13.9393 KOps/s 14.9660 KOps/s $\textbf{\color{#d91a1a}-6.86\%}$
test_compile_indexing[slice-tensordict-compile] 0.2434ms 0.1004ms 9.9632 KOps/s 10.1621 KOps/s $\color{#d91a1a}-1.96\%$
test_compile_indexing[slice-tensordict-eager] 0.1408ms 16.9706μs 58.9255 KOps/s 58.6836 KOps/s $\color{#35bf28}+0.41\%$
test_compile_indexing[slice-tensorclass-compile] 0.2640ms 98.9054μs 10.1107 KOps/s 10.5874 KOps/s $\color{#d91a1a}-4.50\%$
test_compile_indexing[slice-tensorclass-eager] 0.1647ms 16.4262μs 60.8784 KOps/s 64.7708 KOps/s $\textbf{\color{#d91a1a}-6.01\%}$
test_compile_indexing[slice-pytree-compile] 0.2850ms 98.6066μs 10.1413 KOps/s 10.5658 KOps/s $\color{#d91a1a}-4.02\%$
test_compile_indexing[slice-pytree-eager] 0.1126ms 15.8216μs 63.2046 KOps/s 64.9604 KOps/s $\color{#d91a1a}-2.70\%$
test_compile_indexing[int-tensordict-compile] 0.2804ms 0.1026ms 9.7506 KOps/s 10.0463 KOps/s $\color{#d91a1a}-2.94\%$
test_compile_indexing[int-tensordict-eager] 0.5518ms 16.7304μs 59.7715 KOps/s 59.1362 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[int-tensorclass-compile] 0.2520ms 96.1915μs 10.3959 KOps/s 10.4995 KOps/s $\color{#d91a1a}-0.99\%$
test_compile_indexing[int-tensorclass-eager] 96.3020μs 15.7474μs 63.5027 KOps/s 65.5357 KOps/s $\color{#d91a1a}-3.10\%$
test_compile_indexing[int-pytree-compile] 0.2233ms 95.9462μs 10.4225 KOps/s 10.5175 KOps/s $\color{#d91a1a}-0.90\%$
test_compile_indexing[int-pytree-eager] 92.8520μs 15.9271μs 62.7862 KOps/s 65.4563 KOps/s $\color{#d91a1a}-4.08\%$
test_mod_add[eager] 0.1873ms 37.5566μs 26.6265 KOps/s 26.0395 KOps/s $\color{#35bf28}+2.25\%$
test_mod_add[compile] 0.3390ms 78.4650μs 12.7445 KOps/s 12.5689 KOps/s $\color{#35bf28}+1.40\%$
test_mod_add[compile-overhead] 0.3216ms 0.1634ms 6.1183 KOps/s 5.7544 KOps/s $\textbf{\color{#35bf28}+6.33\%}$
test_mod_wrap[eager] 0.3778ms 0.2453ms 4.0772 KOps/s 3.9936 KOps/s $\color{#35bf28}+2.09\%$
test_mod_wrap[compile] 0.3974ms 0.2787ms 3.5879 KOps/s 3.5583 KOps/s $\color{#35bf28}+0.83\%$
test_mod_wrap[compile-overhead] 6.9500ms 3.7215ms 268.7102 Ops/s 265.9146 Ops/s $\color{#35bf28}+1.05\%$
test_mod_wrap_and_backward[eager] 1.5921ms 1.3996ms 714.4700 Ops/s 675.9173 Ops/s $\textbf{\color{#35bf28}+5.70\%}$
test_mod_wrap_and_backward[compile] 1.4128ms 1.2497ms 800.1712 Ops/s 730.0133 Ops/s $\textbf{\color{#35bf28}+9.61\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4289ms 0.9250ms 1.0811 KOps/s 954.4387 Ops/s $\textbf{\color{#35bf28}+13.27\%}$
test_seq_add[eager] 0.3274ms 0.1185ms 8.4361 KOps/s 8.3344 KOps/s $\color{#35bf28}+1.22\%$
test_seq_add[compile] 0.2448ms 87.9137μs 11.3748 KOps/s 11.0080 KOps/s $\color{#35bf28}+3.33\%$
test_seq_add[compile-overhead] 0.3290ms 0.1273ms 7.8580 KOps/s 7.8081 KOps/s $\color{#35bf28}+0.64\%$
test_seq_wrap[eager] 0.6148ms 0.4167ms 2.4000 KOps/s 2.3664 KOps/s $\color{#35bf28}+1.42\%$
test_seq_wrap[compile] 0.4658ms 0.2959ms 3.3801 KOps/s 3.3650 KOps/s $\color{#35bf28}+0.45\%$
test_seq_wrap[compile-overhead] 0.3636ms 0.2222ms 4.5005 KOps/s 4.3119 KOps/s $\color{#35bf28}+4.37\%$
test_func_call_runtime[False-eager] 0.9297ms 0.7559ms 1.3229 KOps/s 1.2526 KOps/s $\textbf{\color{#35bf28}+5.61\%}$
test_func_call_runtime[False-compile] 0.9456ms 0.7267ms 1.3761 KOps/s 1.3612 KOps/s $\color{#35bf28}+1.09\%$
test_func_call_runtime[False-compile-overhead] 0.4165ms 0.3586ms 2.7889 KOps/s 2.7698 KOps/s $\color{#35bf28}+0.69\%$
test_func_call_runtime[True-eager] 1.0391ms 0.9003ms 1.1107 KOps/s 1.1193 KOps/s $\color{#d91a1a}-0.77\%$
test_func_call_runtime[True-compile] 0.9677ms 0.7552ms 1.3242 KOps/s 1.3310 KOps/s $\color{#d91a1a}-0.51\%$
test_func_call_runtime[True-compile-overhead] 0.5184ms 0.3778ms 2.6469 KOps/s 2.6199 KOps/s $\color{#35bf28}+1.03\%$
test_func_call_cm_runtime[False-eager] 0.8441ms 0.7175ms 1.3937 KOps/s 1.2641 KOps/s $\textbf{\color{#35bf28}+10.25\%}$
test_func_call_cm_runtime[False-compile] 0.9000ms 0.7343ms 1.3619 KOps/s 1.2523 KOps/s $\textbf{\color{#35bf28}+8.75\%}$
test_func_call_cm_runtime[False-compile-overhead] 0.5090ms 0.3622ms 2.7607 KOps/s 2.7358 KOps/s $\color{#35bf28}+0.91\%$
test_func_call_cm_runtime[True-eager] 1.0861ms 0.9852ms 1.0150 KOps/s 934.9338 Ops/s $\textbf{\color{#35bf28}+8.57\%}$
test_func_call_cm_runtime[True-compile] 0.9376ms 0.7807ms 1.2810 KOps/s 1.2645 KOps/s $\color{#35bf28}+1.31\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5052ms 0.4084ms 2.4488 KOps/s 2.4433 KOps/s $\color{#35bf28}+0.23\%$
test_vmap_func_call_cm_runtime[eager] 2.6843ms 2.1460ms 465.9733 Ops/s 474.7593 Ops/s $\color{#d91a1a}-1.85\%$
test_vmap_func_call_cm_runtime[compile] 1.0389ms 0.8502ms 1.1762 KOps/s 1.1508 KOps/s $\color{#35bf28}+2.20\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5343ms 0.4069ms 2.4573 KOps/s 2.4490 KOps/s $\color{#35bf28}+0.34\%$
test_distributed 4.6210ms 0.2414ms 4.1428 KOps/s 8.1015 KOps/s $\textbf{\color{#d91a1a}-48.86\%}$
test_tdmodule 57.5410μs 20.4778μs 48.8334 KOps/s 50.1350 KOps/s $\color{#d91a1a}-2.60\%$
test_tdmodule_dispatch 78.1110μs 36.4521μs 27.4332 KOps/s 27.4947 KOps/s $\color{#d91a1a}-0.22\%$
test_tdseq 58.5310μs 21.0201μs 47.5734 KOps/s 46.7700 KOps/s $\color{#35bf28}+1.72\%$
test_tdseq_dispatch 69.6010μs 39.1402μs 25.5492 KOps/s 25.3246 KOps/s $\color{#35bf28}+0.89\%$
test_instantiation_functorch 1.6474ms 1.5321ms 652.7131 Ops/s 657.8127 Ops/s $\color{#d91a1a}-0.78\%$
test_exec_functorch 0.1996ms 0.1464ms 6.8286 KOps/s 7.0815 KOps/s $\color{#d91a1a}-3.57\%$
test_exec_functional_call 0.2803ms 0.1403ms 7.1297 KOps/s 7.3007 KOps/s $\color{#d91a1a}-2.34\%$
test_exec_td_decorator 0.3859ms 0.1869ms 5.3515 KOps/s 5.4101 KOps/s $\color{#d91a1a}-1.08\%$
test_vmap_mlp_speed_decorator[True-True] 0.8357ms 0.6811ms 1.4682 KOps/s 1.4658 KOps/s $\color{#35bf28}+0.17\%$
test_vmap_mlp_speed_decorator[True-False] 0.8126ms 0.6826ms 1.4651 KOps/s 1.4634 KOps/s $\color{#35bf28}+0.12\%$
test_vmap_mlp_speed_decorator[False-True] 0.7365ms 0.6180ms 1.6181 KOps/s 1.5969 KOps/s $\color{#35bf28}+1.33\%$
test_vmap_mlp_speed_decorator[False-False] 0.7747ms 0.6206ms 1.6114 KOps/s 1.5894 KOps/s $\color{#35bf28}+1.38\%$
test_vmap_transformer_speed_decorator[True-True] 19.1584ms 19.0321ms 52.5428 Ops/s 52.2756 Ops/s $\color{#35bf28}+0.51\%$
test_vmap_transformer_speed_decorator[True-False] 19.1677ms 19.0325ms 52.5416 Ops/s 51.8572 Ops/s $\color{#35bf28}+1.32\%$
test_vmap_transformer_speed_decorator[False-True] 19.7835ms 18.9602ms 52.7421 Ops/s 52.6647 Ops/s $\color{#35bf28}+0.15\%$
test_vmap_transformer_speed_decorator[False-False] 19.3346ms 18.9270ms 52.8345 Ops/s 52.6136 Ops/s $\color{#35bf28}+0.42\%$
test_to_module_speed[True] 1.6296ms 0.9638ms 1.0376 KOps/s 1.0408 KOps/s $\color{#d91a1a}-0.31\%$
test_to_module_speed[False] 1.4790ms 0.9439ms 1.0594 KOps/s 1.0500 KOps/s $\color{#35bf28}+0.90\%$
test_tc_init 74.2220μs 36.6187μs 27.3084 KOps/s 27.0592 KOps/s $\color{#35bf28}+0.92\%$
test_tc_init_nested 0.2272ms 75.4294μs 13.2574 KOps/s 13.6972 KOps/s $\color{#d91a1a}-3.21\%$
test_tc_first_layer_tensor 4.8900μs 0.6945μs 1.4400 MOps/s 1.2373 MOps/s $\textbf{\color{#35bf28}+16.38\%}$
test_tc_first_layer_nontensor 37.2200μs 2.2313μs 448.1684 KOps/s 444.6312 KOps/s $\color{#35bf28}+0.80\%$
test_tc_second_layer_tensor 7.6550μs 1.4085μs 709.9756 KOps/s 700.6774 KOps/s $\color{#35bf28}+1.33\%$
test_tc_second_layer_nontensor 40.6800μs 2.9624μs 337.5594 KOps/s 334.0619 KOps/s $\color{#35bf28}+1.05\%$
test_unbind 0.2292s 9.9001ms 101.0086 Ops/s 145.2471 Ops/s $\textbf{\color{#d91a1a}-30.46\%}$
test_full_like 10.5238ms 9.4334ms 106.0061 Ops/s 105.5727 Ops/s $\color{#35bf28}+0.41\%$
test_zeros_like 5.2722ms 4.3482ms 229.9820 Ops/s 233.6216 Ops/s $\color{#d91a1a}-1.56\%$
test_ones_like 5.3862ms 4.2632ms 234.5671 Ops/s 232.0524 Ops/s $\color{#35bf28}+1.08\%$
test_clone 7.0348ms 6.5907ms 151.7279 Ops/s 150.9094 Ops/s $\color{#35bf28}+0.54\%$
test_squeeze 0.1546ms 9.8031μs 102.0086 KOps/s 104.6865 KOps/s $\color{#d91a1a}-2.56\%$
test_unsqueeze 0.2628ms 73.5846μs 13.5898 KOps/s 13.3166 KOps/s $\color{#35bf28}+2.05\%$
test_split 0.4078ms 0.1693ms 5.9060 KOps/s 6.1728 KOps/s $\color{#d91a1a}-4.32\%$
test_permute 0.3642ms 0.1858ms 5.3823 KOps/s 5.4279 KOps/s $\color{#d91a1a}-0.84\%$
test_stack 51.6608ms 50.9426ms 19.6299 Ops/s 19.3348 Ops/s $\color{#35bf28}+1.53\%$
test_cat 51.5917ms 50.7316ms 19.7116 Ops/s 19.5790 Ops/s $\color{#35bf28}+0.68\%$
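
The last column can be reproduced from the two throughput columns. The sketch below assumes (the column header is not shown in this part of the table) that the fourth column is the throughput measured on this PR and the fifth is the throughput on the base branch, so that the change is `(pr - base) / base`. The helper names `to_ops` and `relative_change` are hypothetical and only illustrate the arithmetic; the three sample rows are copied from the table above, including one where the two columns use different units.

```python
# Multipliers for the throughput units as they are printed in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops(value: float, unit: str) -> float:
    """Convert a printed throughput such as (1.0811, 'KOps/s') to plain Ops/s."""
    return value * UNITS[unit]


def relative_change(pr_ops: float, base_ops: float) -> float:
    """Relative change in percent, assuming the base branch is the reference."""
    return (pr_ops - base_ops) / base_ops * 100.0


# (PR throughput, base throughput) pairs copied from three rows of the table.
rows = {
    "test_common_ops": ((1.5713, "KOps/s"), (1.6045, "KOps/s")),
    "test_distributed": ((4.1428, "KOps/s"), (8.1015, "KOps/s")),
    # Mixed units: KOps/s on the PR side, Ops/s on the base side.
    "test_mod_wrap_and_backward[compile-overhead]": (
        (1.0811, "KOps/s"),
        (954.4387, "Ops/s"),
    ),
}

changes = {
    name: round(relative_change(to_ops(*pr), to_ops(*base)), 2)
    for name, (pr, base) in rows.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

Under that assumption the recomputed values agree with the reported ones for these rows (about -2.07%, -48.86% and +13.27%), which suggests that a negative entry means lower throughput on this PR than on the base branch.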

vmoens force-pushed the composite_lp_aggregate branch from 8c1335e to 04e1c1d on January 14, 2025 15:24
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Deprecation: Announces or enacts a deprecation
Refactor: Refactoring code - not a new feature