[BugFix] fix inline TDParams kwargs for nontensordata #1094

Merged: 1 commit, Nov 20, 2024

@vmoens vmoens commented Nov 20, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 20, 2024
ghstack-source-id: afd50385b6b1e8bd8ccfaabfa387ca5611ca07e2
Pull Request resolved: #1094
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 20, 2024
@vmoens vmoens merged commit 89a8a69 into gh/vmoens/35/base Nov 20, 2024
10 of 24 checks passed
@vmoens vmoens deleted the gh/vmoens/35/head branch November 20, 2024 09:33

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}19$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 37.0090μs 17.7520μs 56.3317 KOps/s 56.6728 KOps/s $\color{#d91a1a}-0.60\%$
test_plain_set_stack_nested 38.1610μs 17.9027μs 55.8576 KOps/s 56.0605 KOps/s $\color{#d91a1a}-0.36\%$
test_plain_set_nested_inplace 53.9710μs 19.6773μs 50.8199 KOps/s 51.4315 KOps/s $\color{#d91a1a}-1.19\%$
test_plain_set_stack_nested_inplace 46.7370μs 19.5239μs 51.2194 KOps/s 51.2376 KOps/s $\color{#d91a1a}-0.04\%$
test_items 27.7320μs 4.1555μs 240.6442 KOps/s 242.0542 KOps/s $\color{#d91a1a}-0.58\%$
test_items_nested 0.6553ms 0.3462ms 2.8887 KOps/s 2.9366 KOps/s $\color{#d91a1a}-1.63\%$
test_items_nested_locked 0.7224ms 0.3424ms 2.9207 KOps/s 2.9353 KOps/s $\color{#d91a1a}-0.50\%$
test_items_nested_leaf 0.1336ms 72.9782μs 13.7027 KOps/s 13.9955 KOps/s $\color{#d91a1a}-2.09\%$
test_items_stack_nested 0.6515ms 0.3451ms 2.8975 KOps/s 2.8984 KOps/s $\color{#d91a1a}-0.03\%$
test_items_stack_nested_leaf 0.1348ms 75.1140μs 13.3131 KOps/s 13.4566 KOps/s $\color{#d91a1a}-1.07\%$
test_items_stack_nested_locked 0.7787ms 0.3445ms 2.9028 KOps/s 2.9139 KOps/s $\color{#d91a1a}-0.38\%$
test_keys 25.8590μs 3.4889μs 286.6248 KOps/s 286.4449 KOps/s $\color{#35bf28}+0.06\%$
test_keys_nested 0.2405ms 0.1383ms 7.2301 KOps/s 7.2799 KOps/s $\color{#d91a1a}-0.68\%$
test_keys_nested_locked 1.7114ms 0.1451ms 6.8909 KOps/s 7.0037 KOps/s $\color{#d91a1a}-1.61\%$
test_keys_nested_leaf 0.2110ms 0.1196ms 8.3581 KOps/s 8.5132 KOps/s $\color{#d91a1a}-1.82\%$
test_keys_stack_nested 0.2597ms 0.1379ms 7.2491 KOps/s 7.2232 KOps/s $\color{#35bf28}+0.36\%$
test_keys_stack_nested_leaf 0.1871ms 0.1185ms 8.4413 KOps/s 8.5027 KOps/s $\color{#d91a1a}-0.72\%$
test_keys_stack_nested_locked 0.2597ms 0.1438ms 6.9523 KOps/s 7.0041 KOps/s $\color{#d91a1a}-0.74\%$
test_values 6.3798μs 1.0263μs 974.3459 KOps/s 937.3199 KOps/s $\color{#35bf28}+3.95\%$
test_values_nested 0.1111ms 55.4638μs 18.0298 KOps/s 17.8859 KOps/s $\color{#35bf28}+0.80\%$
test_values_nested_locked 0.1047ms 55.2826μs 18.0889 KOps/s 17.8541 KOps/s $\color{#35bf28}+1.32\%$
test_values_nested_leaf 0.1169ms 60.7958μs 16.4485 KOps/s 16.4375 KOps/s $\color{#35bf28}+0.07\%$
test_values_stack_nested 0.1032ms 56.4163μs 17.7254 KOps/s 17.3877 KOps/s $\color{#35bf28}+1.94\%$
test_values_stack_nested_leaf 0.1052ms 60.1247μs 16.6321 KOps/s 15.4188 KOps/s $\textbf{\color{#35bf28}+7.87\%}$
test_values_stack_nested_locked 0.1032ms 56.8328μs 17.5955 KOps/s 17.4397 KOps/s $\color{#35bf28}+0.89\%$
test_membership 4.1249μs 0.7183μs 1.3921 MOps/s 1.1267 MOps/s $\textbf{\color{#35bf28}+23.55\%}$
test_membership_nested 34.3040μs 2.7870μs 358.8136 KOps/s 371.3407 KOps/s $\color{#d91a1a}-3.37\%$
test_membership_nested_leaf 30.5280μs 2.7878μs 358.7104 KOps/s 370.0529 KOps/s $\color{#d91a1a}-3.07\%$
test_membership_stacked_nested 24.5560μs 2.7432μs 364.5338 KOps/s 364.6517 KOps/s $\color{#d91a1a}-0.03\%$
test_membership_stacked_nested_leaf 24.5460μs 2.7414μs 364.7750 KOps/s 363.4391 KOps/s $\color{#35bf28}+0.37\%$
test_membership_nested_last 32.3000μs 4.1245μs 242.4516 KOps/s 243.6994 KOps/s $\color{#d91a1a}-0.51\%$
test_membership_nested_leaf_last 27.0400μs 4.1569μs 240.5625 KOps/s 242.2147 KOps/s $\color{#d91a1a}-0.68\%$
test_membership_stacked_nested_last 39.4130μs 12.8748μs 77.6712 KOps/s 245.5783 KOps/s $\textbf{\color{#d91a1a}-68.37\%}$
test_membership_stacked_nested_leaf_last 34.9860μs 13.0322μs 76.7331 KOps/s 243.7735 KOps/s $\textbf{\color{#d91a1a}-68.52\%}$
test_nested_getleaf 34.0240μs 10.6295μs 94.0779 KOps/s 93.8108 KOps/s $\color{#35bf28}+0.28\%$
test_nested_get 36.8090μs 10.0393μs 99.6089 KOps/s 99.4074 KOps/s $\color{#35bf28}+0.20\%$
test_stacked_getleaf 35.4960μs 10.3345μs 96.7631 KOps/s 95.3637 KOps/s $\color{#35bf28}+1.47\%$
test_stacked_get 33.4420μs 9.9298μs 100.7069 KOps/s 101.0526 KOps/s $\color{#d91a1a}-0.34\%$
test_nested_getitemleaf 34.8650μs 11.1168μs 89.9543 KOps/s 90.9887 KOps/s $\color{#d91a1a}-1.14\%$
test_nested_getitem 35.5160μs 10.3914μs 96.2334 KOps/s 97.6253 KOps/s $\color{#d91a1a}-1.43\%$
test_stacked_getitemleaf 48.3310μs 10.9273μs 91.5141 KOps/s 92.5101 KOps/s $\color{#d91a1a}-1.08\%$
test_stacked_getitem 44.2520μs 10.1602μs 98.4231 KOps/s 97.6442 KOps/s $\color{#35bf28}+0.80\%$
test_lock_nested 2.9651ms 0.4515ms 2.2147 KOps/s 1.8672 KOps/s $\textbf{\color{#35bf28}+18.61\%}$
test_lock_stack_nested 0.6109ms 0.4035ms 2.4781 KOps/s 2.4462 KOps/s $\color{#35bf28}+1.30\%$
test_unlock_nested 0.6626ms 0.3614ms 2.7670 KOps/s 2.7622 KOps/s $\color{#35bf28}+0.17\%$
test_unlock_stack_nested 0.4618ms 0.3213ms 3.1120 KOps/s 3.0575 KOps/s $\color{#35bf28}+1.78\%$
test_flatten_speed 0.1635ms 94.4022μs 10.5930 KOps/s 10.8739 KOps/s $\color{#d91a1a}-2.58\%$
test_unflatten_speed 0.8478ms 0.4754ms 2.1036 KOps/s 2.1476 KOps/s $\color{#d91a1a}-2.05\%$
test_common_ops 1.4862ms 0.7544ms 1.3256 KOps/s 1.3152 KOps/s $\color{#35bf28}+0.79\%$
test_creation 0.3065ms 2.2547μs 443.5157 KOps/s 487.0012 KOps/s $\textbf{\color{#d91a1a}-8.93\%}$
test_creation_empty 37.4900μs 10.5392μs 94.8836 KOps/s 94.2695 KOps/s $\color{#35bf28}+0.65\%$
test_creation_nested_1 52.1880μs 13.2683μs 75.3674 KOps/s 73.4275 KOps/s $\color{#35bf28}+2.64\%$
test_creation_nested_2 56.6150μs 17.4336μs 57.3605 KOps/s 57.6704 KOps/s $\color{#d91a1a}-0.54\%$
test_clone 61.2050μs 13.5815μs 73.6298 KOps/s 74.6728 KOps/s $\color{#d91a1a}-1.40\%$
test_getitem[int] 1.4190ms 13.0664μs 76.5319 KOps/s 79.5977 KOps/s $\color{#d91a1a}-3.85\%$
test_getitem[slice_int] 0.1464ms 25.9475μs 38.5394 KOps/s 42.3380 KOps/s $\textbf{\color{#d91a1a}-8.97\%}$
test_getitem[range] 0.1675ms 48.4966μs 20.6200 KOps/s 20.9092 KOps/s $\color{#d91a1a}-1.38\%$
test_getitem[tuple] 0.1323ms 20.4365μs 48.9320 KOps/s 51.1198 KOps/s $\color{#d91a1a}-4.28\%$
test_getitem[list] 0.1772ms 43.4034μs 23.0397 KOps/s 23.5433 KOps/s $\color{#d91a1a}-2.14\%$
test_setitem_dim[int] 46.9580μs 26.3973μs 37.8826 KOps/s 40.9015 KOps/s $\textbf{\color{#d91a1a}-7.38\%}$
test_setitem_dim[slice_int] 91.4800μs 52.2595μs 19.1353 KOps/s 20.1846 KOps/s $\textbf{\color{#d91a1a}-5.20\%}$
test_setitem_dim[range] 0.1238ms 74.3541μs 13.4492 KOps/s 13.6913 KOps/s $\color{#d91a1a}-1.77\%$
test_setitem_dim[tuple] 61.8760μs 41.7388μs 23.9585 KOps/s 25.4084 KOps/s $\textbf{\color{#d91a1a}-5.71\%}$
test_setitem 64.1910μs 20.8273μs 48.0140 KOps/s 50.2506 KOps/s $\color{#d91a1a}-4.45\%$
test_set 79.0780μs 20.3530μs 49.1329 KOps/s 51.5550 KOps/s $\color{#d91a1a}-4.70\%$
test_set_shared 1.1704ms 0.1658ms 6.0300 KOps/s 5.9534 KOps/s $\color{#35bf28}+1.29\%$
test_update 0.1717ms 22.7938μs 43.8715 KOps/s 46.0754 KOps/s $\color{#d91a1a}-4.78\%$
test_update_nested 0.1341ms 33.0968μs 30.2144 KOps/s 31.9383 KOps/s $\textbf{\color{#d91a1a}-5.40\%}$
test_update__nested 0.5145ms 34.0952μs 29.3296 KOps/s 31.5758 KOps/s $\textbf{\color{#d91a1a}-7.11\%}$
test_set_nested 78.3270μs 22.1161μs 45.2160 KOps/s 47.1209 KOps/s $\color{#d91a1a}-4.04\%$
test_set_nested_new 77.7450μs 26.5875μs 37.6117 KOps/s 38.6688 KOps/s $\color{#d91a1a}-2.73\%$
test_select 0.1035ms 43.0398μs 23.2343 KOps/s 24.3109 KOps/s $\color{#d91a1a}-4.43\%$
test_select_nested 0.1306ms 60.2383μs 16.6007 KOps/s 16.9008 KOps/s $\color{#d91a1a}-1.78\%$
test_exclude_nested 0.1481ms 75.9512μs 13.1663 KOps/s 13.4875 KOps/s $\color{#d91a1a}-2.38\%$
test_empty[True] 0.6671ms 0.3580ms 2.7934 KOps/s 2.8680 KOps/s $\color{#d91a1a}-2.60\%$
test_empty[False] 6.8678μs 1.2547μs 796.9910 KOps/s 751.9772 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_unbind_speed 0.4536ms 0.2657ms 3.7638 KOps/s 3.8766 KOps/s $\color{#d91a1a}-2.91\%$
test_unbind_speed_stack0 0.4743ms 0.2534ms 3.9464 KOps/s 3.8842 KOps/s $\color{#35bf28}+1.60\%$
test_unbind_speed_stack1 0.1018s 0.7415ms 1.3486 KOps/s 1.4292 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_split 97.2564ms 1.7557ms 569.5803 Ops/s 585.0566 Ops/s $\color{#d91a1a}-2.65\%$
test_chunk 97.6962ms 1.7811ms 561.4586 Ops/s 584.3735 Ops/s $\color{#d91a1a}-3.92\%$
test_consolidate_njt[False-None] 8.8027ms 8.3389ms 119.9203 Ops/s 122.8384 Ops/s $\color{#d91a1a}-2.38\%$
test_creation[device0] 0.2325ms 90.9110μs 10.9998 KOps/s 11.0203 KOps/s $\color{#d91a1a}-0.19\%$
test_creation_from_tensor 4.1418ms 93.8475μs 10.6556 KOps/s 10.4600 KOps/s $\color{#35bf28}+1.87\%$
test_add_one[memmap_tensor0] 0.1794ms 4.8428μs 206.4930 KOps/s 201.7856 KOps/s $\color{#35bf28}+2.33\%$
test_contiguous[memmap_tensor0] 16.6110μs 0.5236μs 1.9100 MOps/s 1.8413 MOps/s $\color{#35bf28}+3.73\%$
test_stack[memmap_tensor0] 34.7950μs 3.3854μs 295.3837 KOps/s 295.3992 KOps/s $-0.01\%$
test_memmaptd_index 1.1420ms 0.2406ms 4.1561 KOps/s 4.2184 KOps/s $\color{#d91a1a}-1.48\%$
test_memmaptd_index_astensor 0.5697ms 0.3106ms 3.2191 KOps/s 3.1817 KOps/s $\color{#35bf28}+1.18\%$
test_memmaptd_index_op 1.1017ms 0.5860ms 1.7065 KOps/s 1.7120 KOps/s $\color{#d91a1a}-0.32\%$
test_serialize_model 0.1263s 0.1133s 8.8286 Ops/s 7.6138 Ops/s $\textbf{\color{#35bf28}+15.95\%}$
test_serialize_model_pickle 0.4485s 0.3850s 2.5972 Ops/s 2.6046 Ops/s $\color{#d91a1a}-0.28\%$
test_serialize_weights 0.2206s 0.1260s 7.9381 Ops/s 8.9509 Ops/s $\textbf{\color{#d91a1a}-11.31\%}$
test_serialize_weights_returnearly 0.1708s 0.1558s 6.4188 Ops/s 6.4509 Ops/s $\color{#d91a1a}-0.50\%$
test_serialize_weights_pickle 0.4639s 0.3932s 2.5431 Ops/s 2.3790 Ops/s $\textbf{\color{#35bf28}+6.90\%}$
test_serialize_weights_filesystem 0.1457s 0.1390s 7.1953 Ops/s 7.0681 Ops/s $\color{#35bf28}+1.80\%$
test_serialize_model_filesystem 0.1590s 0.1485s 6.7344 Ops/s 6.6834 Ops/s $\color{#35bf28}+0.76\%$
test_reshape_pytree 59.5620μs 26.8048μs 37.3067 KOps/s 38.2093 KOps/s $\color{#d91a1a}-2.36\%$
test_reshape_td 71.8240μs 33.1685μs 30.1491 KOps/s 31.7874 KOps/s $\textbf{\color{#d91a1a}-5.15\%}$
test_view_pytree 77.8750μs 27.2072μs 36.7550 KOps/s 38.0645 KOps/s $\color{#d91a1a}-3.44\%$
test_view_td 93.2040μs 39.1165μs 25.5646 KOps/s 26.8961 KOps/s $\color{#d91a1a}-4.95\%$
test_unbind_pytree 71.7640μs 30.0847μs 33.2395 KOps/s 34.1562 KOps/s $\color{#d91a1a}-2.68\%$
test_unbind_td 0.3371ms 38.8122μs 25.7651 KOps/s 26.6111 KOps/s $\color{#d91a1a}-3.18\%$
test_split_pytree 64.5300μs 30.0912μs 33.2324 KOps/s 34.0020 KOps/s $\color{#d91a1a}-2.26\%$
test_split_td 98.8497ms 53.8630μs 18.5656 KOps/s 23.0271 KOps/s $\textbf{\color{#d91a1a}-19.37\%}$
test_add_pytree 0.1096ms 35.6266μs 28.0689 KOps/s 28.2344 KOps/s $\color{#d91a1a}-0.59\%$
test_add_td 0.1158ms 57.1013μs 17.5127 KOps/s 18.2684 KOps/s $\color{#d91a1a}-4.14\%$
test_compile_add_one_nested[tensordict-compile] 0.1238ms 60.6255μs 16.4947 KOps/s 16.0904 KOps/s $\color{#35bf28}+2.51\%$
test_compile_add_one_nested[tensordict-eager] 0.3498ms 0.1616ms 6.1900 KOps/s 6.2146 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_add_one_nested[pytree-compile] 0.1101ms 44.9390μs 22.2524 KOps/s 21.7970 KOps/s $\color{#35bf28}+2.09\%$
test_compile_add_one_nested[pytree-eager] 0.2552ms 0.1206ms 8.2910 KOps/s 8.4459 KOps/s $\color{#d91a1a}-1.83\%$
test_compile_copy_nested[tensordict-compile] 77.3650μs 25.7046μs 38.9036 KOps/s 38.2603 KOps/s $\color{#35bf28}+1.68\%$
test_compile_copy_nested[tensordict-eager] 0.1390ms 54.1371μs 18.4716 KOps/s 18.7861 KOps/s $\color{#d91a1a}-1.67\%$
test_compile_copy_nested[pytree-compile] 0.1625ms 78.5990μs 12.7228 KOps/s 12.9707 KOps/s $\color{#d91a1a}-1.91\%$
test_compile_copy_nested[pytree-eager] 0.1866ms 67.7762μs 14.7544 KOps/s 14.8381 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_add_one_flat[tensordict-compile] 0.6675ms 0.1171ms 8.5419 KOps/s 9.5116 KOps/s $\textbf{\color{#d91a1a}-10.19\%}$
test_compile_add_one_flat[tensordict-eager] 0.3156ms 0.2002ms 4.9952 KOps/s 5.0237 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_add_one_flat[tensorclass-compile] 0.1139ms 44.9107μs 22.2664 KOps/s 22.2148 KOps/s $\color{#35bf28}+0.23\%$
test_compile_add_one_flat[tensorclass-eager] 0.4625ms 63.0909μs 15.8501 KOps/s 16.0873 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_add_one_flat[pytree-compile] 0.1747ms 0.1029ms 9.7218 KOps/s 9.8336 KOps/s $\color{#d91a1a}-1.14\%$
test_compile_add_one_flat[pytree-eager] 0.3695ms 0.2013ms 4.9668 KOps/s 4.9718 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_add_self_flat[tensordict-eager] 0.4119ms 0.2111ms 4.7371 KOps/s 4.7699 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_add_self_flat[tensordict-compile] 0.1742ms 0.1043ms 9.5916 KOps/s 9.5945 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_add_self_flat[tensorclass-eager] 0.1253ms 56.1689μs 17.8034 KOps/s 18.4749 KOps/s $\color{#d91a1a}-3.63\%$
test_compile_add_self_flat[tensorclass-compile] 0.1006ms 47.0802μs 21.2403 KOps/s 21.7492 KOps/s $\color{#d91a1a}-2.34\%$
test_compile_add_self_flat[pytree-eager] 5.2487ms 0.1615ms 6.1934 KOps/s 6.2746 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_add_self_flat[pytree-compile] 0.1560ms 0.1033ms 9.6841 KOps/s 9.7020 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_copy_flat[tensordict-compile] 61.5550μs 21.1617μs 47.2553 KOps/s 47.2371 KOps/s $\color{#35bf28}+0.04\%$
test_compile_copy_flat[tensordict-eager] 0.1169ms 59.0804μs 16.9261 KOps/s 17.2456 KOps/s $\color{#d91a1a}-1.85\%$
test_compile_copy_flat[pytree-compile] 0.1868ms 81.4539μs 12.2769 KOps/s 12.4224 KOps/s $\color{#d91a1a}-1.17\%$
test_compile_copy_flat[pytree-eager] 0.1598ms 68.8312μs 14.5283 KOps/s 14.7390 KOps/s $\color{#d91a1a}-1.43\%$
test_compile_assign_and_add[tensordict-compile] 0.4090ms 0.2084ms 4.7979 KOps/s 4.8527 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_assign_and_add[tensordict-eager] 1.5040ms 1.2917ms 774.1574 Ops/s 790.0747 Ops/s $\color{#d91a1a}-2.01\%$
test_compile_assign_and_add[pytree-compile] 0.2676ms 0.2025ms 4.9383 KOps/s 5.0028 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_assign_and_add[pytree-eager] 1.2210ms 0.7779ms 1.2854 KOps/s 1.2783 KOps/s $\color{#35bf28}+0.56\%$
test_compile_assign_and_add_stack[compile] 0.7915ms 0.4591ms 2.1780 KOps/s 2.2224 KOps/s $\color{#d91a1a}-2.00\%$
test_compile_assign_and_add_stack[eager] 3.1473ms 2.6895ms 371.8224 Ops/s 398.0245 Ops/s $\textbf{\color{#d91a1a}-6.58\%}$
test_compile_indexing[tensor-tensordict-compile] 81.2720μs 36.9325μs 27.0764 KOps/s 27.3716 KOps/s $\color{#d91a1a}-1.08\%$
test_compile_indexing[tensor-tensordict-eager] 0.5251ms 32.3253μs 30.9355 KOps/s 30.2889 KOps/s $\color{#35bf28}+2.13\%$
test_compile_indexing[tensor-tensorclass-compile] 66.6340μs 29.0325μs 34.4441 KOps/s 33.5857 KOps/s $\color{#35bf28}+2.56\%$
test_compile_indexing[tensor-tensorclass-eager] 67.7160μs 23.4552μs 42.6345 KOps/s 42.0748 KOps/s $\color{#35bf28}+1.33\%$
test_compile_indexing[tensor-pytree-compile] 85.3900μs 30.3061μs 32.9966 KOps/s 32.7479 KOps/s $\color{#35bf28}+0.76\%$
test_compile_indexing[tensor-pytree-eager] 61.9360μs 23.2403μs 43.0286 KOps/s 43.2489 KOps/s $\color{#d91a1a}-0.51\%$
test_compile_indexing[slice-tensordict-compile] 0.1147ms 51.3574μs 19.4714 KOps/s 19.6150 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_indexing[slice-tensordict-eager] 0.5542ms 19.6532μs 50.8823 KOps/s 51.1257 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_indexing[slice-tensorclass-compile] 0.1093ms 43.9207μs 22.7683 KOps/s 22.9335 KOps/s $\color{#d91a1a}-0.72\%$
test_compile_indexing[slice-tensorclass-eager] 47.1880μs 19.2276μs 52.0086 KOps/s 53.3233 KOps/s $\color{#d91a1a}-2.47\%$
test_compile_indexing[slice-pytree-compile] 0.1061ms 45.1567μs 22.1451 KOps/s 22.4159 KOps/s $\color{#d91a1a}-1.21\%$
test_compile_indexing[slice-pytree-eager] 52.7980μs 19.1991μs 52.0859 KOps/s 53.3309 KOps/s $\color{#d91a1a}-2.33\%$
test_compile_indexing[int-tensordict-compile] 0.1105ms 53.4919μs 18.6944 KOps/s 19.0999 KOps/s $\color{#d91a1a}-2.12\%$
test_compile_indexing[int-tensordict-eager] 0.9614ms 19.8197μs 50.4548 KOps/s 51.7014 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_indexing[int-tensorclass-compile] 96.0800μs 45.0097μs 22.2174 KOps/s 22.4351 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_indexing[int-tensorclass-eager] 58.4700μs 19.2800μs 51.8671 KOps/s 53.7606 KOps/s $\color{#d91a1a}-3.52\%$
test_compile_indexing[int-pytree-compile] 94.5570μs 45.3578μs 22.0469 KOps/s 22.6201 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_indexing[int-pytree-eager] 58.3490μs 19.0509μs 52.4910 KOps/s 53.8578 KOps/s $\color{#d91a1a}-2.54\%$
test_mod_add[eager] 75.5920μs 27.3978μs 36.4992 KOps/s 38.3044 KOps/s $\color{#d91a1a}-4.71\%$
test_mod_add[compile] 90.5990μs 44.9624μs 22.2408 KOps/s 22.3350 KOps/s $\color{#d91a1a}-0.42\%$
test_mod_add[compile-overhead] 87.6640μs 45.0833μs 22.1811 KOps/s 22.4350 KOps/s $\color{#d91a1a}-1.13\%$
test_mod_wrap[eager] 0.6719ms 0.2157ms 4.6369 KOps/s 4.6332 KOps/s $\color{#35bf28}+0.08\%$
test_mod_wrap[compile] 1.2886ms 0.2037ms 4.9103 KOps/s 4.8632 KOps/s $\color{#35bf28}+0.97\%$
test_mod_wrap[compile-overhead] 1.6626ms 0.2030ms 4.9269 KOps/s 4.8199 KOps/s $\color{#35bf28}+2.22\%$
test_mod_wrap_and_backward[eager] 11.8140ms 10.8343ms 92.2996 Ops/s 91.7444 Ops/s $\color{#35bf28}+0.61\%$
test_mod_wrap_and_backward[compile] 12.4797ms 10.7247ms 93.2427 Ops/s 94.3835 Ops/s $\color{#d91a1a}-1.21\%$
test_mod_wrap_and_backward[compile-overhead] 11.9697ms 10.6441ms 93.9487 Ops/s 95.0709 Ops/s $\color{#d91a1a}-1.18\%$
test_seq_add[eager] 0.2001ms 94.0605μs 10.6315 KOps/s 10.9676 KOps/s $\color{#d91a1a}-3.06\%$
test_seq_add[compile] 0.1199ms 60.2075μs 16.6092 KOps/s 16.7392 KOps/s $\color{#d91a1a}-0.78\%$
test_seq_add[compile-overhead] 0.1115ms 58.2277μs 17.1740 KOps/s 16.9321 KOps/s $\color{#35bf28}+1.43\%$
test_seq_wrap[eager] 0.5259ms 0.3873ms 2.5820 KOps/s 2.5365 KOps/s $\color{#35bf28}+1.79\%$
test_seq_wrap[compile] 0.3517ms 0.2282ms 4.3819 KOps/s 4.3120 KOps/s $\color{#35bf28}+1.62\%$
test_seq_wrap[compile-overhead] 0.4314ms 0.2242ms 4.4598 KOps/s 4.3856 KOps/s $\color{#35bf28}+1.69\%$
test_func_call_runtime[False-eager] 0.9538ms 0.5491ms 1.8213 KOps/s 1.7847 KOps/s $\color{#35bf28}+2.05\%$
test_func_call_runtime[False-compile] 0.5682ms 0.4245ms 2.3556 KOps/s 2.3282 KOps/s $\color{#35bf28}+1.18\%$
test_func_call_runtime[False-compile-overhead] 0.7560ms 0.4268ms 2.3432 KOps/s 2.3184 KOps/s $\color{#35bf28}+1.07\%$
test_func_call_runtime[True-eager] 0.8700ms 0.7499ms 1.3335 KOps/s 1.2705 KOps/s $\color{#35bf28}+4.96\%$
test_func_call_runtime[True-compile] 0.9526ms 0.4657ms 2.1473 KOps/s 2.1399 KOps/s $\color{#35bf28}+0.35\%$
test_func_call_runtime[True-compile-overhead] 0.8449ms 0.4674ms 2.1393 KOps/s 2.1493 KOps/s $\color{#d91a1a}-0.46\%$
test_func_call_cm_runtime[False-eager] 0.8949ms 0.5436ms 1.8397 KOps/s 1.7506 KOps/s $\textbf{\color{#35bf28}+5.09\%}$
test_func_call_cm_runtime[False-compile] 0.8564ms 0.4240ms 2.3586 KOps/s 2.3458 KOps/s $\color{#35bf28}+0.54\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5842ms 0.4241ms 2.3580 KOps/s 2.3408 KOps/s $\color{#35bf28}+0.73\%$
test_func_call_cm_runtime[True-eager] 1.1145ms 0.8948ms 1.1176 KOps/s 1.1175 KOps/s $\color{#35bf28}+0.01\%$
test_func_call_cm_runtime[True-compile] 0.6285ms 0.4949ms 2.0205 KOps/s 2.0253 KOps/s $\color{#d91a1a}-0.23\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8938ms 0.4957ms 2.0174 KOps/s 2.0300 KOps/s $\color{#d91a1a}-0.63\%$
test_vmap_func_call_cm_runtime[eager] 2.3532ms 1.8544ms 539.2523 Ops/s 530.3236 Ops/s $\color{#35bf28}+1.68\%$
test_vmap_func_call_cm_runtime[compile] 0.7101ms 0.5214ms 1.9179 KOps/s 1.9052 KOps/s $\color{#35bf28}+0.67\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.7017ms 0.5204ms 1.9217 KOps/s 1.9209 KOps/s $\color{#35bf28}+0.04\%$
test_distributed 0.9613ms 0.1326ms 7.5422 KOps/s 7.8528 KOps/s $\color{#d91a1a}-3.96\%$
test_tdmodule 77.7450μs 18.2473μs 54.8026 KOps/s 54.9165 KOps/s $\color{#d91a1a}-0.21\%$
test_tdmodule_dispatch 51.7170μs 36.0248μs 27.7586 KOps/s 28.1750 KOps/s $\color{#d91a1a}-1.48\%$
test_tdseq 46.3170μs 21.0472μs 47.5123 KOps/s 48.0856 KOps/s $\color{#d91a1a}-1.19\%$
test_tdseq_dispatch 82.9550μs 44.5254μs 22.4591 KOps/s 24.4141 KOps/s $\textbf{\color{#d91a1a}-8.01\%}$
test_instantiation_functorch 2.2950ms 1.5506ms 644.9249 Ops/s 662.5899 Ops/s $\color{#d91a1a}-2.67\%$
test_exec_functorch 0.2864ms 0.1822ms 5.4883 KOps/s 5.5071 KOps/s $\color{#d91a1a}-0.34\%$
test_exec_functional_call 0.3257ms 0.1717ms 5.8246 KOps/s 5.5751 KOps/s $\color{#35bf28}+4.48\%$
test_exec_td_decorator 0.5915ms 0.2296ms 4.3546 KOps/s 4.3362 KOps/s $\color{#35bf28}+0.42\%$
test_vmap_mlp_speed_decorator[True-True] 1.6151ms 0.6292ms 1.5893 KOps/s 1.5013 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_vmap_mlp_speed_decorator[True-False] 0.9672ms 0.6295ms 1.5885 KOps/s 1.5620 KOps/s $\color{#35bf28}+1.69\%$
test_vmap_mlp_speed_decorator[False-True] 0.9593ms 0.5154ms 1.9401 KOps/s 1.8995 KOps/s $\color{#35bf28}+2.14\%$
test_vmap_mlp_speed_decorator[False-False] 0.8241ms 0.5134ms 1.9478 KOps/s 1.9004 KOps/s $\color{#35bf28}+2.49\%$
test_to_module_speed[True] 1.5041ms 1.2925ms 773.7018 Ops/s 776.4188 Ops/s $\color{#d91a1a}-0.35\%$
test_to_module_speed[False] 2.0059ms 1.2697ms 787.5990 Ops/s 798.3712 Ops/s $\color{#d91a1a}-1.35\%$
test_tc_init 79.4280μs 43.7530μs 22.8556 KOps/s 22.9349 KOps/s $\color{#d91a1a}-0.35\%$
test_tc_init_nested 0.1518ms 88.8465μs 11.2554 KOps/s 11.3727 KOps/s $\color{#d91a1a}-1.03\%$
test_tc_first_layer_tensor 24.5160μs 1.5596μs 641.1780 KOps/s 665.0485 KOps/s $\color{#d91a1a}-3.59\%$
test_tc_first_layer_nontensor 50.4140μs 4.8054μs 208.0988 KOps/s 214.1273 KOps/s $\color{#d91a1a}-2.82\%$
test_tc_second_layer_tensor 47.0580μs 2.8449μs 351.5053 KOps/s 359.0215 KOps/s $\color{#d91a1a}-2.09\%$
test_tc_second_layer_nontensor 61.4250μs 6.1283μs 163.1780 KOps/s 166.1986 KOps/s $\color{#d91a1a}-1.82\%$
test_unbind 0.2249s 12.4908ms 80.0591 Ops/s 87.3822 Ops/s $\textbf{\color{#d91a1a}-8.38\%}$
test_full_like 7.9489ms 7.1517ms 139.8260 Ops/s 88.4933 Ops/s $\textbf{\color{#35bf28}+58.01\%}$
test_zeros_like 12.9053ms 6.3844ms 156.6308 Ops/s 125.2382 Ops/s $\textbf{\color{#35bf28}+25.07\%}$
test_ones_like 13.0506ms 7.5905ms 131.7440 Ops/s 125.8271 Ops/s $\color{#35bf28}+4.70\%$
test_clone 13.8977ms 9.2874ms 107.6732 Ops/s 103.5734 Ops/s $\color{#35bf28}+3.96\%$
test_squeeze 58.5190μs 12.3699μs 80.8416 KOps/s 85.1683 KOps/s $\textbf{\color{#d91a1a}-5.08\%}$
test_unsqueeze 0.1625ms 89.4930μs 11.1741 KOps/s 11.6061 KOps/s $\color{#d91a1a}-3.72\%$
test_split 0.5166ms 0.1929ms 5.1842 KOps/s 5.4756 KOps/s $\textbf{\color{#d91a1a}-5.32\%}$
test_permute 0.4109ms 0.2179ms 4.5895 KOps/s 4.6264 KOps/s $\color{#d91a1a}-0.80\%$
test_stack 30.4619ms 25.2680ms 39.5757 Ops/s 39.2565 Ops/s $\color{#35bf28}+0.81\%$
test_cat 33.3582ms 25.5577ms 39.1271 Ops/s 40.6098 Ops/s $\color{#d91a1a}-3.65\%$
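
For anyone reading the table, the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and the bolded rows match the Improved/Worsened totals in the summary line (10 and 19). The sketch below reproduces that arithmetic on a few rows copied from the table. The 5% cutoff and the function names `relative_change` and `classify` are assumptions inferred from which rows are bolded; the report itself does not state the threshold or how it is applied.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput of this PR relative to repo HEAD.

    Both arguments must be in the same unit (e.g. both KOps/s).
    """
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Bucket a change; the 5% threshold is an assumption, not from the report."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD), values copied from the table above.
rows = [
    ("test_plain_set_nested", 56.3317, 56.6728),
    ("test_membership", 1.3921, 1.1267),
    ("test_membership_stacked_nested_last", 77.6712, 245.5783),
    ("test_func_call_runtime[True-eager]", 1.3335, 1.2705),
]

for name, ops, ops_head in rows:
    pct = relative_change(ops, ops_head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Under this reading, small negative entries such as `test_plain_set_nested` fall within noise, and only changes of roughly 5% or more count toward the Improved and Worsened totals.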
