[Feature] Subclass conservation in td ops #1186
Merged
vmoens added a commit that referenced this pull request Jan 16, 2025
ghstack-source-id: 2990d5539699d53cf5e6c23950d6fcfbbe30dea8
Pull Request resolved: #1186
facebook-github-bot added the CLA Signed label Jan 16, 2025
3 tasks
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 75.8620μs | 19.2901μs | 51.8402 KOps/s | 48.7280 KOps/s | |
test_plain_set_stack_nested | 53.9110μs | 19.4405μs | 51.4389 KOps/s | 48.1336 KOps/s | |
test_plain_set_nested_inplace | 83.9870μs | 21.2443μs | 47.0714 KOps/s | 44.5286 KOps/s | |
test_plain_set_stack_nested_inplace | 53.0090μs | 21.1288μs | 47.3288 KOps/s | 44.3081 KOps/s | |
test_items | 44.5730μs | 4.1749μs | 239.5271 KOps/s | 243.4163 KOps/s | |
test_items_nested | 0.6868ms | 0.3944ms | 2.5358 KOps/s | 2.5407 KOps/s | |
test_items_nested_locked | 0.8076ms | 0.3955ms | 2.5287 KOps/s | 2.4945 KOps/s | |
test_items_nested_leaf | 0.1808ms | 76.3298μs | 13.1010 KOps/s | 12.9339 KOps/s | |
test_items_stack_nested | 0.6842ms | 0.3968ms | 2.5203 KOps/s | 2.4985 KOps/s | |
test_items_stack_nested_leaf | 0.1476ms | 80.5904μs | 12.4084 KOps/s | 12.5911 KOps/s | |
test_items_stack_nested_locked | 0.5746ms | 0.3973ms | 2.5168 KOps/s | 2.5080 KOps/s | |
test_keys | 41.7480μs | 3.6148μs | 276.6369 KOps/s | 281.9563 KOps/s | |
test_keys_nested | 0.3431ms | 0.1652ms | 6.0520 KOps/s | 6.0619 KOps/s | |
test_keys_nested_locked | 1.5809ms | 0.1690ms | 5.9166 KOps/s | 5.8109 KOps/s | |
test_keys_nested_leaf | 0.2125ms | 0.1423ms | 7.0285 KOps/s | 6.8054 KOps/s | |
test_keys_stack_nested | 0.2460ms | 0.1621ms | 6.1680 KOps/s | 5.9053 KOps/s | |
test_keys_stack_nested_leaf | 0.2577ms | 0.1381ms | 7.2421 KOps/s | 6.8006 KOps/s | |
test_keys_stack_nested_locked | 0.3070ms | 0.1669ms | 5.9923 KOps/s | 5.7725 KOps/s | |
test_values | 8.2012μs | 1.0339μs | 967.2092 KOps/s | 935.1485 KOps/s | |
test_values_nested | 0.1055ms | 62.1699μs | 16.0850 KOps/s | 15.9309 KOps/s | |
test_values_nested_locked | 0.1234ms | 61.6143μs | 16.2300 KOps/s | 16.1425 KOps/s | |
test_values_nested_leaf | 0.1376ms | 71.6461μs | 13.9575 KOps/s | 13.9544 KOps/s | |
test_values_stack_nested | 0.1104ms | 63.7615μs | 15.6834 KOps/s | 15.5741 KOps/s | |
test_values_stack_nested_leaf | 0.1273ms | 70.4078μs | 14.2030 KOps/s | 14.0395 KOps/s | |
test_values_stack_nested_locked | 0.1143ms | 63.8120μs | 15.6710 KOps/s | 16.0617 KOps/s | |
test_membership | 39.7640μs | 0.8512μs | 1.1749 MOps/s | 1.1701 MOps/s | |
test_membership_nested | 22.0710μs | 2.9121μs | 343.3908 KOps/s | 345.1291 KOps/s | |
test_membership_nested_leaf | 25.5180μs | 2.9135μs | 343.2312 KOps/s | 346.5557 KOps/s | |
test_membership_stacked_nested | 18.0240μs | 2.8925μs | 345.7206 KOps/s | 352.1376 KOps/s | |
test_membership_stacked_nested_leaf | 42.0780μs | 2.8887μs | 346.1737 KOps/s | 346.9598 KOps/s | |
test_membership_nested_last | 27.2510μs | 4.3208μs | 231.4369 KOps/s | 230.9357 KOps/s | |
test_membership_nested_leaf_last | 50.4240μs | 4.3017μs | 232.4658 KOps/s | 227.4073 KOps/s | |
test_membership_stacked_nested_last | 28.2620μs | 5.4755μs | 182.6309 KOps/s | 231.7716 KOps/s | |
test_membership_stacked_nested_leaf_last | 38.1710μs | 5.5448μs | 180.3494 KOps/s | 231.0055 KOps/s | |
test_nested_getleaf | 31.4680μs | 10.4641μs | 95.5645 KOps/s | 93.5325 KOps/s | |
test_nested_get | 32.0390μs | 10.0610μs | 99.3936 KOps/s | 97.6411 KOps/s | |
test_stacked_getleaf | 47.9700μs | 10.5339μs | 94.9320 KOps/s | 94.6736 KOps/s | |
test_stacked_get | 32.0200μs | 9.9799μs | 100.2019 KOps/s | 97.9327 KOps/s | |
test_nested_getitemleaf | 36.8290μs | 11.1610μs | 89.5977 KOps/s | 89.1769 KOps/s | |
test_nested_getitem | 61.7350μs | 10.6077μs | 94.2711 KOps/s | 93.4859 KOps/s | |
test_stacked_getitemleaf | 40.5760μs | 11.2011μs | 89.2768 KOps/s | 88.3459 KOps/s | |
test_stacked_getitem | 33.0120μs | 10.4946μs | 95.2869 KOps/s | 93.5732 KOps/s | |
test_lock_nested | 2.8349ms | 0.4451ms | 2.2465 KOps/s | 1.7956 KOps/s | |
test_lock_stack_nested | 0.5510ms | 0.4146ms | 2.4119 KOps/s | 2.3685 KOps/s | |
test_unlock_nested | 0.7380ms | 0.3644ms | 2.7443 KOps/s | 2.6430 KOps/s | |
test_unlock_stack_nested | 0.5265ms | 0.3354ms | 2.9812 KOps/s | 2.8671 KOps/s | |
test_flatten_speed | 0.1863ms | 0.1005ms | 9.9489 KOps/s | 10.0295 KOps/s | |
test_unflatten_speed | 0.9152ms | 0.5209ms | 1.9198 KOps/s | 1.9178 KOps/s | |
test_common_ops | 4.1285ms | 0.7250ms | 1.3792 KOps/s | 1.2448 KOps/s | |
test_creation | 68.1070μs | 2.4244μs | 412.4793 KOps/s | 407.1937 KOps/s | |
test_creation_empty | 31.5390μs | 9.2298μs | 108.3446 KOps/s | 82.9872 KOps/s | |
test_creation_nested_1 | 44.2620μs | 12.0716μs | 82.8393 KOps/s | 68.0355 KOps/s | |
test_creation_nested_2 | 50.8140μs | 16.4035μs | 60.9627 KOps/s | 51.5717 KOps/s | |
test_clone | 62.6570μs | 13.2463μs | 75.4927 KOps/s | 75.5107 KOps/s | |
test_getitem[int] | 1.1971ms | 12.6741μs | 78.9012 KOps/s | 78.7566 KOps/s | |
test_getitem[slice_int] | 0.1442ms | 25.7148μs | 38.8882 KOps/s | 40.9765 KOps/s | |
test_getitem[range] | 0.1710ms | 47.4933μs | 21.0556 KOps/s | 19.9796 KOps/s | |
test_getitem[tuple] | 0.1247ms | 19.9779μs | 50.0552 KOps/s | 50.3482 KOps/s | |
test_getitem[list] | 0.1647ms | 42.2788μs | 23.6525 KOps/s | 22.0492 KOps/s | |
test_setitem_dim[int] | 0.1061ms | 29.4615μs | 33.9426 KOps/s | 39.8461 KOps/s | |
test_setitem_dim[slice_int] | 0.1285ms | 51.2792μs | 19.5011 KOps/s | 19.6614 KOps/s | |
test_setitem_dim[range] | 0.1557ms | 76.6388μs | 13.0482 KOps/s | 13.0616 KOps/s | |
test_setitem_dim[tuple] | 85.0490μs | 41.0388μs | 24.3672 KOps/s | 24.7473 KOps/s | |
test_setitem | 0.2295ms | 18.7805μs | 53.2466 KOps/s | 49.3131 KOps/s | |
test_set | 0.2686ms | 17.6882μs | 56.5348 KOps/s | 49.8732 KOps/s | |
test_set_shared | 3.4013ms | 0.1731ms | 5.7759 KOps/s | 5.8465 KOps/s | |
test_update | 0.2558ms | 19.4309μs | 51.4644 KOps/s | 42.5604 KOps/s | |
test_update_nested | 0.1779ms | 29.6803μs | 33.6924 KOps/s | 30.2615 KOps/s | |
test_update__nested | 0.5273ms | 32.6023μs | 30.6727 KOps/s | 29.2223 KOps/s | |
test_set_nested | 0.1131ms | 20.0701μs | 49.8255 KOps/s | 44.7273 KOps/s | |
test_set_nested_new | 0.1455ms | 24.6838μs | 40.5123 KOps/s | 37.1776 KOps/s | |
test_select | 0.1978ms | 39.9437μs | 25.0352 KOps/s | 23.1433 KOps/s | |
test_select_nested | 0.1282ms | 61.8317μs | 16.1729 KOps/s | 15.9955 KOps/s | |
test_exclude_nested | 0.1644ms | 80.3432μs | 12.4466 KOps/s | 12.2263 KOps/s | |
test_empty[True] | 0.7583ms | 0.4033ms | 2.4798 KOps/s | 2.4697 KOps/s | |
test_empty[False] | 15.6223μs | 1.4503μs | 689.5329 KOps/s | 727.4528 KOps/s | |
test_unbind_speed | 0.4233ms | 0.2654ms | 3.7679 KOps/s | 3.7069 KOps/s | |
test_unbind_speed_stack0 | 0.4125ms | 0.2622ms | 3.8144 KOps/s | 3.7890 KOps/s | |
test_unbind_speed_stack1 | 0.1004s | 0.7849ms | 1.2741 KOps/s | 1.3421 KOps/s | |
test_split | 1.7050ms | 1.5911ms | 628.4814 Ops/s | 562.5200 Ops/s | |
test_chunk | 0.1046s | 1.7683ms | 565.5289 Ops/s | 553.5126 Ops/s | |
test_consolidate_njt[False-None] | 0.1100s | 8.9467ms | 111.7726 Ops/s | 116.8953 Ops/s | |
test_creation[device0] | 1.1356ms | 98.7338μs | 10.1282 KOps/s | 10.7902 KOps/s | |
test_creation_from_tensor | 0.2626ms | 94.0790μs | 10.6294 KOps/s | 10.4616 KOps/s | |
test_add_one[memmap_tensor0] | 0.2423ms | 4.9030μs | 203.9551 KOps/s | 204.3312 KOps/s | |
test_contiguous[memmap_tensor0] | 18.0340μs | 0.5052μs | 1.9794 MOps/s | 1.9552 MOps/s | |
test_stack[memmap_tensor0] | 62.2360μs | 3.3721μs | 296.5516 KOps/s | 284.4937 KOps/s | |
test_memmaptd_index | 0.9257ms | 0.2407ms | 4.1550 KOps/s | 4.2151 KOps/s | |
test_memmaptd_index_astensor | 0.5776ms | 0.3283ms | 3.0457 KOps/s | 3.1023 KOps/s | |
test_memmaptd_index_op | 0.9359ms | 0.5497ms | 1.8190 KOps/s | 1.6794 KOps/s | |
test_serialize_model | 0.1255s | 0.1172s | 8.5354 Ops/s | 8.4530 Ops/s | |
test_serialize_model_pickle | 0.4958s | 0.4062s | 2.4619 Ops/s | 2.5283 Ops/s | |
test_serialize_weights | 0.1280s | 0.1147s | 8.7185 Ops/s | 8.2904 Ops/s | |
test_serialize_weights_returnearly | 0.2786s | 0.1733s | 5.7716 Ops/s | 6.0828 Ops/s | |
test_serialize_weights_pickle | 0.9466s | 0.6547s | 1.5275 Ops/s | 2.4645 Ops/s | |
test_serialize_weights_filesystem | 0.1494s | 0.1417s | 7.0586 Ops/s | 6.7828 Ops/s | |
test_serialize_model_filesystem | 0.1509s | 0.1422s | 7.0322 Ops/s | 6.3100 Ops/s | |
test_reshape_pytree | 71.4430μs | 26.6937μs | 37.4621 KOps/s | 38.1798 KOps/s | |
test_reshape_td | 80.7410μs | 32.3427μs | 30.9189 KOps/s | 29.6118 KOps/s | |
test_view_pytree | 68.3680μs | 26.5107μs | 37.7207 KOps/s | 38.3352 KOps/s | |
test_view_td | 76.0320μs | 36.7952μs | 27.1775 KOps/s | 25.9760 KOps/s | |
test_unbind_pytree | 72.7650μs | 29.4928μs | 33.9066 KOps/s | 34.2937 KOps/s | |
test_unbind_td | 0.3240ms | 39.2208μs | 25.4967 KOps/s | 24.9025 KOps/s | |
test_split_pytree | 66.2140μs | 29.1843μs | 34.2650 KOps/s | 34.7427 KOps/s | |
test_split_td | 0.2199ms | 44.9353μs | 22.2542 KOps/s | 22.0428 KOps/s | |
test_add_pytree | 75.4310μs | 35.2184μs | 28.3942 KOps/s | 28.1077 KOps/s | |
test_add_td | 0.2315ms | 48.6175μs | 20.5687 KOps/s | 17.0883 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1403ms | 63.9582μs | 15.6352 KOps/s | 15.5931 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3651ms | 0.1734ms | 5.7685 KOps/s | 5.6747 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1109ms | 45.2270μs | 22.1107 KOps/s | 21.5888 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2264ms | 0.1180ms | 8.4748 KOps/s | 8.4129 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 65.9120μs | 27.7026μs | 36.0977 KOps/s | 37.7600 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1140ms | 57.9468μs | 17.2572 KOps/s | 17.2060 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1422ms | 78.3828μs | 12.7579 KOps/s | 12.7352 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1447ms | 66.8692μs | 14.9546 KOps/s | 15.0632 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2039ms | 0.1030ms | 9.7114 KOps/s | 9.5063 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3933ms | 0.2130ms | 4.6944 KOps/s | 4.6113 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 93.7740μs | 44.6216μs | 22.4107 KOps/s | 21.2872 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5298ms | 67.1349μs | 14.8954 KOps/s | 14.9340 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1858ms | 0.1017ms | 9.8370 KOps/s | 9.7981 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5344ms | 0.2057ms | 4.8626 KOps/s | 4.9617 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3461ms | 0.2281ms | 4.3836 KOps/s | 4.3055 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2022ms | 0.1038ms | 9.6318 KOps/s | 9.4643 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1393ms | 62.6789μs | 15.9543 KOps/s | 15.6304 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1862ms | 46.3100μs | 21.5936 KOps/s | 21.0279 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3310ms | 0.1607ms | 6.2237 KOps/s | 6.3481 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2380ms | 0.1016ms | 9.8459 KOps/s | 9.2365 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 56.0840μs | 20.8316μs | 48.0041 KOps/s | 46.4140 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1471ms | 66.4664μs | 15.0452 KOps/s | 15.1233 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1921ms | 78.3207μs | 12.7680 KOps/s | 12.8555 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1507ms | 70.0174μs | 14.2822 KOps/s | 15.0500 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4258ms | 0.2038ms | 4.9076 KOps/s | 4.7862 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.4978ms | 1.3228ms | 755.9814 Ops/s | 761.2707 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3550ms | 0.2025ms | 4.9380 KOps/s | 4.9297 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3675ms | 0.7824ms | 1.2781 KOps/s | 1.2762 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.9117ms | 0.4538ms | 2.2035 KOps/s | 2.2219 KOps/s | |
test_compile_assign_and_add_stack[eager] | 2.8119ms | 2.5317ms | 394.9970 Ops/s | 375.0284 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1026ms | 35.5789μs | 28.1065 KOps/s | 27.8206 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.7355ms | 32.1890μs | 31.0665 KOps/s | 30.5857 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1133ms | 28.8902μs | 34.6138 KOps/s | 34.4499 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1051ms | 23.7953μs | 42.0251 KOps/s | 42.8636 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1110ms | 30.0826μs | 33.2419 KOps/s | 33.8588 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 88.8160μs | 23.8272μs | 41.9688 KOps/s | 43.2496 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1167ms | 51.3922μs | 19.4582 KOps/s | 19.5767 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.6112ms | 20.0957μs | 49.7618 KOps/s | 50.8750 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1523ms | 43.4660μs | 23.0065 KOps/s | 22.8379 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 68.2670μs | 18.6602μs | 53.5901 KOps/s | 54.5147 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1206ms | 44.2922μs | 22.5774 KOps/s | 22.4712 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 77.4850μs | 18.7374μs | 53.3692 KOps/s | 55.2233 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1204ms | 51.5452μs | 19.4004 KOps/s | 19.3766 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0121ms | 19.8254μs | 50.4403 KOps/s | 51.4784 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1251ms | 44.5121μs | 22.4658 KOps/s | 22.4557 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 70.3810μs | 18.6548μs | 53.6055 KOps/s | 55.0752 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1460ms | 44.6778μs | 22.3825 KOps/s | 22.3916 KOps/s | |
test_compile_indexing[int-pytree-eager] | 91.2000μs | 18.7183μs | 53.4237 KOps/s | 55.2872 KOps/s | |
test_mod_add[eager] | 0.1046ms | 32.7530μs | 30.5315 KOps/s | 28.9307 KOps/s | |
test_mod_add[compile] | 0.1713ms | 48.3622μs | 20.6773 KOps/s | 20.8661 KOps/s | |
test_mod_add[compile-overhead] | 0.1071ms | 47.5685μs | 21.0223 KOps/s | 20.8312 KOps/s | |
test_mod_wrap[eager] | 0.4578ms | 0.2235ms | 4.4746 KOps/s | 4.5807 KOps/s | |
test_mod_wrap[compile] | 0.4107ms | 0.2054ms | 4.8679 KOps/s | 4.8043 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3845ms | 0.2057ms | 4.8607 KOps/s | 4.9197 KOps/s | |
test_mod_wrap_and_backward[eager] | 14.5899ms | 12.4329ms | 80.4320 Ops/s | 75.3865 Ops/s | |
test_mod_wrap_and_backward[compile] | 19.2336ms | 13.4903ms | 74.1275 Ops/s | 74.3224 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 18.2843ms | 12.9396ms | 77.2821 Ops/s | 74.6303 Ops/s | |
test_seq_add[eager] | 0.2544ms | 0.1136ms | 8.8006 KOps/s | 8.6231 KOps/s | |
test_seq_add[compile] | 0.1491ms | 63.0168μs | 15.8688 KOps/s | 15.5775 KOps/s | |
test_seq_add[compile-overhead] | 0.1380ms | 59.9933μs | 16.6685 KOps/s | 16.1987 KOps/s | |
test_seq_wrap[eager] | 0.6997ms | 0.4289ms | 2.3314 KOps/s | 2.2154 KOps/s | |
test_seq_wrap[compile] | 0.6885ms | 0.2270ms | 4.4046 KOps/s | 4.2894 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3584ms | 0.2231ms | 4.4821 KOps/s | 4.3375 KOps/s | |
test_func_call_runtime[False-eager] | 0.9674ms | 0.5433ms | 1.8405 KOps/s | 1.8516 KOps/s | |
test_func_call_runtime[False-compile] | 0.7493ms | 0.4207ms | 2.3772 KOps/s | 2.3098 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5243ms | 0.4202ms | 2.3801 KOps/s | 2.3776 KOps/s | |
test_func_call_runtime[True-eager] | 0.9001ms | 0.7526ms | 1.3287 KOps/s | 1.3114 KOps/s | |
test_func_call_runtime[True-compile] | 0.6662ms | 0.4582ms | 2.1824 KOps/s | 2.1514 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6915ms | 0.4621ms | 2.1639 KOps/s | 2.1551 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8352ms | 0.5448ms | 1.8356 KOps/s | 1.8714 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.1342ms | 0.4268ms | 2.3430 KOps/s | 2.3689 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5704ms | 0.4199ms | 2.3817 KOps/s | 2.3810 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4407ms | 0.9011ms | 1.1097 KOps/s | 1.1180 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.5941ms | 0.4835ms | 2.0682 KOps/s | 2.0697 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6282ms | 0.4826ms | 2.0723 KOps/s | 2.0830 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6072ms | 1.8896ms | 529.2084 Ops/s | 525.2757 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.6862ms | 0.5187ms | 1.9279 KOps/s | 1.9381 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7174ms | 0.5209ms | 1.9198 KOps/s | 1.9292 KOps/s | |
test_distributed | 0.3668ms | 0.1254ms | 7.9755 KOps/s | 7.8023 KOps/s | |
test_tdmodule | 75.8910μs | 24.3432μs | 41.0792 KOps/s | 37.7461 KOps/s | |
test_tdmodule_dispatch | 72.5150μs | 43.5911μs | 22.9404 KOps/s | 20.9871 KOps/s | |
test_tdseq | 47.0670μs | 26.2869μs | 38.0418 KOps/s | 35.0551 KOps/s | |
test_tdseq_dispatch | 95.4770μs | 49.4920μs | 20.2053 KOps/s | 18.5958 KOps/s | |
test_instantiation_functorch | 3.0774ms | 1.5121ms | 661.3475 Ops/s | 655.8504 Ops/s | |
test_exec_functorch | 0.3247ms | 0.1810ms | 5.5254 KOps/s | 5.6397 KOps/s | |
test_exec_functional_call | 0.2956ms | 0.1693ms | 5.9078 KOps/s | 5.8658 KOps/s | |
test_exec_td_decorator | 0.5058ms | 0.2276ms | 4.3936 KOps/s | 4.3938 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9553ms | 0.6473ms | 1.5449 KOps/s | 1.4877 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.3060ms | 0.6493ms | 1.5402 KOps/s | 1.5430 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8832ms | 0.5303ms | 1.8857 KOps/s | 1.8986 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.1113ms | 0.5342ms | 1.8721 KOps/s | 1.9072 KOps/s | |
test_to_module_speed[True] | 1.7597ms | 1.3357ms | 748.6711 Ops/s | 736.1793 Ops/s | |
test_to_module_speed[False] | 1.9305ms | 1.3032ms | 767.3161 Ops/s | 743.6382 Ops/s | |
test_tc_init | 73.6770μs | 41.5814μs | 24.0492 KOps/s | 21.4711 KOps/s | |
test_tc_init_nested | 0.1656ms | 81.6932μs | 12.2409 KOps/s | 10.2505 KOps/s | |
test_tc_first_layer_tensor | 15.7790μs | 1.5236μs | 656.3321 KOps/s | 661.4717 KOps/s | |
test_tc_first_layer_nontensor | 26.7690μs | 4.6510μs | 215.0060 KOps/s | 217.0874 KOps/s | |
test_tc_second_layer_tensor | 43.4740μs | 2.7661μs | 361.5186 KOps/s | 355.1725 KOps/s | |
test_tc_second_layer_nontensor | 32.0900μs | 5.9967μs | 166.7587 KOps/s | 168.2494 KOps/s | |
test_unbind | 0.2169s | 13.3926ms | 74.6682 Ops/s | 78.2489 Ops/s | |
test_full_like | 8.4709ms | 7.1128ms | 140.5907 Ops/s | 81.4874 Ops/s | |
test_zeros_like | 3.2627ms | 2.7222ms | 367.3523 Ops/s | 138.3856 Ops/s | |
test_ones_like | 3.7025ms | 3.2233ms | 310.2372 Ops/s | 133.2428 Ops/s | |
test_clone | 8.5705ms | 5.2735ms | 189.6256 Ops/s | 110.1729 Ops/s | |
test_squeeze | 81.8930μs | 11.9466μs | 83.7061 KOps/s | 83.7175 KOps/s | |
test_unsqueeze | 0.2361ms | 88.2136μs | 11.3361 KOps/s | 10.7024 KOps/s | |
test_split | 0.3615ms | 0.1909ms | 5.2392 KOps/s | 5.1914 KOps/s | |
test_permute | 0.3863ms | 0.2003ms | 4.9936 KOps/s | 4.9049 KOps/s | |
test_stack | 29.2729ms | 25.3166ms | 39.4997 Ops/s | 39.8566 Ops/s | |
test_cat | 28.5507ms | 24.7456ms | 40.4112 Ops/s | 40.0578 Ops/s |
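
The Change column is empty in this capture, so the difference between a row's Ops and its Ops on Repo HEAD has to be worked out by hand. Below is a minimal sketch of that arithmetic, assuming rows keep the pipe-separated layout shown above. The helper names `parse_ops` and `relative_change` are hypothetical, and this is not necessarily how the benchmark bot derives its own Change value.

```python
# Multipliers for the throughput units that appear in the benchmark tables.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell):
    """Convert a cell such as '51.8402 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNITS[unit]


def relative_change(row):
    """Return (name, percent change of Ops relative to Ops on Repo HEAD).

    Assumes the column order Name | Max | Mean | Ops | Ops on Repo HEAD | Change.
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])
    head = parse_ops(cells[4])
    return name, 100.0 * (ops - head) / head


if __name__ == "__main__":
    rows = [
        "test_plain_set_nested | 75.8620μs | 19.2901μs | 51.8402 KOps/s | 48.7280 KOps/s | |",
        "test_full_like | 8.4709ms | 7.1128ms | 140.5907 Ops/s | 81.4874 Ops/s | |",
    ]
    for row in rows:
        name, pct = relative_change(row)
        print(f"{name}: {pct:+.1f}%")
```

A positive value means the PR branch measured more operations per second than repo HEAD for that benchmark. Single-run differences of a few percent may well be noise, so this is only a rough reading aid.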
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 34.7220μs | 11.2547μs | 88.8519 KOps/s | 79.5299 KOps/s | |
test_plain_set_stack_nested | 35.5320μs | 11.4421μs | 87.3966 KOps/s | 78.4291 KOps/s | |
test_plain_set_nested_inplace | 40.2020μs | 12.2643μs | 81.5378 KOps/s | 72.4528 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1095ms | 12.3334μs | 81.0804 KOps/s | 73.2179 KOps/s | |
test_items | 30.1120μs | 2.8884μs | 346.2111 KOps/s | 344.0520 KOps/s | |
test_items_nested | 0.4092ms | 0.3599ms | 2.7788 KOps/s | 2.7740 KOps/s | |
test_items_nested_locked | 0.4159ms | 0.3599ms | 2.7787 KOps/s | 2.7717 KOps/s | |
test_items_nested_leaf | 79.9440μs | 57.8218μs | 17.2945 KOps/s | 17.2955 KOps/s | |
test_items_stack_nested | 0.4285ms | 0.3617ms | 2.7644 KOps/s | 2.7583 KOps/s | |
test_items_stack_nested_leaf | 86.8750μs | 58.6112μs | 17.0616 KOps/s | 16.8370 KOps/s | |
test_items_stack_nested_locked | 0.4368ms | 0.3639ms | 2.7483 KOps/s | 2.7455 KOps/s | |
test_keys | 32.3810μs | 3.6075μs | 277.2020 KOps/s | 292.3105 KOps/s | |
test_keys_nested | 0.1185ms | 88.1540μs | 11.3438 KOps/s | 11.4063 KOps/s | |
test_keys_nested_locked | 0.8121ms | 93.7005μs | 10.6723 KOps/s | 10.7166 KOps/s | |
test_keys_nested_leaf | 0.1156ms | 78.4276μs | 12.7506 KOps/s | 12.8466 KOps/s | |
test_keys_stack_nested | 0.1146ms | 90.5310μs | 11.0459 KOps/s | 11.2482 KOps/s | |
test_keys_stack_nested_leaf | 0.1295ms | 80.2567μs | 12.4600 KOps/s | 12.3777 KOps/s | |
test_keys_stack_nested_locked | 0.1217ms | 95.1842μs | 10.5059 KOps/s | 10.4785 KOps/s | |
test_values | 7.9320μs | 0.8489μs | 1.1780 MOps/s | 1.1798 MOps/s | |
test_values_nested | 71.9240μs | 37.5900μs | 26.6028 KOps/s | 26.8613 KOps/s | |
test_values_nested_locked | 0.3354ms | 39.3178μs | 25.4338 KOps/s | 25.6168 KOps/s | |
test_values_nested_leaf | 78.7540μs | 41.9900μs | 23.8152 KOps/s | 24.0621 KOps/s | |
test_values_stack_nested | 73.0640μs | 38.1301μs | 26.2260 KOps/s | 26.3485 KOps/s | |
test_values_stack_nested_leaf | 77.7640μs | 42.3154μs | 23.6321 KOps/s | 23.7699 KOps/s | |
test_values_stack_nested_locked | 74.6140μs | 40.0397μs | 24.9752 KOps/s | 25.0839 KOps/s | |
test_membership | 1.7366μs | 0.5012μs | 1.9951 MOps/s | 1.9530 MOps/s | |
test_membership_nested | 33.3320μs | 2.0376μs | 490.7641 KOps/s | 505.7721 KOps/s | |
test_membership_nested_leaf | 15.2955μs | 2.0040μs | 499.0048 KOps/s | 504.7272 KOps/s | |
test_membership_stacked_nested | 32.5120μs | 2.0303μs | 492.5351 KOps/s | 485.0938 KOps/s | |
test_membership_stacked_nested_leaf | 40.1220μs | 2.0480μs | 488.2922 KOps/s | 488.7442 KOps/s | |
test_membership_nested_last | 35.7520μs | 2.9939μs | 334.0149 KOps/s | 330.8566 KOps/s | |
test_membership_nested_leaf_last | 27.6020μs | 2.9744μs | 336.2056 KOps/s | 332.4787 KOps/s | |
test_membership_stacked_nested_last | 28.8820μs | 2.9894μs | 334.5109 KOps/s | 281.4104 KOps/s | |
test_membership_stacked_nested_leaf_last | 39.5520μs | 2.9966μs | 333.7097 KOps/s | 286.5069 KOps/s | |
test_nested_getleaf | 42.1320μs | 6.0478μs | 165.3481 KOps/s | 164.7907 KOps/s | |
test_nested_get | 34.4620μs | 5.7491μs | 173.9409 KOps/s | 174.0153 KOps/s | |
test_stacked_getleaf | 33.4820μs | 6.0815μs | 164.4337 KOps/s | 163.6704 KOps/s | |
test_stacked_get | 41.9020μs | 5.7690μs | 173.3393 KOps/s | 172.5653 KOps/s | |
test_nested_getitemleaf | 36.5120μs | 6.4218μs | 155.7201 KOps/s | 155.6618 KOps/s | |
test_nested_getitem | 31.7720μs | 6.0743μs | 164.6270 KOps/s | 166.0551 KOps/s | |
test_stacked_getitemleaf | 45.7520μs | 6.3758μs | 156.8427 KOps/s | 156.1297 KOps/s | |
test_stacked_getitem | 38.5220μs | 6.1000μs | 163.9347 KOps/s | 165.2739 KOps/s | |
test_lock_nested | 1.1250ms | 0.3735ms | 2.6772 KOps/s | 2.6635 KOps/s | |
test_lock_stack_nested | 0.4705ms | 0.3474ms | 2.8788 KOps/s | 2.9346 KOps/s | |
test_unlock_nested | 0.6944ms | 0.3183ms | 3.1416 KOps/s | 3.2518 KOps/s | |
test_unlock_stack_nested | 0.3262ms | 0.2846ms | 3.5140 KOps/s | 3.5886 KOps/s | |
test_flatten_speed | 0.1157ms | 74.7613μs | 13.3759 KOps/s | 13.4021 KOps/s | |
test_unflatten_speed | 0.3687ms | 0.3195ms | 3.1295 KOps/s | 3.1639 KOps/s | |
test_common_ops | 1.6446ms | 0.5585ms | 1.7904 KOps/s | 1.6355 KOps/s | |
test_creation | 0.1715ms | 1.7096μs | 584.9290 KOps/s | 590.7755 KOps/s | |
test_creation_empty | 38.5320μs | 6.2687μs | 159.5231 KOps/s | 114.3177 KOps/s | |
test_creation_nested_1 | 35.2020μs | 7.9421μs | 125.9106 KOps/s | 95.3676 KOps/s | |
test_creation_nested_2 | 46.3720μs | 10.7329μs | 93.1714 KOps/s | 76.7714 KOps/s | |
test_clone | 0.1243ms | 10.2347μs | 97.7065 KOps/s | 100.3034 KOps/s | |
test_getitem[int] | 1.9867ms | 11.0920μs | 90.1549 KOps/s | 93.2923 KOps/s | |
test_getitem[slice_int] | 0.1155ms | 21.8072μs | 45.8564 KOps/s | 49.1642 KOps/s | |
test_getitem[range] | 0.1327ms | 37.4109μs | 26.7302 KOps/s | 28.2197 KOps/s | |
test_getitem[tuple] | 0.1112ms | 18.6891μs | 53.5071 KOps/s | 55.1818 KOps/s | |
test_getitem[list] | 0.1313ms | 32.5971μs | 30.6776 KOps/s | 31.5977 KOps/s | |
test_setitem_dim[int] | 40.8120μs | 19.1747μs | 52.1520 KOps/s | 54.4386 KOps/s | |
test_setitem_dim[slice_int] | 61.5030μs | 38.0382μs | 26.2893 KOps/s | 27.2815 KOps/s | |
test_setitem_dim[range] | 77.2740μs | 51.9584μs | 19.2462 KOps/s | 19.7108 KOps/s | |
test_setitem_dim[tuple] | 65.8040μs | 32.4320μs | 30.8338 KOps/s | 32.2009 KOps/s | |
test_setitem | 45.3520μs | 13.5758μs | 73.6605 KOps/s | 67.7661 KOps/s | |
test_set | 0.1326ms | 12.9779μs | 77.0540 KOps/s | 70.0116 KOps/s | |
test_set_shared | 1.4527ms | 0.1489ms | 6.7160 KOps/s | 6.6982 KOps/s | |
test_update | 0.8622ms | 15.0018μs | 66.6586 KOps/s | 57.0089 KOps/s | |
test_update_nested | 0.1321ms | 20.3521μs | 49.1350 KOps/s | 43.3763 KOps/s | |
test_update__nested | 1.3175ms | 24.9415μs | 40.0939 KOps/s | 41.2102 KOps/s | |
test_set_nested | 0.1255ms | 14.5429μs | 68.7622 KOps/s | 65.2803 KOps/s | |
test_set_nested_new | 0.1246ms | 16.9654μs | 58.9436 KOps/s | 56.5187 KOps/s | |
test_select | 0.1413ms | 28.6981μs | 34.8455 KOps/s | 33.5196 KOps/s | |
test_select_nested | 75.3140μs | 43.6877μs | 22.8897 KOps/s | 22.7982 KOps/s | |
test_exclude_nested | 92.6050μs | 61.5834μs | 16.2381 KOps/s | 16.0411 KOps/s | |
test_empty[True] | 0.3491ms | 0.2932ms | 3.4107 KOps/s | 3.4376 KOps/s | |
test_empty[False] | 4.5782μs | 0.8221μs | 1.2164 MOps/s | 1.2179 MOps/s | |
test_to | 84.7050μs | 56.2844μs | 17.7669 KOps/s | 17.1846 KOps/s | |
test_to_nonblocking | 0.1200ms | 46.2809μs | 21.6072 KOps/s | 21.1275 KOps/s | |
test_unbind_speed | 0.8247ms | 0.2412ms | 4.1468 KOps/s | 4.2411 KOps/s | |
test_unbind_speed_stack0 | 0.2915ms | 0.2393ms | 4.1780 KOps/s | 4.2398 KOps/s | |
test_unbind_speed_stack1 | 0.6728ms | 0.6182ms | 1.6176 KOps/s | 1.5049 KOps/s | |
test_split | 95.1560ms | 1.6468ms | 607.2350 Ops/s | 622.4164 Ops/s | |
test_chunk | 97.0090ms | 1.6411ms | 609.3517 Ops/s | 620.3660 Ops/s | |
test_consolidate[False-None] | 97.0805ms | 2.8843ms | 346.7095 Ops/s | 341.4397 Ops/s | |
test_consolidate[default-None] | 1.8007ms | 1.7126ms | 583.8972 Ops/s | 601.5385 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8650ms | 1.7555ms | 569.6533 Ops/s | 590.4789 Ops/s | |
test_consolidate_njt[False-None] | 6.7159ms | 6.4852ms | 154.1962 Ops/s | 156.0760 Ops/s | |
test_to[False-False-None] | 1.8143ms | 1.7010ms | 587.8758 Ops/s | 589.0270 Ops/s | |
test_to[True-False-None] | 1.4956ms | 1.2986ms | 770.0760 Ops/s | 796.2186 Ops/s | |
test_to[within-False-None] | 4.3647ms | 4.1011ms | 243.8373 Ops/s | 178.4825 Ops/s | |
test_to[True-default-None] | 5.5659ms | 5.3393ms | 187.2896 Ops/s | 193.9373 Ops/s | |
test_to_njt[False-False-None] | 7.0973ms | 6.7955ms | 147.1556 Ops/s | 143.1369 Ops/s | |
test_to_njt[True-False-None] | 5.6958ms | 5.4001ms | 185.1833 Ops/s | 181.9411 Ops/s | |
test_to_njt[within-False-None] | 12.4818ms | 12.1266ms | 82.4631 Ops/s | 82.8275 Ops/s | |
test_creation[device0] | 0.6185ms | 80.3841μs | 12.4403 KOps/s | 12.1822 KOps/s | |
test_creation_from_tensor | 0.5955ms | 84.5659μs | 11.8251 KOps/s | 11.6993 KOps/s | |
test_add_one[memmap_tensor0] | 0.4075ms | 6.3254μs | 158.0937 KOps/s | 160.5520 KOps/s | |
test_contiguous[memmap_tensor0] | 2.1076μs | 0.3992μs | 2.5049 MOps/s | 2.4666 MOps/s | |
test_stack[memmap_tensor0] | 52.9120μs | 4.6050μs | 217.1559 KOps/s | 219.7225 KOps/s | |
test_memmaptd_index | 1.5123ms | 0.2597ms | 3.8507 KOps/s | 4.0186 KOps/s | |
test_memmaptd_index_astensor | 0.5880ms | 0.3237ms | 3.0889 KOps/s | 3.2148 KOps/s | |
test_memmaptd_index_op | 1.0530ms | 0.5554ms | 1.8004 KOps/s | 1.7232 KOps/s | |
test_serialize_model | 0.1318s | 0.1312s | 7.6243 Ops/s | 7.6426 Ops/s | |
test_serialize_model_pickle | 1.3776s | 1.2190s | 0.8204 Ops/s | 0.8443 Ops/s | |
test_serialize_weights | 0.1310s | 0.1304s | 7.6716 Ops/s | 7.6867 Ops/s | |
test_serialize_weights_returnearly | 0.5131s | 72.7543ms | 13.7449 Ops/s | 15.3446 Ops/s | |
test_serialize_weights_pickle | 1.3772s | 1.2161s | 0.8223 Ops/s | 0.8222 Ops/s | |
test_reshape_pytree | 55.1830μs | 21.8910μs | 45.6808 KOps/s | 45.2079 KOps/s | |
test_reshape_td | 59.2630μs | 26.4187μs | 37.8520 KOps/s | 37.4825 KOps/s | |
test_view_pytree | 47.1220μs | 21.8535μs | 45.7592 KOps/s | 45.8195 KOps/s | |
test_view_td | 74.5140μs | 31.6835μs | 31.5622 KOps/s | 33.9184 KOps/s | |
test_unbind_pytree | 56.4530μs | 27.9367μs | 35.7952 KOps/s | 35.8382 KOps/s | |
test_unbind_td | 1.0009ms | 36.3647μs | 27.4992 KOps/s | 27.8305 KOps/s | |
test_split_pytree | 64.1740μs | 29.4431μs | 33.9638 KOps/s | 32.7569 KOps/s | |
test_split_td | 0.1818ms | 39.3721μs | 25.3987 KOps/s | 25.9971 KOps/s | |
test_add_pytree | 69.1840μs | 32.9244μs | 30.3726 KOps/s | 30.7799 KOps/s | |
test_add_td | 84.2440μs | 43.2222μs | 23.1362 KOps/s | 20.8011 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1765ms | 0.1214ms | 8.2367 KOps/s | 7.7906 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2271ms | 0.1307ms | 7.6524 KOps/s | 7.6110 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1327ms | 93.9959μs | 10.6388 KOps/s | 10.7203 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2029ms | 0.1483ms | 6.7420 KOps/s | 6.7362 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 59.5930μs | 25.2350μs | 39.6275 KOps/s | 44.3595 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 57.4630μs | 28.7693μs | 34.7593 KOps/s | 34.0518 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3343ms | 64.6168μs | 15.4759 KOps/s | 15.6510 KOps/s | |
test_compile_copy_nested[pytree-eager] | 73.7740μs | 48.8258μs | 20.4810 KOps/s | 20.1801 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1831ms | 0.1407ms | 7.1058 KOps/s | 7.1557 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3188ms | 0.2176ms | 4.5952 KOps/s | 4.6463 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1580ms | 0.1011ms | 9.8878 KOps/s | 10.3629 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1213ms | 57.6444μs | 17.3477 KOps/s | 18.0375 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1750ms | 0.1346ms | 7.4294 KOps/s | 7.2501 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5871ms | 0.4809ms | 2.0795 KOps/s | 2.0813 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4004ms | 0.2610ms | 3.8318 KOps/s | 3.8235 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1824ms | 0.1425ms | 7.0158 KOps/s | 6.8394 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1644ms | 66.0051μs | 15.1503 KOps/s | 15.0853 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1371ms | 98.9696μs | 10.1041 KOps/s | 10.2648 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4570ms | 0.4099ms | 2.4397 KOps/s | 2.4690 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1762ms | 0.1343ms | 7.4479 KOps/s | 7.5053 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 50.7020μs | 19.2374μs | 51.9821 KOps/s | 56.2192 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 67.4140μs | 30.7192μs | 32.5530 KOps/s | 31.9528 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1071ms | 69.5336μs | 14.3815 KOps/s | 14.3566 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1908ms | 50.6273μs | 19.7522 KOps/s | 19.7803 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6005ms | 0.3855ms | 2.5943 KOps/s | 2.2404 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7965ms | 2.5825ms | 387.2158 Ops/s | 379.6917 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5750ms | 0.4291ms | 2.3302 KOps/s | 2.3499 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7035ms | 2.6046ms | 383.9422 Ops/s | 386.8415 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1664ms | 0.1159ms | 8.6292 KOps/s | 9.1939 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5551ms | 80.4090μs | 12.4364 KOps/s | 13.1836 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1633ms | 0.1067ms | 9.3694 KOps/s | 9.8955 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1296ms | 69.4867μs | 14.3912 KOps/s | 15.2716 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1636ms | 0.1084ms | 9.2267 KOps/s | 9.7992 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1847ms | 67.1319μs | 14.8961 KOps/s | 15.2948 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1597ms | 0.1001ms | 9.9881 KOps/s | 10.1860 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1418ms | 17.3407μs | 57.6679 KOps/s | 58.3530 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1454ms | 95.9372μs | 10.4235 KOps/s | 10.5737 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 56.3730μs | 15.8989μs | 62.8974 KOps/s | 63.7575 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1352ms | 95.5633μs | 10.4643 KOps/s | 10.5376 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 49.2830μs | 15.7897μs | 63.3322 KOps/s | 64.6620 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1416ms | 0.1010ms | 9.9002 KOps/s | 10.0684 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5765ms | 16.7003μs | 59.8792 KOps/s | 58.9535 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1571ms | 95.5840μs | 10.4620 KOps/s | 10.5284 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 63.2030μs | 15.8193μs | 63.2141 KOps/s | 64.2480 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1351ms | 93.8926μs | 10.6505 KOps/s | 10.5200 KOps/s | |
test_compile_indexing[int-pytree-eager] | 49.0530μs | 15.7693μs | 63.4145 KOps/s | 63.9852 KOps/s | |
test_mod_add[eager] | 0.1453ms | 35.5214μs | 28.1520 KOps/s | 25.7210 KOps/s | |
test_mod_add[compile] | 0.1321ms | 79.4973μs | 12.5790 KOps/s | 12.6481 KOps/s | |
test_mod_add[compile-overhead] | 0.3257ms | 0.1656ms | 6.0399 KOps/s | 5.6686 KOps/s | |
test_mod_wrap[eager] | 0.3126ms | 0.2360ms | 4.2375 KOps/s | 4.0491 KOps/s | |
test_mod_wrap[compile] | 0.6765ms | 0.2752ms | 3.6342 KOps/s | 3.6157 KOps/s | |
test_mod_wrap[compile-overhead] | 6.9577ms | 3.6533ms | 273.7225 Ops/s | 263.4296 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.7785ms | 1.3383ms | 747.2186 Ops/s | 711.8402 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3172ms | 1.2297ms | 813.2384 Ops/s | 789.2293 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3366ms | 0.8989ms | 1.1125 KOps/s | 1.1009 KOps/s | |
test_seq_add[eager] | 0.1460ms | 0.1098ms | 9.1108 KOps/s | 8.7351 KOps/s | |
test_seq_add[compile] | 0.1409ms | 86.6636μs | 11.5389 KOps/s | 11.6046 KOps/s | |
test_seq_add[compile-overhead] | 0.1820ms | 0.1276ms | 7.8392 KOps/s | 7.7668 KOps/s | |
test_seq_wrap[eager] | 0.4791ms | 0.3983ms | 2.5107 KOps/s | 2.4342 KOps/s | |
test_seq_wrap[compile] | 0.3584ms | 0.2902ms | 3.4453 KOps/s | 3.4004 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2687ms | 0.2197ms | 4.5515 KOps/s | 4.5381 KOps/s | |
test_func_call_runtime[False-eager] | 0.7675ms | 0.7020ms | 1.4246 KOps/s | 1.3456 KOps/s | |
test_func_call_runtime[False-compile] | 0.9005ms | 0.7183ms | 1.3921 KOps/s | 1.3920 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.3995ms | 0.3503ms | 2.8546 KOps/s | 2.8633 KOps/s | |
test_func_call_runtime[True-eager] | 0.9612ms | 0.8586ms | 1.1647 KOps/s | 1.1434 KOps/s | |
test_func_call_runtime[True-compile] | 0.7887ms | 0.7354ms | 1.3598 KOps/s | 1.3603 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4307ms | 0.3709ms | 2.6964 KOps/s | 2.7079 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7517ms | 0.7010ms | 1.4265 KOps/s | 1.4252 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7813ms | 0.7209ms | 1.3871 KOps/s | 1.3861 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4019ms | 0.3544ms | 2.8216 KOps/s | 2.8409 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0639ms | 0.9702ms | 1.0307 KOps/s | 1.0169 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8184ms | 0.7685ms | 1.3012 KOps/s | 1.3016 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4791ms | 0.3956ms | 2.5279 KOps/s | 2.5188 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4242ms | 1.9867ms | 503.3438 Ops/s | 498.5650 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8716ms | 0.7982ms | 1.2528 KOps/s | 1.2748 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4621ms | 0.4011ms | 2.4931 KOps/s | 2.4959 KOps/s | |
test_distributed | 2.8675ms | 0.2493ms | 4.0116 KOps/s | 7.9003 KOps/s | |
test_tdmodule | 0.2731ms | 18.9941μs | 52.6479 KOps/s | 49.4999 KOps/s | |
test_tdmodule_dispatch | 62.8130μs | 33.2606μs | 30.0656 KOps/s | 27.5137 KOps/s | |
test_tdseq | 26.8510μs | 19.3759μs | 51.6104 KOps/s | 46.9755 KOps/s | |
test_tdseq_dispatch | 64.9330μs | 35.5195μs | 28.1536 KOps/s | 25.1208 KOps/s | |
test_instantiation_functorch | 1.6917ms | 1.5088ms | 662.7574 Ops/s | 658.6560 Ops/s | |
test_exec_functorch | 0.2027ms | 0.1427ms | 7.0080 KOps/s | 7.1140 KOps/s | |
test_exec_functional_call | 0.1766ms | 0.1320ms | 7.5782 KOps/s | 7.6413 KOps/s | |
test_exec_td_decorator | 0.3721ms | 0.1800ms | 5.5565 KOps/s | 5.5423 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7982ms | 0.6566ms | 1.5229 KOps/s | 1.4978 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7733ms | 0.6596ms | 1.5160 KOps/s | 1.4884 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7653ms | 0.5714ms | 1.7501 KOps/s | 1.6478 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7021ms | 0.5759ms | 1.7364 KOps/s | 1.6488 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 18.5657ms | 18.4701ms | 54.1415 Ops/s | 53.8538 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 18.9438ms | 18.4736ms | 54.1313 Ops/s | 53.1730 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.5586ms | 18.3743ms | 54.4237 Ops/s | 54.1865 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.2815ms | 18.4060ms | 54.3300 Ops/s | 53.9413 Ops/s | |
test_to_module_speed[True] | 1.3219ms | 0.9630ms | 1.0384 KOps/s | 1.0168 KOps/s | |
test_to_module_speed[False] | 1.4717ms | 0.9498ms | 1.0529 KOps/s | 1.0251 KOps/s | |
test_tc_init | 84.6540μs | 33.4077μs | 29.9332 KOps/s | 26.9109 KOps/s | |
test_tc_init_nested | 99.3760μs | 65.3180μs | 15.3097 KOps/s | 13.3784 KOps/s | |
test_tc_first_layer_tensor | 6.3161μs | 0.6870μs | 1.4556 MOps/s | 1.2415 MOps/s | |
test_tc_first_layer_nontensor | 23.6810μs | 2.2373μs | 446.9742 KOps/s | 443.8622 KOps/s | |
test_tc_second_layer_tensor | 63.2467μs | 1.3990μs | 714.8042 KOps/s | 689.0311 KOps/s | |
test_tc_second_layer_nontensor | 47.4930μs | 2.9217μs | 342.2635 KOps/s | 330.2787 KOps/s | |
test_unbind | 0.2338s | 10.1977ms | 98.0615 Ops/s | 142.8387 Ops/s | |
test_full_like | 10.1944ms | 9.1677ms | 109.0790 Ops/s | 109.1917 Ops/s | |
test_zeros_like | 5.2727ms | 4.3256ms | 231.1814 Ops/s | 231.7920 Ops/s | |
test_ones_like | 4.9001ms | 4.3278ms | 231.0653 Ops/s | 231.7467 Ops/s | |
test_clone | 6.7279ms | 6.3067ms | 158.5625 Ops/s | 158.1899 Ops/s | |
test_squeeze | 57.3030μs | 9.1812μs | 108.9182 KOps/s | 108.1743 KOps/s | |
test_unsqueeze | 0.1462ms | 72.8202μs | 13.7325 KOps/s | 14.1147 KOps/s | |
test_split | 0.5813ms | 0.1664ms | 6.0086 KOps/s | 6.2868 KOps/s | |
test_permute | 0.2464ms | 0.1682ms | 5.9440 KOps/s | 5.5992 KOps/s | |
test_stack | 50.9293ms | 50.2353ms | 19.9063 Ops/s | 19.9300 Ops/s | |
test_cat | 50.6232ms | 50.1617ms | 19.9355 Ops/s | 19.9800 Ops/s |
vmoens added a commit that referenced this pull request on Jan 16, 2025
ghstack-source-id: 83e79abda6a4bb6839d99240052323380981855c Pull Request resolved: #1186
vmoens added a commit that referenced this pull request on Jan 20, 2025
ghstack-source-id: 83e79abda6a4bb6839d99240052323380981855c Pull Request resolved: #1186
Labels: CLA Signed, enhancement
Stack from ghstack (oldest at bottom):