[Feature] TensorDictCatView #1037
Open: vmoens wants to merge 2 commits into gh/vmoens/28/base from gh/vmoens/28/head
vmoens added a commit that referenced this pull request on Oct 10, 2024
ghstack-source-id: eda35393bab9de459fc01c6e33a872ffb1b1672a
Pull Request resolved: #1037
facebook-github-bot added the CLA Signed label on Oct 10, 2024. This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 60.6330μs | 24.9939μs | 40.0098 KOps/s | 39.9619 KOps/s | |
test_plain_set_stack_nested | 58.7490μs | 25.2740μs | 39.5664 KOps/s | 39.2135 KOps/s | |
test_plain_set_nested_inplace | 70.0000μs | 27.4718μs | 36.4009 KOps/s | 36.5925 KOps/s | |
test_plain_set_stack_nested_inplace | 65.6420μs | 27.4093μs | 36.4839 KOps/s | 36.7818 KOps/s | |
test_items | 24.4660μs | 4.1839μs | 239.0108 KOps/s | 236.7527 KOps/s | |
test_items_nested | 0.4745ms | 0.3917ms | 2.5533 KOps/s | 2.5763 KOps/s | |
test_items_nested_locked | 0.6300ms | 0.3917ms | 2.5531 KOps/s | 2.5883 KOps/s | |
test_items_nested_leaf | 0.1451ms | 81.9545μs | 12.2019 KOps/s | 12.4064 KOps/s | |
test_items_stack_nested | 0.8350ms | 0.3922ms | 2.5496 KOps/s | 2.5787 KOps/s | |
test_items_stack_nested_leaf | 0.1521ms | 83.7155μs | 11.9452 KOps/s | 12.1643 KOps/s | |
test_items_stack_nested_locked | 0.6407ms | 0.3986ms | 2.5091 KOps/s | 2.5877 KOps/s | |
test_keys | 32.1900μs | 3.6189μs | 276.3243 KOps/s | 286.6842 KOps/s | |
test_keys_nested | 0.2282ms | 0.1369ms | 7.3067 KOps/s | 7.4664 KOps/s | |
test_keys_nested_locked | 0.6492ms | 0.1431ms | 6.9875 KOps/s | 7.2253 KOps/s | |
test_keys_nested_leaf | 0.2127ms | 0.1203ms | 8.3123 KOps/s | 8.3585 KOps/s | |
test_keys_stack_nested | 0.5361ms | 0.1420ms | 7.0446 KOps/s | 7.4483 KOps/s | |
test_keys_stack_nested_leaf | 0.1843ms | 0.1197ms | 8.3542 KOps/s | 8.5344 KOps/s | |
test_keys_stack_nested_locked | 0.2396ms | 0.1414ms | 7.0724 KOps/s | 7.1881 KOps/s | |
test_values | 7.0372μs | 1.0522μs | 950.4051 KOps/s | 957.0757 KOps/s | |
test_values_nested | 0.1557ms | 95.2079μs | 10.5033 KOps/s | 10.4245 KOps/s | |
test_values_nested_locked | 0.3051ms | 95.3930μs | 10.4830 KOps/s | 10.5562 KOps/s | |
test_values_nested_leaf | 0.1485ms | 81.5929μs | 12.2560 KOps/s | 12.5215 KOps/s | |
test_values_stack_nested | 0.2140ms | 93.8951μs | 10.6502 KOps/s | 10.7123 KOps/s | |
test_values_stack_nested_leaf | 0.1341ms | 79.2534μs | 12.6178 KOps/s | 12.6195 KOps/s | |
test_values_stack_nested_locked | 0.1708ms | 95.2544μs | 10.4982 KOps/s | 10.6905 KOps/s | |
test_membership | 24.6860μs | 0.9139μs | 1.0943 MOps/s | 1.1588 MOps/s | |
test_membership_nested | 20.8190μs | 2.8447μs | 351.5292 KOps/s | 366.9674 KOps/s | |
test_membership_nested_leaf | 36.7190μs | 2.8554μs | 350.2127 KOps/s | 350.1097 KOps/s | |
test_membership_stacked_nested | 20.5580μs | 2.8365μs | 352.5476 KOps/s | 365.3951 KOps/s | |
test_membership_stacked_nested_leaf | 42.2080μs | 2.8556μs | 350.1899 KOps/s | 368.7424 KOps/s | |
test_membership_nested_last | 37.0290μs | 4.2981μs | 232.6625 KOps/s | 238.8191 KOps/s | |
test_membership_nested_leaf_last | 32.3800μs | 4.3654μs | 229.0731 KOps/s | 240.8519 KOps/s | |
test_membership_stacked_nested_last | 25.8580μs | 4.3296μs | 230.9673 KOps/s | 238.4344 KOps/s | |
test_membership_stacked_nested_leaf_last | 34.3240μs | 4.3044μs | 232.3188 KOps/s | 243.3476 KOps/s | |
test_nested_getleaf | 43.4500μs | 10.7183μs | 93.2980 KOps/s | 95.1960 KOps/s | |
test_nested_get | 37.7810μs | 10.3022μs | 97.0663 KOps/s | 99.1101 KOps/s | |
test_stacked_getleaf | 44.5030μs | 10.6371μs | 94.0103 KOps/s | 94.5479 KOps/s | |
test_stacked_get | 39.8840μs | 10.2416μs | 97.6407 KOps/s | 99.0488 KOps/s | |
test_nested_getitemleaf | 33.3220μs | 11.4035μs | 87.6922 KOps/s | 90.9459 KOps/s | |
test_nested_getitem | 44.6430μs | 10.5524μs | 94.7655 KOps/s | 99.5729 KOps/s | |
test_stacked_getitemleaf | 38.9130μs | 10.9906μs | 90.9866 KOps/s | 91.3768 KOps/s | |
test_stacked_getitem | 0.5722ms | 10.2911μs | 97.1709 KOps/s | 98.7044 KOps/s | |
test_lock_nested | 4.9771ms | 0.5379ms | 1.8591 KOps/s | 2.0005 KOps/s | |
test_lock_stack_nested | 0.9117ms | 0.4944ms | 2.0227 KOps/s | 2.1145 KOps/s | |
test_unlock_nested | 0.1022s | 0.5506ms | 1.8163 KOps/s | 2.4094 KOps/s | |
test_unlock_stack_nested | 1.3462ms | 0.4052ms | 2.4682 KOps/s | 2.5635 KOps/s | |
test_flatten_speed | 0.2287ms | 0.1015ms | 9.8547 KOps/s | 10.0965 KOps/s | |
test_unflatten_speed | 1.0516ms | 0.5252ms | 1.9040 KOps/s | 2.0007 KOps/s | |
test_common_ops | 3.4345ms | 1.1564ms | 864.7750 Ops/s | 846.6582 Ops/s | |
test_creation | 0.1198ms | 2.2482μs | 444.7945 KOps/s | 486.5028 KOps/s | |
test_creation_empty | 55.3940μs | 19.1510μs | 52.2165 KOps/s | 49.1066 KOps/s | |
test_creation_nested_1 | 76.1720μs | 23.0830μs | 43.3220 KOps/s | 42.3875 KOps/s | |
test_creation_nested_2 | 76.1320μs | 27.6818μs | 36.1248 KOps/s | 36.2274 KOps/s | |
test_clone | 93.1330μs | 16.9640μs | 58.9485 KOps/s | 57.1630 KOps/s | |
test_getitem[int] | 0.9389ms | 16.9662μs | 58.9408 KOps/s | 60.3052 KOps/s | |
test_getitem[slice_int] | 0.1801ms | 31.6195μs | 31.6260 KOps/s | 32.0329 KOps/s | |
test_getitem[range] | 0.5499ms | 59.3622μs | 16.8457 KOps/s | 17.2583 KOps/s | |
test_getitem[tuple] | 0.1338ms | 25.6139μs | 39.0413 KOps/s | 39.1521 KOps/s | |
test_getitem[list] | 0.4006ms | 53.9559μs | 18.5337 KOps/s | 18.9235 KOps/s | |
test_setitem_dim[int] | 64.6210μs | 32.5527μs | 30.7194 KOps/s | 30.6024 KOps/s | |
test_setitem_dim[slice_int] | 0.1168ms | 60.6923μs | 16.4766 KOps/s | 16.2198 KOps/s | |
test_setitem_dim[range] | 0.1618ms | 85.1056μs | 11.7501 KOps/s | 11.9766 KOps/s | |
test_setitem_dim[tuple] | 0.1041ms | 50.0711μs | 19.9716 KOps/s | 20.3466 KOps/s | |
test_setitem | 0.2107ms | 31.3189μs | 31.9296 KOps/s | 30.9466 KOps/s | |
test_set | 0.1324ms | 30.3797μs | 32.9167 KOps/s | 31.6886 KOps/s | |
test_set_shared | 3.7634ms | 0.2197ms | 4.5513 KOps/s | 4.5338 KOps/s | |
test_update | 0.1930ms | 39.3902μs | 25.3870 KOps/s | 24.5088 KOps/s | |
test_update_nested | 0.2557ms | 50.1345μs | 19.9463 KOps/s | 19.6060 KOps/s | |
test_update__nested | 0.8167ms | 44.6026μs | 22.4202 KOps/s | 22.1690 KOps/s | |
test_set_nested | 0.2182ms | 33.9591μs | 29.4471 KOps/s | 28.6354 KOps/s | |
test_set_nested_new | 0.2071ms | 39.0745μs | 25.5921 KOps/s | 25.3087 KOps/s | |
test_select | 0.1249ms | 57.0224μs | 17.5370 KOps/s | 17.5466 KOps/s | |
test_select_nested | 0.1330ms | 61.1437μs | 16.3549 KOps/s | 16.9287 KOps/s | |
test_exclude_nested | 0.1612ms | 76.2054μs | 13.1224 KOps/s | 13.3820 KOps/s | |
test_empty[True] | 0.9334ms | 0.3650ms | 2.7394 KOps/s | 2.8539 KOps/s | |
test_empty[False] | 12.6835μs | 1.3459μs | 742.9843 KOps/s | 828.8382 KOps/s | |
test_unbind_speed | 0.5927ms | 0.3205ms | 3.1202 KOps/s | 3.1872 KOps/s | |
test_unbind_speed_stack0 | 0.5217ms | 0.3200ms | 3.1253 KOps/s | 3.4434 KOps/s | |
test_unbind_speed_stack1 | 0.1066s | 0.8939ms | 1.1187 KOps/s | 1.3239 KOps/s | |
test_split | 0.1105s | 2.2062ms | 453.2651 Ops/s | 446.6058 Ops/s | |
test_chunk | 2.2593ms | 1.9894ms | 502.6708 Ops/s | 450.9001 Ops/s | |
test_creation[device0] | 0.2488ms | 0.1175ms | 8.5084 KOps/s | 8.2680 KOps/s | |
test_creation_from_tensor | 2.4658ms | 0.1176ms | 8.5040 KOps/s | 8.5309 KOps/s | |
test_add_one[memmap_tensor0] | 0.3475ms | 7.3433μs | 136.1779 KOps/s | 132.1683 KOps/s | |
test_contiguous[memmap_tensor0] | 25.5470μs | 1.8578μs | 538.2644 KOps/s | 514.5438 KOps/s | |
test_stack[memmap_tensor0] | 38.5920μs | 5.7597μs | 173.6202 KOps/s | 166.0962 KOps/s | |
test_memmaptd_index | 1.1201ms | 0.4156ms | 2.4062 KOps/s | 2.4456 KOps/s | |
test_memmaptd_index_astensor | 0.8266ms | 0.5182ms | 1.9299 KOps/s | 1.9521 KOps/s | |
test_memmaptd_index_op | 1.9167ms | 1.0631ms | 940.6159 Ops/s | 906.8116 Ops/s | |
test_serialize_model | 0.1274s | 0.1184s | 8.4428 Ops/s | 8.3237 Ops/s | |
test_serialize_model_pickle | 0.4391s | 0.4026s | 2.4837 Ops/s | 2.5012 Ops/s | |
test_serialize_weights | 0.1264s | 0.1168s | 8.5641 Ops/s | 7.4171 Ops/s | |
test_serialize_weights_returnearly | 0.1716s | 0.1608s | 6.2200 Ops/s | 6.2190 Ops/s | |
test_serialize_weights_pickle | 1.2491s | 0.7543s | 1.3257 Ops/s | 2.4801 Ops/s | |
test_serialize_weights_filesystem | 0.1518s | 0.1468s | 6.8116 Ops/s | 6.8018 Ops/s | |
test_serialize_model_filesystem | 0.1626s | 0.1488s | 6.7202 Ops/s | 5.9997 Ops/s | |
test_reshape_pytree | 83.2550μs | 38.5140μs | 25.9646 KOps/s | 26.3922 KOps/s | |
test_reshape_td | 0.1458ms | 47.4982μs | 21.0534 KOps/s | 21.3533 KOps/s | |
test_view_pytree | 0.1137ms | 38.4895μs | 25.9811 KOps/s | 26.4645 KOps/s | |
test_view_td | 0.1495ms | 53.5453μs | 18.6758 KOps/s | 19.6637 KOps/s | |
test_unbind_pytree | 81.6720μs | 35.5487μs | 28.1304 KOps/s | 28.0865 KOps/s | |
test_unbind_td | 0.3669ms | 46.6757μs | 21.4244 KOps/s | 22.8422 KOps/s | |
test_split_pytree | 0.1022ms | 37.9528μs | 26.3485 KOps/s | 26.9855 KOps/s | |
test_split_td | 0.6620ms | 58.8184μs | 17.0015 KOps/s | 17.4697 KOps/s | |
test_add_pytree | 0.1057ms | 44.6144μs | 22.4143 KOps/s | 21.2840 KOps/s | |
test_add_td | 0.3086ms | 87.2721μs | 11.4584 KOps/s | 10.6622 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1292ms | 58.8733μs | 16.9856 KOps/s | 17.0892 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 1.2885ms | 0.2058ms | 4.8587 KOps/s | 5.0873 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1460ms | 57.8573μs | 17.2839 KOps/s | 17.5063 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2633ms | 0.1400ms | 7.1410 KOps/s | 7.0326 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 60.2820μs | 23.5428μs | 42.4758 KOps/s | 42.7893 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1870ms | 73.7357μs | 13.5619 KOps/s | 13.7223 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3019ms | 76.7849μs | 13.0234 KOps/s | 13.2391 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1237ms | 69.2665μs | 14.4370 KOps/s | 14.7260 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2954ms | 0.1839ms | 5.4391 KOps/s | 5.4539 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4052ms | 0.2403ms | 4.1622 KOps/s | 4.1529 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1083ms | 48.7732μs | 20.5031 KOps/s | 20.4375 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5342ms | 77.8095μs | 12.8519 KOps/s | 12.4605 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2885ms | 0.1766ms | 5.6616 KOps/s | 5.7683 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5164ms | 0.2868ms | 3.4869 KOps/s | 3.4779 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.5106ms | 0.2759ms | 3.6250 KOps/s | 3.5920 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3356ms | 0.1874ms | 5.3365 KOps/s | 5.5811 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2515ms | 75.7161μs | 13.2072 KOps/s | 13.3171 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1947ms | 49.1591μs | 20.3421 KOps/s | 20.7412 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5415ms | 0.2346ms | 4.2633 KOps/s | 4.2622 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2861ms | 0.1761ms | 5.6784 KOps/s | 5.5835 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2247ms | 0.1118ms | 8.9447 KOps/s | 8.9869 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1474ms | 79.2196μs | 12.6231 KOps/s | 12.8831 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1441ms | 79.1601μs | 12.6326 KOps/s | 13.2677 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1242ms | 70.5874μs | 14.1668 KOps/s | 14.9946 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2867ms | 0.1969ms | 5.0800 KOps/s | 5.2132 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7342ms | 1.7540ms | 570.1185 Ops/s | 584.3041 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2793ms | 0.1950ms | 5.1270 KOps/s | 5.1753 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.2529ms | 1.1068ms | 903.4927 Ops/s | 912.7660 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5628ms | 0.4257ms | 2.3492 KOps/s | 2.3969 KOps/s | |
test_compile_assign_and_add_stack[eager] | 6.3882ms | 4.2024ms | 237.9618 Ops/s | 231.0427 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 92.0810μs | 35.0407μs | 28.5382 KOps/s | 28.8084 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6303ms | 48.7879μs | 20.4969 KOps/s | 20.5725 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 76.2820μs | 31.5544μs | 31.6913 KOps/s | 32.4633 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 98.4630μs | 29.7289μs | 33.6373 KOps/s | 35.1193 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 87.1520μs | 31.5923μs | 31.6532 KOps/s | 32.5847 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 89.2960μs | 29.7613μs | 33.6007 KOps/s | 35.4145 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1400ms | 74.1429μs | 13.4875 KOps/s | 13.5755 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5566ms | 27.8144μs | 35.9525 KOps/s | 36.1683 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1747ms | 69.8109μs | 14.3244 KOps/s | 14.4711 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 68.1460μs | 23.4632μs | 42.6200 KOps/s | 43.9078 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1482ms | 70.0763μs | 14.2702 KOps/s | 14.4778 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 80.2300μs | 23.1115μs | 43.2685 KOps/s | 44.3936 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1491ms | 73.3826μs | 13.6272 KOps/s | 13.7837 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9074ms | 27.3117μs | 36.6143 KOps/s | 36.1108 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1463ms | 69.1847μs | 14.4541 KOps/s | 14.4714 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 65.0810μs | 23.3005μs | 42.9175 KOps/s | 44.3649 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1511ms | 69.3337μs | 14.4230 KOps/s | 14.6782 KOps/s | |
test_compile_indexing[int-pytree-eager] | 86.9820μs | 23.3123μs | 42.8959 KOps/s | 44.9228 KOps/s | |
test_mod_add[eager] | 73.9080μs | 26.9842μs | 37.0587 KOps/s | 36.4473 KOps/s | |
test_mod_add[compile] | 95.2870μs | 39.6057μs | 25.2489 KOps/s | 25.8747 KOps/s | |
test_mod_add[compile-overhead] | 0.1114ms | 40.6875μs | 24.5776 KOps/s | 25.7934 KOps/s | |
test_mod_wrap[eager] | 0.3912ms | 0.2102ms | 4.7565 KOps/s | 4.7771 KOps/s | |
test_mod_wrap[compile] | 0.4998ms | 0.2381ms | 4.1993 KOps/s | 4.2884 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4787ms | 0.2376ms | 4.2084 KOps/s | 4.4131 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.0462ms | 11.0922ms | 90.1531 Ops/s | 76.9059 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.5540ms | 11.1391ms | 89.7737 Ops/s | 79.8768 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 14.7358ms | 11.3310ms | 88.2537 Ops/s | 79.4496 Ops/s | |
test_seq_add[eager] | 0.2062ms | 96.8401μs | 10.3263 KOps/s | 10.4430 KOps/s | |
test_seq_add[compile] | 0.1333ms | 66.5799μs | 15.0196 KOps/s | 14.9757 KOps/s | |
test_seq_add[compile-overhead] | 0.1355ms | 65.4874μs | 15.2701 KOps/s | 15.6930 KOps/s | |
test_seq_wrap[eager] | 1.0723ms | 0.3959ms | 2.5261 KOps/s | 2.5205 KOps/s | |
test_seq_wrap[compile] | 0.5143ms | 0.2764ms | 3.6175 KOps/s | 3.6772 KOps/s | |
test_seq_wrap[compile-overhead] | 0.6283ms | 0.2772ms | 3.6075 KOps/s | 3.6767 KOps/s | |
test_func_call_runtime[False-eager] | 0.8074ms | 0.5281ms | 1.8936 KOps/s | 1.8606 KOps/s | |
test_func_call_runtime[False-compile] | 0.6316ms | 0.5078ms | 1.9693 KOps/s | 1.9879 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6264ms | 0.5052ms | 1.9793 KOps/s | 2.0119 KOps/s | |
test_func_call_runtime[True-eager] | 1.3818ms | 0.7564ms | 1.3220 KOps/s | 1.3137 KOps/s | |
test_func_call_runtime[True-compile] | 0.7303ms | 0.5201ms | 1.9227 KOps/s | 1.9199 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8476ms | 0.5270ms | 1.8976 KOps/s | 1.9029 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.0016ms | 0.5318ms | 1.8803 KOps/s | 1.8406 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6768ms | 0.5081ms | 1.9679 KOps/s | 1.9492 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7148ms | 0.5090ms | 1.9645 KOps/s | 1.9807 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4040ms | 0.9205ms | 1.0864 KOps/s | 1.0871 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0283ms | 0.7456ms | 1.3413 KOps/s | 1.3311 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.9318ms | 0.7390ms | 1.3531 KOps/s | 1.3198 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.7383ms | 1.9315ms | 517.7386 Ops/s | 506.8652 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.7934ms | 1.9870ms | 503.2639 Ops/s | 489.7777 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 3.3023ms | 2.0034ms | 499.1483 Ops/s | 488.3736 Ops/s | |
test_distributed | 0.3073ms | 0.1297ms | 7.7120 KOps/s | 7.6736 KOps/s | |
test_tdmodule | 90.0670μs | 17.9980μs | 55.5619 KOps/s | 50.3022 KOps/s | |
test_tdmodule_dispatch | 69.5800μs | 36.8207μs | 27.1586 KOps/s | 24.3357 KOps/s | |
test_tdseq | 46.5560μs | 20.6232μs | 48.4890 KOps/s | 44.1162 KOps/s | |
test_tdseq_dispatch | 74.6190μs | 42.7420μs | 23.3962 KOps/s | 22.3521 KOps/s | |
test_instantiation_functorch | 2.5894ms | 1.5830ms | 631.7114 Ops/s | 627.2070 Ops/s | |
test_exec_functorch | 0.4337ms | 0.1892ms | 5.2861 KOps/s | 5.3827 KOps/s | |
test_exec_functional_call | 0.2688ms | 0.1772ms | 5.6430 KOps/s | 5.6205 KOps/s | |
test_exec_td_decorator | 0.6035ms | 0.2350ms | 4.2546 KOps/s | 4.3080 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.4562ms | 0.6590ms | 1.5174 KOps/s | 1.5309 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1141ms | 0.6570ms | 1.5222 KOps/s | 1.4814 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8519ms | 0.5397ms | 1.8530 KOps/s | 1.8583 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9726ms | 0.5374ms | 1.8608 KOps/s | 1.8655 KOps/s | |
test_to_module_speed[True] | 2.5408ms | 1.4536ms | 687.9291 Ops/s | 700.6840 Ops/s | |
test_to_module_speed[False] | 1.5200ms | 1.3934ms | 717.6720 Ops/s | 728.8604 Ops/s | |
test_tc_init | 98.7640μs | 47.5293μs | 21.0396 KOps/s | 19.7238 KOps/s | |
test_tc_init_nested | 0.1757ms | 96.0922μs | 10.4067 KOps/s | 9.9529 KOps/s | |
test_tc_first_layer_tensor | 33.8430μs | 1.5471μs | 646.3876 KOps/s | 647.4782 KOps/s | |
test_tc_first_layer_nontensor | 29.2240μs | 4.6780μs | 213.7655 KOps/s | 220.5462 KOps/s | |
test_tc_second_layer_tensor | 25.2170μs | 2.8414μs | 351.9343 KOps/s | 358.0101 KOps/s | |
test_tc_second_layer_nontensor | 0.1693ms | 6.0018μs | 166.6175 KOps/s | 169.6431 KOps/s | |
test_unbind | 8.3194ms | 7.7398ms | 129.2020 Ops/s | 73.2024 Ops/s | |
test_full_like | 23.2310ms | 13.9526ms | 71.6714 Ops/s | 119.8505 Ops/s | |
test_zeros_like | 15.5192ms | 8.0245ms | 124.6191 Ops/s | 313.2111 Ops/s | |
test_ones_like | 17.0311ms | 8.1184ms | 123.1768 Ops/s | 120.9727 Ops/s | |
test_clone | 16.5694ms | 10.2050ms | 97.9913 Ops/s | 97.2555 Ops/s | |
test_squeeze | 72.6950μs | 12.3614μs | 80.8970 KOps/s | 82.6306 KOps/s | |
test_unsqueeze | 0.1742ms | 92.0910μs | 10.8588 KOps/s | 10.7289 KOps/s | |
test_split | 0.4974ms | 0.1921ms | 5.2052 KOps/s | 4.9972 KOps/s | |
test_permute | 0.3670ms | 0.2178ms | 4.5903 KOps/s | 4.4803 KOps/s | |
test_stack | 34.6362ms | 28.4559ms | 35.1421 Ops/s | 36.3687 Ops/s | |
test_cat | 36.8149ms | 28.2119ms | 35.4461 Ops/s | 35.9415 Ops/s |
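
The Change column in the table above is empty in this capture. The following is a minimal sketch of how a relative change could be derived from the Ops and Ops on Repo HEAD columns. It assumes that Change means the relative difference in throughput against HEAD, which the source does not confirm. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
# Hypothetical helpers for reading the benchmark table; not part of tensordict.
# Assumption: "Change" = (Ops - Ops on Repo HEAD) / Ops on Repo HEAD.

# Scale factors for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '40.0098 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput difference of the PR against Repo HEAD."""
    head = parse_ops(ops_head)
    return (parse_ops(ops_pr) - head) / head


# Two rows copied from the table: (Ops, Ops on Repo HEAD).
rows = {
    "test_plain_set_nested": ("40.0098 KOps/s", "39.9619 KOps/s"),
    "test_zeros_like": ("124.6191 Ops/s", "313.2111 Ops/s"),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2%}")
```

A positive value means the PR branch is faster than HEAD under this definition. Rows whose Max is far above their Mean (for example `test_unlock_nested`) suggest outliers, so small differences are probably best read as noise rather than as regressions.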
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1455ms | 16.9090μs | 59.1401 KOps/s | 59.9299 KOps/s | |
test_plain_set_stack_nested | 45.4400μs | 17.0521μs | 58.6437 KOps/s | 59.4081 KOps/s | |
test_plain_set_nested_inplace | 48.6500μs | 18.1886μs | 54.9795 KOps/s | 55.5015 KOps/s | |
test_plain_set_stack_nested_inplace | 48.8400μs | 18.2093μs | 54.9169 KOps/s | 55.9320 KOps/s | |
test_items | 25.7610μs | 2.8956μs | 345.3459 KOps/s | 340.5744 KOps/s | |
test_items_nested | 0.3721ms | 0.3406ms | 2.9356 KOps/s | 2.9664 KOps/s | |
test_items_nested_locked | 0.3749ms | 0.3432ms | 2.9137 KOps/s | 2.9372 KOps/s | |
test_items_nested_leaf | 0.1137ms | 63.7953μs | 15.6751 KOps/s | 15.9089 KOps/s | |
test_items_stack_nested | 0.5136ms | 0.3440ms | 2.9068 KOps/s | 2.9190 KOps/s | |
test_items_stack_nested_leaf | 93.4410μs | 64.6424μs | 15.4697 KOps/s | 15.5473 KOps/s | |
test_items_stack_nested_locked | 0.3901ms | 0.3419ms | 2.9245 KOps/s | 2.9203 KOps/s | |
test_keys | 23.3700μs | 3.4059μs | 293.6054 KOps/s | 273.1073 KOps/s | |
test_keys_nested | 97.1610μs | 71.6385μs | 13.9590 KOps/s | 14.1039 KOps/s | |
test_keys_nested_locked | 0.8463ms | 76.9576μs | 12.9942 KOps/s | 12.9290 KOps/s | |
test_keys_nested_leaf | 93.5310μs | 62.2056μs | 16.0757 KOps/s | 16.1785 KOps/s | |
test_keys_stack_nested | 0.1105ms | 71.0869μs | 14.0673 KOps/s | 14.0431 KOps/s | |
test_keys_stack_nested_leaf | 91.2110μs | 64.4255μs | 15.5218 KOps/s | 15.7982 KOps/s | |
test_keys_stack_nested_locked | 0.1096ms | 77.8614μs | 12.8433 KOps/s | 12.8803 KOps/s | |
test_values | 5.1150μs | 0.8373μs | 1.1944 MOps/s | 1.1734 MOps/s | |
test_values_nested | 89.9500μs | 49.0904μs | 20.3706 KOps/s | 20.4056 KOps/s | |
test_values_nested_locked | 84.7900μs | 50.9008μs | 19.6460 KOps/s | 19.8632 KOps/s | |
test_values_nested_leaf | 70.5010μs | 42.8154μs | 23.3561 KOps/s | 23.4080 KOps/s | |
test_values_stack_nested | 95.2500μs | 50.7651μs | 19.6986 KOps/s | 19.9352 KOps/s | |
test_values_stack_nested_leaf | 79.0400μs | 44.2564μs | 22.5956 KOps/s | 22.9851 KOps/s | |
test_values_stack_nested_locked | 79.1810μs | 52.1548μs | 19.1737 KOps/s | 19.4990 KOps/s | |
test_membership | 2.0945μs | 0.5021μs | 1.9917 MOps/s | 1.9896 MOps/s | |
test_membership_nested | 19.6900μs | 1.8538μs | 539.4205 KOps/s | 524.0761 KOps/s | |
test_membership_nested_leaf | 18.4603μs | 1.8281μs | 547.0020 KOps/s | 553.0605 KOps/s | |
test_membership_stacked_nested | 16.4610μs | 1.9054μs | 524.8125 KOps/s | 526.3816 KOps/s | |
test_membership_stacked_nested_leaf | 25.9400μs | 1.9251μs | 519.4411 KOps/s | 520.3190 KOps/s | |
test_membership_nested_last | 29.9000μs | 2.9446μs | 339.6010 KOps/s | 335.0828 KOps/s | |
test_membership_nested_leaf_last | 34.7100μs | 2.9578μs | 338.0859 KOps/s | 332.2136 KOps/s | |
test_membership_stacked_nested_last | 30.4700μs | 2.9335μs | 340.8888 KOps/s | 263.8504 KOps/s | |
test_membership_stacked_nested_leaf_last | 30.8210μs | 2.9597μs | 337.8752 KOps/s | 267.7475 KOps/s | |
test_nested_getleaf | 28.4400μs | 6.1052μs | 163.7942 KOps/s | 165.8281 KOps/s | |
test_nested_get | 36.1100μs | 5.7628μs | 173.5255 KOps/s | 175.3144 KOps/s | |
test_stacked_getleaf | 31.2200μs | 5.9871μs | 167.0271 KOps/s | 165.9284 KOps/s | |
test_stacked_get | 33.4010μs | 5.5842μs | 179.0753 KOps/s | 177.5318 KOps/s | |
test_nested_getitemleaf | 27.5810μs | 6.1545μs | 162.4827 KOps/s | 163.6222 KOps/s | |
test_nested_getitem | 35.7300μs | 5.7886μs | 172.7541 KOps/s | 174.0278 KOps/s | |
test_stacked_getitemleaf | 36.9400μs | 6.1335μs | 163.0386 KOps/s | 165.0611 KOps/s | |
test_stacked_getitem | 30.4900μs | 5.7489μs | 173.9466 KOps/s | 174.6411 KOps/s | |
test_lock_nested | 0.8723ms | 0.4336ms | 2.3061 KOps/s | 2.3084 KOps/s | |
test_lock_stack_nested | 0.4480ms | 0.3960ms | 2.5250 KOps/s | 2.5353 KOps/s | |
test_unlock_nested | 0.8427ms | 0.3739ms | 2.6742 KOps/s | 2.7107 KOps/s | |
test_unlock_stack_nested | 0.4002ms | 0.3356ms | 2.9799 KOps/s | 2.9997 KOps/s | |
test_flatten_speed | 0.1548ms | 76.6394μs | 13.0481 KOps/s | 13.0112 KOps/s | |
test_unflatten_speed | 0.3661ms | 0.3246ms | 3.0811 KOps/s | 3.1332 KOps/s | |
test_common_ops | 1.7620ms | 1.3077ms | 764.7159 Ops/s | 771.9915 Ops/s | |
test_creation | 27.7300μs | 1.5014μs | 666.0490 KOps/s | 668.7669 KOps/s | |
test_creation_empty | 45.6000μs | 16.1936μs | 61.7529 KOps/s | 64.0031 KOps/s | |
test_creation_nested_1 | 47.2100μs | 18.1614μs | 55.0620 KOps/s | 58.3236 KOps/s | |
test_creation_nested_2 | 50.0400μs | 20.9522μs | 47.7277 KOps/s | 50.4056 KOps/s | |
test_clone | 0.1078ms | 30.3450μs | 32.9544 KOps/s | 33.4907 KOps/s | |
test_getitem[int] | 1.2715ms | 15.8808μs | 62.9691 KOps/s | 61.9973 KOps/s | |
test_getitem[slice_int] | 0.1167ms | 27.4818μs | 36.3878 KOps/s | 36.2327 KOps/s | |
test_getitem[range] | 0.1565ms | 0.1101ms | 9.0799 KOps/s | 8.9529 KOps/s | |
test_getitem[tuple] | 0.1161ms | 24.1641μs | 41.3837 KOps/s | 40.6976 KOps/s | |
test_getitem[list] | 0.2052ms | 0.1011ms | 9.8903 KOps/s | 9.9140 KOps/s | |
test_setitem_dim[int] | 72.6110μs | 45.9547μs | 21.7606 KOps/s | 21.8131 KOps/s | |
test_setitem_dim[slice_int] | 90.8410μs | 67.6977μs | 14.7716 KOps/s | 14.6474 KOps/s | |
test_setitem_dim[range] | 0.1586ms | 0.1300ms | 7.6915 KOps/s | 7.7157 KOps/s | |
test_setitem_dim[tuple] | 0.1023ms | 61.8220μs | 16.1755 KOps/s | 16.0679 KOps/s | |
test_setitem | 90.4100μs | 44.5168μs | 22.4634 KOps/s | 23.2550 KOps/s | |
test_set | 88.7210μs | 47.0298μs | 21.2631 KOps/s | 24.1544 KOps/s | |
test_set_shared | 0.3858ms | 60.0430μs | 16.6547 KOps/s | 18.2958 KOps/s | |
test_update | 95.1600μs | 56.3266μs | 17.7536 KOps/s | 19.5420 KOps/s | |
test_update_nested | 0.1045ms | 64.3558μs | 15.5386 KOps/s | 17.0307 KOps/s | |
test_update__nested | 0.1871ms | 63.1737μs | 15.8294 KOps/s | 15.8930 KOps/s | |
test_set_nested | 97.5310μs | 45.2791μs | 22.0852 KOps/s | 22.3337 KOps/s | |
test_set_nested_new | 99.6510μs | 49.6334μs | 20.1477 KOps/s | 20.9941 KOps/s | |
test_select | 0.1033ms | 62.5070μs | 15.9982 KOps/s | 16.4412 KOps/s | |
test_select_nested | 78.6900μs | 41.6091μs | 24.0332 KOps/s | 23.7177 KOps/s | |
test_exclude_nested | 0.5184ms | 60.7567μs | 16.4591 KOps/s | 17.1078 KOps/s | |
test_empty[True] | 0.2988ms | 0.2594ms | 3.8545 KOps/s | 3.8590 KOps/s | |
test_empty[False] | 3.6480μs | 0.8437μs | 1.1853 MOps/s | 1.3587 MOps/s | |
test_to | 56.7500μs | 27.6397μs | 36.1798 KOps/s | 37.6510 KOps/s | |
test_to_nonblocking | 56.8800μs | 26.9051μs | 37.1677 KOps/s | 40.8404 KOps/s | |
test_unbind_speed | 1.2464ms | 0.2854ms | 3.5039 KOps/s | 3.5394 KOps/s | |
test_unbind_speed_stack0 | 0.3210ms | 0.2799ms | 3.5727 KOps/s | 3.5468 KOps/s | |
test_unbind_speed_stack1 | 92.4051ms | 0.7214ms | 1.3861 KOps/s | 1.4032 KOps/s | |
test_split | 93.6898ms | 2.1896ms | 456.7096 Ops/s | 456.7079 Ops/s | |
test_chunk | 95.7193ms | 2.1716ms | 460.4883 Ops/s | 449.1022 Ops/s | |
test_creation[device0] | 0.3299ms | 0.1283ms | 7.7955 KOps/s | 7.3847 KOps/s | |
test_creation_from_tensor | 0.3534ms | 0.1297ms | 7.7075 KOps/s | 7.3874 KOps/s | |
test_add_one[memmap_tensor0] | 0.2947ms | 9.1390μs | 109.4206 KOps/s | 112.3311 KOps/s | |
test_contiguous[memmap_tensor0] | 29.9600μs | 2.1996μs | 454.6291 KOps/s | 446.2509 KOps/s | |
test_stack[memmap_tensor0] | 34.7400μs | 6.9111μs | 144.6940 KOps/s | 147.8753 KOps/s | |
test_memmaptd_index | 1.1135ms | 0.4357ms | 2.2951 KOps/s | 2.2887 KOps/s | |
test_memmaptd_index_astensor | 0.7584ms | 0.5087ms | 1.9656 KOps/s | 1.9775 KOps/s | |
test_memmaptd_index_op | 1.4869ms | 1.0628ms | 940.9472 Ops/s | 957.4025 Ops/s | |
test_serialize_model | 0.1320s | 0.1302s | 7.6818 Ops/s | 7.6702 Ops/s | |
test_serialize_model_pickle | 1.3477s | 1.2149s | 0.8231 Ops/s | 0.8217 Ops/s | |
test_serialize_weights | 0.1304s | 0.1298s | 7.7024 Ops/s | 7.7090 Ops/s | |
test_serialize_weights_returnearly | 0.2338s | 63.8411ms | 15.6639 Ops/s | 21.7070 Ops/s | |
test_serialize_weights_pickle | 1.3464s | 1.1869s | 0.8426 Ops/s | 0.8212 Ops/s | |
test_reshape_pytree | 68.9510μs | 35.9509μs | 27.8157 KOps/s | 27.2994 KOps/s | |
test_reshape_td | 64.7700μs | 41.6206μs | 24.0266 KOps/s | 23.5485 KOps/s | |
test_view_pytree | 68.1410μs | 35.4729μs | 28.1905 KOps/s | 27.7521 KOps/s | |
test_view_td | 87.3110μs | 47.9377μs | 20.8604 KOps/s | 21.5910 KOps/s | |
test_unbind_pytree | 62.5610μs | 33.5538μs | 29.8029 KOps/s | 28.9633 KOps/s | |
test_unbind_td | 0.4897ms | 43.8448μs | 22.8077 KOps/s | 22.6734 KOps/s | |
test_split_pytree | 73.4500μs | 44.9214μs | 22.2611 KOps/s | 21.8305 KOps/s | |
test_split_td | 0.6885ms | 55.7707μs | 17.9306 KOps/s | 17.8082 KOps/s | |
test_add_pytree | 0.1034ms | 58.9430μs | 16.9655 KOps/s | 17.1917 KOps/s | |
test_add_td | 0.1348ms | 96.8157μs | 10.3289 KOps/s | 10.5965 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2103ms | 0.1615ms | 6.1908 KOps/s | 6.1999 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3412ms | 0.1615ms | 6.1909 KOps/s | 6.1510 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2078ms | 0.1553ms | 6.4382 KOps/s | 6.4377 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.5916ms | 0.1941ms | 5.1510 KOps/s | 5.3417 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.4134ms | 21.9716μs | 45.5132 KOps/s | 46.5544 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.4353ms | 49.5398μs | 20.1858 KOps/s | 20.7673 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4489ms | 65.4155μs | 15.2869 KOps/s | 15.4423 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4339ms | 49.5775μs | 20.1705 KOps/s | 20.0369 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3693ms | 0.3211ms | 3.1142 KOps/s | 3.1267 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6151ms | 0.2325ms | 4.3016 KOps/s | 4.2608 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1834ms | 0.1289ms | 7.7551 KOps/s | 7.7670 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4560ms | 66.6373μs | 15.0066 KOps/s | 14.8355 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.7377ms | 0.3302ms | 3.0280 KOps/s | 3.0441 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 1.0559ms | 0.6605ms | 1.5139 KOps/s | 1.5582 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.6944ms | 0.2854ms | 3.5037 KOps/s | 3.5109 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3986ms | 0.3242ms | 3.0846 KOps/s | 3.1045 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.4973ms | 78.1709μs | 12.7925 KOps/s | 12.6191 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1934ms | 0.1318ms | 7.5890 KOps/s | 7.5102 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6651ms | 0.5569ms | 1.7956 KOps/s | 1.8717 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3799ms | 0.3279ms | 3.0501 KOps/s | 3.0493 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 52.2000μs | 21.1580μs | 47.2634 KOps/s | 46.1194 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 94.0310μs | 38.8919μs | 25.7123 KOps/s | 26.0708 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1142ms | 70.7920μs | 14.1259 KOps/s | 14.2819 KOps/s | |
test_compile_copy_flat[pytree-eager] | 86.8700μs | 52.0999μs | 19.1939 KOps/s | 19.3094 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3703ms | 0.8291ms | 1.2061 KOps/s | 1.1142 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.3489ms | 3.2448ms | 308.1824 Ops/s | 301.8191 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3956ms | 0.8404ms | 1.1899 KOps/s | 1.0914 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.4542ms | 3.3412ms | 299.2960 Ops/s | 302.1734 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2623ms | 0.1207ms | 8.2878 KOps/s | 8.4021 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1983ms | 64.7590μs | 15.4419 KOps/s | 15.2296 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1604ms | 0.1154ms | 8.6663 KOps/s | 8.3779 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1087ms | 47.0703μs | 21.2448 KOps/s | 20.5564 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1605ms | 0.1189ms | 8.4136 KOps/s | 8.3278 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 91.9010μs | 46.8301μs | 21.3538 KOps/s | 20.6768 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1898ms | 0.1482ms | 6.7498 KOps/s | 6.8547 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1505ms | 24.2609μs | 41.2186 KOps/s | 39.7948 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1910ms | 0.1436ms | 6.9628 KOps/s | 7.1399 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 60.0510μs | 20.9214μs | 47.7981 KOps/s | 46.9555 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2016ms | 0.1452ms | 6.8862 KOps/s | 7.0965 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 66.1210μs | 20.6572μs | 48.4094 KOps/s | 47.6845 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2798ms | 0.1506ms | 6.6421 KOps/s | 6.7805 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5104ms | 24.0576μs | 41.5669 KOps/s | 40.4454 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1908ms | 0.1458ms | 6.8570 KOps/s | 6.7937 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 52.3010μs | 20.7712μs | 48.1435 KOps/s | 48.1111 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1875ms | 0.1410ms | 7.0915 KOps/s | 6.8702 KOps/s | |
test_compile_indexing[int-pytree-eager] | 54.1600μs | 20.6073μs | 48.5264 KOps/s | 45.3173 KOps/s | |
test_mod_add[eager] | 75.4110μs | 34.0144μs | 29.3993 KOps/s | 29.8975 KOps/s | |
test_mod_add[compile] | 0.1300ms | 83.9008μs | 11.9188 KOps/s | 12.3315 KOps/s | |
test_mod_add[compile-overhead] | 0.3086ms | 0.1541ms | 6.4910 KOps/s | 6.3617 KOps/s | |
test_mod_wrap[eager] | 0.3678ms | 0.2459ms | 4.0664 KOps/s | 3.9926 KOps/s | |
test_mod_wrap[compile] | 1.4557ms | 0.3086ms | 3.2405 KOps/s | 3.2780 KOps/s | |
test_mod_wrap[compile-overhead] | 7.7718ms | 4.0832ms | 244.9030 Ops/s | 244.4356 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.7722ms | 1.3872ms | 720.8765 Ops/s | 677.7776 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.7679ms | 1.3409ms | 745.7887 Ops/s | 687.8552 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3393ms | 0.9089ms | 1.1003 KOps/s | 986.1842 Ops/s | |
test_seq_add[eager] | 0.1702ms | 0.1018ms | 9.8186 KOps/s | 10.0245 KOps/s | |
test_seq_add[compile] | 0.1392ms | 91.0130μs | 10.9874 KOps/s | 10.9788 KOps/s | |
test_seq_add[compile-overhead] | 0.1685ms | 0.1245ms | 8.0337 KOps/s | 8.0065 KOps/s | |
test_seq_wrap[eager] | 0.4517ms | 0.3832ms | 2.6099 KOps/s | 2.4080 KOps/s | |
test_seq_wrap[compile] | 0.3635ms | 0.3150ms | 3.1750 KOps/s | 3.0712 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2637ms | 0.2195ms | 4.5561 KOps/s | 4.4608 KOps/s | |
test_func_call_runtime[False-eager] | 0.8499ms | 0.7504ms | 1.3326 KOps/s | 1.3174 KOps/s | |
test_func_call_runtime[False-compile] | 0.9698ms | 0.7881ms | 1.2689 KOps/s | 1.2396 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4178ms | 0.3629ms | 2.7554 KOps/s | 2.7390 KOps/s | |
test_func_call_runtime[True-eager] | 0.9675ms | 0.9067ms | 1.1029 KOps/s | 1.0766 KOps/s | |
test_func_call_runtime[True-compile] | 0.9460ms | 0.8134ms | 1.2294 KOps/s | 1.1638 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4321ms | 0.3847ms | 2.5992 KOps/s | 2.6032 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8175ms | 0.7386ms | 1.3538 KOps/s | 1.3133 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8833ms | 0.7911ms | 1.2641 KOps/s | 1.2408 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4141ms | 0.3637ms | 2.7497 KOps/s | 2.7413 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1194ms | 1.0165ms | 983.7746 Ops/s | 959.3121 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8951ms | 0.8382ms | 1.1931 KOps/s | 1.1559 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4557ms | 0.4093ms | 2.4435 KOps/s | 2.4247 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5795ms | 2.1220ms | 471.2546 Ops/s | 463.3930 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9134ms | 0.8552ms | 1.1693 KOps/s | 1.1600 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4682ms | 0.4137ms | 2.4174 KOps/s | 2.4096 KOps/s | |
test_distributed | 0.8823ms | 0.1581ms | 6.3264 KOps/s | 8.8268 KOps/s | |
test_tdmodule | 35.0110μs | 14.8771μs | 67.2174 KOps/s | 64.6674 KOps/s | |
test_tdmodule_dispatch | 49.6900μs | 29.4736μs | 33.9287 KOps/s | 32.4082 KOps/s | |
test_tdseq | 37.7400μs | 16.2875μs | 61.3967 KOps/s | 62.7185 KOps/s | |
test_tdseq_dispatch | 55.1910μs | 32.6905μs | 30.5900 KOps/s | 31.2895 KOps/s | |
test_instantiation_functorch | 2.0362ms | 1.8941ms | 527.9653 Ops/s | 522.6955 Ops/s | |
test_exec_functorch | 0.2582ms | 0.2120ms | 4.7170 KOps/s | 4.6857 KOps/s | |
test_exec_functional_call | 0.2894ms | 0.2075ms | 4.8193 KOps/s | 4.6728 KOps/s | |
test_exec_td_decorator | 0.4476ms | 0.2629ms | 3.8039 KOps/s | 3.7677 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8220ms | 0.6920ms | 1.4451 KOps/s | 1.4228 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8375ms | 0.6951ms | 1.4386 KOps/s | 1.3882 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7909ms | 0.6287ms | 1.5906 KOps/s | 1.5921 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7496ms | 0.6189ms | 1.6157 KOps/s | 1.6178 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.6934ms | 19.9301ms | 50.1752 Ops/s | 49.9762 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.1204ms | 19.9235ms | 50.1919 Ops/s | 49.8920 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.6135ms | 19.8065ms | 50.4884 Ops/s | 50.2952 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.1576ms | 19.8101ms | 50.4794 Ops/s | 50.2582 Ops/s | |
test_to_module_speed[True] | 1.3865ms | 0.9945ms | 1.0055 KOps/s | 981.5413 Ops/s | |
test_to_module_speed[False] | 1.4250ms | 0.9870ms | 1.0131 KOps/s | 1.0040 KOps/s | |
test_tc_init | 71.3310μs | 35.4989μs | 28.1699 KOps/s | 28.5094 KOps/s | |
test_tc_init_nested | 0.1119ms | 70.3091μs | 14.2229 KOps/s | 14.4859 KOps/s | |
test_tc_first_layer_tensor | 5.6757μs | 0.6759μs | 1.4796 MOps/s | 1.4945 MOps/s | |
test_tc_first_layer_nontensor | 31.8300μs | 2.2672μs | 441.0716 KOps/s | 451.2286 KOps/s | |
test_tc_second_layer_tensor | 7.1275μs | 1.3699μs | 729.9750 KOps/s | 723.9189 KOps/s | |
test_tc_second_layer_nontensor | 27.9900μs | 2.9766μs | 335.9504 KOps/s | 337.3669 KOps/s | |
test_unbind | 0.1913s | 9.6159ms | 103.9942 Ops/s | 92.4687 Ops/s | |
test_full_like | 0.6570ms | 0.5744ms | 1.7410 KOps/s | 1.7429 KOps/s | |
test_zeros_like | 0.2611ms | 0.1980ms | 5.0511 KOps/s | 5.0481 KOps/s | |
test_ones_like | 0.2421ms | 0.1977ms | 5.0569 KOps/s | 5.0530 KOps/s | |
test_clone | 0.4489ms | 0.4148ms | 2.4106 KOps/s | 2.4096 KOps/s | |
test_squeeze | 39.7200μs | 9.7707μs | 102.3469 KOps/s | 99.7941 KOps/s | |
test_unsqueeze | 0.2202ms | 73.9952μs | 13.5144 KOps/s | 13.3027 KOps/s | |
test_split | 0.3929ms | 0.1578ms | 6.3364 KOps/s | 6.2583 KOps/s | |
test_permute | 0.2327ms | 0.1875ms | 5.3331 KOps/s | 5.6208 KOps/s | |
test_stack | 1.2855ms | 0.8591ms | 1.1640 KOps/s | 1.1712 KOps/s | |
test_cat | 1.2556ms | 1.2313ms | 812.1371 Ops/s | 811.9746 Ops/s |

vmoens added a commit that referenced this pull request on Oct 11, 2024
ghstack-source-id: 672dc8b82c0b025feab98d061b5241536fa040c0
Pull Request resolved: #1037
Labels: CLA Signed
Stack from ghstack (oldest at bottom):