[BugFix] Fix mem leak when locking #1188
Merged
vmoens added a commit that referenced this pull request on Jan 21, 2025
ghstack-source-id: 7fb551a371fbd44a695005a9c8b0976dd061bcb4
Pull Request resolved: #1188
facebook-github-bot added the CLA Signed label on Jan 21, 2025
vmoens added a commit that referenced this pull request on Jan 21, 2025
ghstack-source-id: d6e44e1d9b9afc9903a0f45945c10a94dcf5a0ca
Pull Request resolved: #1188
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.4007ms | 11.6004μs | 86.2043 KOps/s | 75.5907 KOps/s | |
test_plain_set_stack_nested | 47.3710μs | 11.6794μs | 85.6211 KOps/s | 74.7623 KOps/s | |
test_plain_set_nested_inplace | 0.4051ms | 12.5074μs | 79.9529 KOps/s | 69.8456 KOps/s | |
test_plain_set_stack_nested_inplace | 34.3610μs | 12.6287μs | 79.1845 KOps/s | 69.4316 KOps/s | |
test_items | 0.3871ms | 3.0380μs | 329.1686 KOps/s | 341.1411 KOps/s | |
test_items_nested | 0.7512ms | 0.3582ms | 2.7920 KOps/s | 2.7623 KOps/s | |
test_items_nested_locked | 0.7547ms | 0.3629ms | 2.7556 KOps/s | 2.7799 KOps/s | |
test_items_nested_leaf | 0.1110ms | 59.8400μs | 16.7112 KOps/s | 17.1530 KOps/s | |
test_items_stack_nested | 0.7762ms | 0.3612ms | 2.7689 KOps/s | 2.7748 KOps/s | |
test_items_stack_nested_leaf | 0.4596ms | 61.3879μs | 16.2899 KOps/s | 16.4970 KOps/s | |
test_items_stack_nested_locked | 0.7531ms | 0.3603ms | 2.7757 KOps/s | 2.7367 KOps/s | |
test_keys | 0.3981ms | 3.5027μs | 285.4909 KOps/s | 285.7347 KOps/s | |
test_keys_nested | 0.4844ms | 90.2888μs | 11.0756 KOps/s | 11.1242 KOps/s | |
test_keys_nested_locked | 0.7236ms | 95.7704μs | 10.4416 KOps/s | 10.3903 KOps/s | |
test_keys_nested_leaf | 0.1091ms | 80.2680μs | 12.4583 KOps/s | 12.4306 KOps/s | |
test_keys_stack_nested | 0.4889ms | 91.8484μs | 10.8875 KOps/s | 11.0673 KOps/s | |
test_keys_stack_nested_leaf | 0.4873ms | 82.7098μs | 12.0905 KOps/s | 12.2298 KOps/s | |
test_keys_stack_nested_locked | 0.5027ms | 97.1946μs | 10.2886 KOps/s | 10.3802 KOps/s | |
test_values | 67.3212μs | 0.8670μs | 1.1534 MOps/s | 1.1495 MOps/s | |
test_values_nested | 0.4413ms | 38.5244μs | 25.9576 KOps/s | 26.3171 KOps/s | |
test_values_nested_locked | 0.4383ms | 39.9293μs | 25.0443 KOps/s | 25.0943 KOps/s | |
test_values_nested_leaf | 71.0810μs | 42.5072μs | 23.5254 KOps/s | 23.6508 KOps/s | |
test_values_stack_nested | 0.4409ms | 39.2632μs | 25.4692 KOps/s | 25.7649 KOps/s | |
test_values_stack_nested_leaf | 0.4631ms | 43.1626μs | 23.1682 KOps/s | 23.2910 KOps/s | |
test_values_stack_nested_locked | 0.4436ms | 40.9115μs | 24.4430 KOps/s | 24.6781 KOps/s | |
test_membership | 20.1368μs | 0.5101μs | 1.9604 MOps/s | 1.9667 MOps/s | |
test_membership_nested | 0.2026ms | 2.0371μs | 490.9030 KOps/s | 486.0788 KOps/s | |
test_membership_nested_leaf | 0.2103ms | 2.0458μs | 488.8125 KOps/s | 498.3231 KOps/s | |
test_membership_stacked_nested | 40.5510μs | 2.1199μs | 471.7123 KOps/s | 481.6687 KOps/s | |
test_membership_stacked_nested_leaf | 24.9610μs | 2.0909μs | 478.2598 KOps/s | 480.8882 KOps/s | |
test_membership_nested_last | 0.4127ms | 3.0797μs | 324.7094 KOps/s | 325.7793 KOps/s | |
test_membership_nested_leaf_last | 25.7300μs | 3.0868μs | 323.9608 KOps/s | 326.7797 KOps/s | |
test_membership_stacked_nested_last | 0.4136ms | 3.0979μs | 322.8014 KOps/s | 133.6521 KOps/s | |
test_membership_stacked_nested_leaf_last | 27.6900μs | 3.0630μs | 326.4759 KOps/s | 134.5825 KOps/s | |
test_nested_getleaf | 0.4117ms | 6.1747μs | 161.9514 KOps/s | 164.1918 KOps/s | |
test_nested_get | 25.7500μs | 5.7847μs | 172.8702 KOps/s | 172.3890 KOps/s | |
test_stacked_getleaf | 0.4109ms | 6.1201μs | 163.3948 KOps/s | 161.5493 KOps/s | |
test_stacked_get | 40.1610μs | 5.7802μs | 173.0031 KOps/s | 172.6725 KOps/s | |
test_nested_getitemleaf | 0.4075ms | 6.4226μs | 155.7013 KOps/s | 156.6237 KOps/s | |
test_nested_getitem | 28.9710μs | 6.1246μs | 163.2758 KOps/s | 163.1197 KOps/s | |
test_stacked_getitemleaf | 0.4207ms | 6.3723μs | 156.9300 KOps/s | 156.0931 KOps/s | |
test_stacked_getitem | 28.9100μs | 6.0679μs | 164.8019 KOps/s | 162.9010 KOps/s | |
test_lock_nested | 0.4267ms | 0.3450ms | 2.8988 KOps/s | 2.6767 KOps/s | |
test_lock_stack_nested | 0.3962ms | 0.3515ms | 2.8451 KOps/s | 2.9267 KOps/s | |
test_unlock_nested | 0.4071ms | 0.2894ms | 3.4555 KOps/s | 3.1890 KOps/s | |
test_unlock_stack_nested | 0.6904ms | 0.2904ms | 3.4438 KOps/s | 3.5664 KOps/s | |
test_flatten_speed | 0.4609ms | 77.0830μs | 12.9730 KOps/s | 13.2657 KOps/s | |
test_unflatten_speed | 0.7207ms | 0.3206ms | 3.1193 KOps/s | 3.1196 KOps/s | |
test_common_ops | 1.0045ms | 0.5991ms | 1.6692 KOps/s | 1.5331 KOps/s | |
test_creation | 0.1147ms | 1.7814μs | 561.3499 KOps/s | 563.6669 KOps/s | |
test_creation_empty | 37.4300μs | 6.9713μs | 143.4446 KOps/s | 96.9585 KOps/s | |
test_creation_nested_1 | 37.8110μs | 8.6147μs | 116.0809 KOps/s | 82.5322 KOps/s | |
test_creation_nested_2 | 0.4163ms | 11.3797μs | 87.8757 KOps/s | 67.7719 KOps/s | |
test_clone | 65.4320μs | 11.0464μs | 90.5270 KOps/s | 95.0749 KOps/s | |
test_getitem[int] | 1.2572ms | 10.9214μs | 91.5633 KOps/s | 91.4793 KOps/s | |
test_getitem[slice_int] | 0.4304ms | 21.6899μs | 46.1044 KOps/s | 47.5980 KOps/s | |
test_getitem[range] | 0.1293ms | 38.0332μs | 26.2928 KOps/s | 26.8218 KOps/s | |
test_getitem[tuple] | 0.1111ms | 18.4844μs | 54.0997 KOps/s | 54.2345 KOps/s | |
test_getitem[list] | 0.4543ms | 35.7527μs | 27.9699 KOps/s | 30.3281 KOps/s | |
test_setitem_dim[int] | 44.6800μs | 22.1920μs | 45.0613 KOps/s | 50.4938 KOps/s | |
test_setitem_dim[slice_int] | 63.3110μs | 41.8032μs | 23.9216 KOps/s | 25.8223 KOps/s | |
test_setitem_dim[range] | 94.5110μs | 57.9189μs | 17.2655 KOps/s | 19.3175 KOps/s | |
test_setitem_dim[tuple] | 58.7310μs | 35.6326μs | 28.0642 KOps/s | 30.0027 KOps/s | |
test_setitem | 0.4253ms | 15.0684μs | 66.3641 KOps/s | 59.9753 KOps/s | |
test_set | 66.9710μs | 14.2530μs | 70.1607 KOps/s | 61.5582 KOps/s | |
test_set_shared | 0.7207ms | 0.1619ms | 6.1763 KOps/s | 6.6097 KOps/s | |
test_update | 0.3946ms | 16.5097μs | 60.5705 KOps/s | 50.1657 KOps/s | |
test_update_nested | 0.4309ms | 22.5284μs | 44.3885 KOps/s | 39.2502 KOps/s | |
test_update__nested | 0.5359ms | 26.7570μs | 37.3735 KOps/s | 38.9710 KOps/s | |
test_set_nested | 76.6810μs | 16.0393μs | 62.3471 KOps/s | 58.9091 KOps/s | |
test_set_nested_new | 0.4246ms | 18.3624μs | 54.4591 KOps/s | 50.4716 KOps/s | |
test_select | 58.0810μs | 30.0417μs | 33.2871 KOps/s | 30.5046 KOps/s | |
test_select_nested | 81.5020μs | 44.9123μs | 22.2656 KOps/s | 22.2075 KOps/s | |
test_exclude_nested | 0.4555ms | 63.8708μs | 15.6566 KOps/s | 15.5901 KOps/s | |
test_empty[True] | 0.6992ms | 0.2999ms | 3.3339 KOps/s | 3.3280 KOps/s | |
test_empty[False] | 4.3221μs | 0.8333μs | 1.2001 MOps/s | 1.2010 MOps/s | |
test_to | 92.1810μs | 58.6080μs | 17.0625 KOps/s | 17.2986 KOps/s | |
test_to_nonblocking | 94.9820μs | 50.5814μs | 19.7701 KOps/s | 20.8906 KOps/s | |
test_unbind_speed | 0.2908ms | 0.2538ms | 3.9402 KOps/s | 4.1835 KOps/s | |
test_unbind_speed_stack0 | 0.6567ms | 0.2420ms | 4.1318 KOps/s | 4.2177 KOps/s | |
test_unbind_speed_stack1 | 94.2833ms | 0.7292ms | 1.3714 KOps/s | 1.5136 KOps/s | |
test_split | 96.1732ms | 1.6160ms | 618.8166 Ops/s | 628.9577 Ops/s | |
test_chunk | 95.6180ms | 1.6210ms | 616.8947 Ops/s | 622.0507 Ops/s | |
test_consolidate[False-None] | 3.4888ms | 2.6817ms | 372.8921 Ops/s | 330.2746 Ops/s | |
test_consolidate[default-None] | 1.7735ms | 1.6889ms | 592.1160 Ops/s | 591.4928 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.7950ms | 1.7322ms | 577.3135 Ops/s | 574.3208 Ops/s | |
test_consolidate_njt[False-None] | 6.8572ms | 6.6128ms | 151.2213 Ops/s | 108.9083 Ops/s | |
test_to[False-False-None] | 1.8462ms | 1.7648ms | 566.6245 Ops/s | 576.7769 Ops/s | |
test_to[True-False-None] | 1.6077ms | 1.3247ms | 754.8976 Ops/s | 726.8508 Ops/s | |
test_to[within-False-None] | 4.2650ms | 4.1616ms | 240.2903 Ops/s | 240.3570 Ops/s | |
test_to[True-default-None] | 5.6490ms | 5.4220ms | 184.4343 Ops/s | 189.7206 Ops/s | |
test_to_njt[False-False-None] | 7.4056ms | 7.0378ms | 142.0890 Ops/s | 143.2606 Ops/s | |
test_to_njt[True-False-None] | 5.7697ms | 5.5114ms | 181.4437 Ops/s | 177.1123 Ops/s | |
test_to_njt[within-False-None] | 12.7666ms | 12.2051ms | 81.9332 Ops/s | 80.6033 Ops/s | |
test_creation[device0] | 0.4500ms | 81.0671μs | 12.3355 KOps/s | 12.4394 KOps/s | |
test_creation_from_tensor | 0.4561ms | 85.1282μs | 11.7470 KOps/s | 11.9203 KOps/s | |
test_add_one[memmap_tensor0] | 0.4119ms | 6.9548μs | 143.7846 KOps/s | 148.5809 KOps/s | |
test_contiguous[memmap_tensor0] | 1.8896μs | 0.4250μs | 2.3529 MOps/s | 2.4358 MOps/s | |
test_stack[memmap_tensor0] | 39.2800μs | 4.6634μs | 214.4369 KOps/s | 225.5850 KOps/s | |
test_memmaptd_index | 1.4354ms | 0.2453ms | 4.0770 KOps/s | 3.8637 KOps/s | |
test_memmaptd_index_astensor | 0.4503ms | 0.3080ms | 3.2471 KOps/s | 3.0948 KOps/s | |
test_memmaptd_index_op | 0.7306ms | 0.5693ms | 1.7565 KOps/s | 1.6007 KOps/s | |
test_serialize_model | 0.4188s | 0.1725s | 5.7961 Ops/s | 7.6174 Ops/s | |
test_serialize_model_pickle | 1.3480s | 1.2145s | 0.8234 Ops/s | 0.8423 Ops/s | |
test_serialize_weights | 0.1314s | 0.1306s | 7.6560 Ops/s | 7.6311 Ops/s | |
test_serialize_weights_returnearly | 0.3182s | 54.7309ms | 18.2712 Ops/s | 11.8312 Ops/s | |
test_serialize_weights_pickle | 1.3808s | 1.2160s | 0.8224 Ops/s | 0.8216 Ops/s | |
test_reshape_pytree | 64.5010μs | 22.0321μs | 45.3884 KOps/s | 45.6382 KOps/s | |
test_reshape_td | 61.8110μs | 26.8533μs | 37.2394 KOps/s | 36.7171 KOps/s | |
test_view_pytree | 51.8410μs | 22.0873μs | 45.2750 KOps/s | 46.0049 KOps/s | |
test_view_td | 65.3710μs | 32.4242μs | 30.8411 KOps/s | 31.3263 KOps/s | |
test_unbind_pytree | 62.8110μs | 27.9719μs | 35.7502 KOps/s | 35.7781 KOps/s | |
test_unbind_td | 0.7608ms | 38.7021μs | 25.8384 KOps/s | 27.0514 KOps/s | |
test_split_pytree | 67.5610μs | 32.0571μs | 31.1943 KOps/s | 33.7170 KOps/s | |
test_split_td | 0.9712ms | 39.5031μs | 25.3144 KOps/s | 25.1146 KOps/s | |
test_add_pytree | 71.2810μs | 37.8686μs | 26.4071 KOps/s | 29.4389 KOps/s | |
test_add_td | 0.1032ms | 51.4527μs | 19.4353 KOps/s | 19.4935 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1779ms | 0.1303ms | 7.6768 KOps/s | 7.8349 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2363ms | 0.1358ms | 7.3658 KOps/s | 7.4893 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1413ms | 97.5170μs | 10.2546 KOps/s | 10.0368 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.3879ms | 0.1637ms | 6.1103 KOps/s | 6.8094 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 62.2410μs | 26.0079μs | 38.4499 KOps/s | 35.7494 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 61.8100μs | 30.7547μs | 32.5154 KOps/s | 33.1277 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2831ms | 64.9643μs | 15.3931 KOps/s | 15.3138 KOps/s | |
test_compile_copy_nested[pytree-eager] | 88.4010μs | 49.1738μs | 20.3360 KOps/s | 20.3309 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1919ms | 0.1436ms | 6.9615 KOps/s | 7.1393 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3124ms | 0.2214ms | 4.5162 KOps/s | 4.6255 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1576ms | 99.0117μs | 10.0998 KOps/s | 10.2914 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1310ms | 56.7684μs | 17.6154 KOps/s | 17.9399 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1820ms | 0.1361ms | 7.3495 KOps/s | 7.3920 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5607ms | 0.5053ms | 1.9792 KOps/s | 2.1352 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4187ms | 0.2650ms | 3.7741 KOps/s | 3.8262 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1874ms | 0.1444ms | 6.9262 KOps/s | 7.1049 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1885ms | 69.4620μs | 14.3964 KOps/s | 14.8338 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1455ms | 0.1005ms | 9.9539 KOps/s | 10.1220 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5217ms | 0.4197ms | 2.3828 KOps/s | 2.5154 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1913ms | 0.1356ms | 7.3766 KOps/s | 7.4939 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1317ms | 19.1304μs | 52.2727 KOps/s | 37.9328 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 75.2010μs | 31.1225μs | 32.1311 KOps/s | 31.8256 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1109ms | 70.2846μs | 14.2279 KOps/s | 14.2351 KOps/s | |
test_compile_copy_flat[pytree-eager] | 76.9110μs | 51.3646μs | 19.4687 KOps/s | 19.4899 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6135ms | 0.3881ms | 2.5769 KOps/s | 2.2254 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9011ms | 2.7446ms | 364.3544 Ops/s | 378.0414 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5841ms | 0.3794ms | 2.6354 KOps/s | 2.2877 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8308ms | 2.7339ms | 365.7833 Ops/s | 386.5164 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.5344ms | 0.1147ms | 8.7203 KOps/s | 8.7173 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5583ms | 80.7067μs | 12.3905 KOps/s | 12.1110 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.6214ms | 0.1071ms | 9.3347 KOps/s | 9.3818 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1275ms | 70.5856μs | 14.1672 KOps/s | 13.9604 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1639ms | 0.1140ms | 8.7682 KOps/s | 9.0201 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1185ms | 73.3274μs | 13.6375 KOps/s | 13.7171 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1551ms | 0.1035ms | 9.6634 KOps/s | 9.9233 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1490ms | 17.7522μs | 56.3309 KOps/s | 56.2640 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1509ms | 98.0780μs | 10.1960 KOps/s | 10.4693 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 51.8110μs | 16.3016μs | 61.3436 KOps/s | 62.9009 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1470ms | 99.5338μs | 10.0468 KOps/s | 10.2215 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 46.3010μs | 16.1761μs | 61.8194 KOps/s | 62.7765 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1507ms | 0.1030ms | 9.7118 KOps/s | 9.9711 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5400ms | 17.4452μs | 57.3223 KOps/s | 57.3518 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1456ms | 96.4258μs | 10.3707 KOps/s | 10.0365 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1232ms | 17.6364μs | 56.7008 KOps/s | 63.0791 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1480ms | 96.3818μs | 10.3754 KOps/s | 10.1852 KOps/s | |
test_compile_indexing[int-pytree-eager] | 60.0610μs | 16.1562μs | 61.8956 KOps/s | 62.5160 KOps/s | |
test_mod_add[eager] | 82.5410μs | 40.4614μs | 24.7149 KOps/s | 23.6019 KOps/s | |
test_mod_add[compile] | 0.3338ms | 84.2467μs | 11.8699 KOps/s | 11.4883 KOps/s | |
test_mod_add[compile-overhead] | 0.3281ms | 0.1684ms | 5.9378 KOps/s | 5.7013 KOps/s | |
test_mod_wrap[eager] | 0.3363ms | 0.2639ms | 3.7896 KOps/s | 3.7398 KOps/s | |
test_mod_wrap[compile] | 0.3476ms | 0.2842ms | 3.5189 KOps/s | 3.4760 KOps/s | |
test_mod_wrap[compile-overhead] | 6.3913ms | 3.5487ms | 281.7968 Ops/s | 267.3155 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5417ms | 1.3855ms | 721.7727 Ops/s | 686.1055 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3755ms | 1.2903ms | 775.0047 Ops/s | 725.6111 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3753ms | 0.9294ms | 1.0759 KOps/s | 913.1650 Ops/s | |
test_seq_add[eager] | 0.1668ms | 0.1148ms | 8.7145 KOps/s | 7.9629 KOps/s | |
test_seq_add[compile] | 0.1375ms | 87.7992μs | 11.3896 KOps/s | 11.2810 KOps/s | |
test_seq_add[compile-overhead] | 0.1987ms | 0.1306ms | 7.6584 KOps/s | 7.7831 KOps/s | |
test_seq_wrap[eager] | 0.5073ms | 0.4165ms | 2.4012 KOps/s | 2.3053 KOps/s | |
test_seq_wrap[compile] | 0.3524ms | 0.3009ms | 3.3232 KOps/s | 3.2372 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2782ms | 0.2300ms | 4.3485 KOps/s | 4.4078 KOps/s | |
test_func_call_runtime[False-eager] | 0.8585ms | 0.7777ms | 1.2859 KOps/s | 1.2760 KOps/s | |
test_func_call_runtime[False-compile] | 0.9930ms | 0.7567ms | 1.3216 KOps/s | 1.3354 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4345ms | 0.3632ms | 2.7535 KOps/s | 2.7541 KOps/s | |
test_func_call_runtime[True-eager] | 1.0292ms | 0.9184ms | 1.0889 KOps/s | 1.0972 KOps/s | |
test_func_call_runtime[True-compile] | 0.8363ms | 0.7746ms | 1.2910 KOps/s | 1.3035 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4370ms | 0.3899ms | 2.5648 KOps/s | 2.6089 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9359ms | 0.7816ms | 1.2794 KOps/s | 1.3468 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.1619ms | 0.7588ms | 1.3179 KOps/s | 1.3275 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4389ms | 0.3658ms | 2.7337 KOps/s | 2.7430 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2399ms | 1.0202ms | 980.2473 Ops/s | 985.3750 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1129ms | 0.9978ms | 1.0022 KOps/s | 1.2528 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0639ms | 1.0029ms | 997.0991 Ops/s | 2.4310 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6592ms | 2.1511ms | 464.8731 Ops/s | 473.5371 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8812ms | 0.8166ms | 1.2246 KOps/s | 1.1856 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4627ms | 0.4152ms | 2.4086 KOps/s | 2.3929 KOps/s | |
test_distributed | 7.5318ms | 0.1660ms | 6.0230 KOps/s | 8.4273 KOps/s | |
test_tdmodule | 0.1726ms | 19.3394μs | 51.7080 KOps/s | 47.0643 KOps/s | |
test_tdmodule_dispatch | 54.6810μs | 33.9641μs | 29.4429 KOps/s | 25.4988 KOps/s | |
test_tdseq | 40.1510μs | 19.7950μs | 50.5178 KOps/s | 44.6915 KOps/s | |
test_tdseq_dispatch | 57.2110μs | 36.8694μs | 27.1228 KOps/s | 23.5634 KOps/s | |
test_instantiation_functorch | 2.1362ms | 1.5758ms | 634.6160 Ops/s | 632.1773 Ops/s | |
test_exec_functorch | 0.1950ms | 0.1506ms | 6.6406 KOps/s | 6.9297 KOps/s | |
test_exec_functional_call | 0.1777ms | 0.1425ms | 7.0162 KOps/s | 7.3122 KOps/s | |
test_exec_td_decorator | 0.3884ms | 0.1931ms | 5.1793 KOps/s | 5.3782 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8326ms | 0.6904ms | 1.4484 KOps/s | 1.4312 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7943ms | 0.6899ms | 1.4494 KOps/s | 1.3797 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7118ms | 0.6042ms | 1.6550 KOps/s | 1.5912 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7193ms | 0.6034ms | 1.6573 KOps/s | 1.5941 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.1095ms | 19.4760ms | 51.3451 Ops/s | 50.6223 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.6442ms | 19.4804ms | 51.3337 Ops/s | 51.9325 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.4887ms | 19.3196ms | 51.7609 Ops/s | 52.4214 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.2889ms | 19.5398ms | 51.1776 Ops/s | 52.3325 Ops/s | |
test_to_module_speed[True] | 1.4855ms | 0.9872ms | 1.0130 KOps/s | 990.2280 Ops/s | |
test_to_module_speed[False] | 1.0803ms | 0.9743ms | 1.0264 KOps/s | 1.0195 KOps/s | |
test_tc_init | 76.0920μs | 35.7488μs | 27.9730 KOps/s | 25.0235 KOps/s | |
test_tc_init_nested | 0.1679ms | 71.4137μs | 14.0029 KOps/s | 12.5426 KOps/s | |
test_tc_first_layer_tensor | 28.1400μs | 0.8280μs | 1.2078 MOps/s | 1.1541 MOps/s | |
test_tc_first_layer_nontensor | 36.7300μs | 2.2900μs | 436.6721 KOps/s | 421.8379 KOps/s | |
test_tc_second_layer_tensor | 26.5303μs | 1.4588μs | 685.4915 KOps/s | 678.5825 KOps/s | |
test_tc_second_layer_nontensor | 61.9210μs | 3.0080μs | 332.4509 KOps/s | 320.1565 KOps/s | |
test_unbind | 7.2559ms | 7.0157ms | 142.5379 Ops/s | 141.8157 Ops/s | |
test_full_like | 12.9426ms | 9.1721ms | 109.0261 Ops/s | 109.4730 Ops/s | |
test_zeros_like | 5.9419ms | 4.2705ms | 234.1623 Ops/s | 114.5373 Ops/s | |
test_ones_like | 4.4646ms | 4.2364ms | 236.0516 Ops/s | 241.4820 Ops/s | |
test_clone | 11.2332ms | 9.0777ms | 110.1596 Ops/s | 157.9296 Ops/s | |
test_squeeze | 47.6200μs | 9.5143μs | 105.1045 KOps/s | 103.6631 KOps/s | |
test_unsqueeze | 0.1234ms | 74.3705μs | 13.4462 KOps/s | 13.9544 KOps/s | |
test_split | 0.2084s | 0.2217ms | 4.5099 KOps/s | 6.0607 KOps/s | |
test_permute | 0.2369ms | 0.1878ms | 5.3252 KOps/s | 5.6699 KOps/s | |
test_stack | 53.1031ms | 50.2750ms | 19.8906 Ops/s | 19.6310 Ops/s | |
test_cat | 50.4501ms | 50.0801ms | 19.9680 Ops/s | 19.8950 Ops/s |
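
The Change column is empty in this copy of the table, so a reader has to compare Ops against Ops on Repo HEAD by hand. The sketch below shows one way to do that for a few rows copied from the table above. The helper names (`parse_ops`, `parse_time`, `relative_change`) are hypothetical and not part of the benchmark tooling, and the consistency check assumes that Ops is roughly the reciprocal of Mean, which holds for the rows shown up to rounding. Each row is a single measurement, so small differences may well be noise rather than an effect of this PR.

```python
# Hypothetical helpers for reading the benchmark table; not part of the PR.
OPS_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}
# Check "ms" and the micro-second spellings before plain "s".
TIME_SCALE = (("ms", 1e-3), ("μs", 1e-6), ("µs", 1e-6), ("s", 1.0))


def parse_ops(text: str) -> float:
    """Convert a cell such as '2.8988 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * OPS_SCALE[unit]


def parse_time(text: str) -> float:
    """Convert a cell such as '0.3450ms' to seconds."""
    for suffix, scale in TIME_SCALE:
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * scale
    raise ValueError(f"unrecognised time unit in {text!r}")


def relative_change(ops: str, ops_head: str) -> float:
    """Relative throughput change of this PR versus repo HEAD."""
    head = parse_ops(ops_head)
    return (parse_ops(ops) - head) / head


# (name, mean, ops, ops on repo HEAD), copied verbatim from the table.
ROWS = [
    ("test_lock_nested", "0.3450ms", "2.8988 KOps/s", "2.6767 KOps/s"),
    ("test_unlock_nested", "0.2894ms", "3.4555 KOps/s", "3.1890 KOps/s"),
    ("test_lock_stack_nested", "0.3515ms", "2.8451 KOps/s", "2.9267 KOps/s"),
]

if __name__ == "__main__":
    for name, mean, ops, head in ROWS:
        # Sanity check: mean time multiplied by throughput should be close to 1.
        consistency = parse_time(mean) * parse_ops(ops)
        print(f"{name}: change {relative_change(ops, head):+.1%}, "
              f"mean*ops = {consistency:.4f}")
```

A positive change means the PR branch measured faster than HEAD for that row; the lock and unlock rows are the ones most directly related to the PR title.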
Labels: bug, CLA Signed
Stack from ghstack (oldest at bottom):