[Feature] flexible return type when indexing prob sequences #1189
Merged
vmoens added a commit that referenced this pull request on Jan 21, 2025
ghstack-source-id: 74d28ee84d965c11c527c60b20d9123ef30007f6
Pull Request resolved: #1189
facebook-github-bot added the CLA Signed label on Jan 21, 2025
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 34.2710μs | 11.4245μs | 87.5314 KOps/s | 74.0584 KOps/s | |
test_plain_set_stack_nested | 34.1510μs | 11.6590μs | 85.7703 KOps/s | 72.6472 KOps/s | |
test_plain_set_nested_inplace | 52.6800μs | 12.5686μs | 79.5632 KOps/s | 69.0652 KOps/s | |
test_plain_set_stack_nested_inplace | 43.5010μs | 12.5399μs | 79.7454 KOps/s | 68.9896 KOps/s | |
test_items | 21.1400μs | 2.9149μs | 343.0705 KOps/s | 340.7535 KOps/s | |
test_items_nested | 0.4158ms | 0.3618ms | 2.7641 KOps/s | 2.7452 KOps/s | |
test_items_nested_locked | 0.5574ms | 0.3688ms | 2.7116 KOps/s | 2.7397 KOps/s | |
test_items_nested_leaf | 84.3020μs | 58.5174μs | 17.0889 KOps/s | 17.0415 KOps/s | |
test_items_stack_nested | 0.4154ms | 0.3646ms | 2.7425 KOps/s | 2.7356 KOps/s | |
test_items_stack_nested_leaf | 90.6520μs | 59.9455μs | 16.6818 KOps/s | 16.6472 KOps/s | |
test_items_stack_nested_locked | 0.3920ms | 0.3659ms | 2.7333 KOps/s | 2.7265 KOps/s | |
test_keys | 30.0910μs | 3.4713μs | 288.0767 KOps/s | 288.4604 KOps/s | |
test_keys_nested | 0.1269ms | 87.8884μs | 11.3781 KOps/s | 11.4683 KOps/s | |
test_keys_nested_locked | 0.7249ms | 93.9821μs | 10.6403 KOps/s | 10.7784 KOps/s | |
test_keys_nested_leaf | 0.1212ms | 78.4432μs | 12.7481 KOps/s | 12.8161 KOps/s | |
test_keys_stack_nested | 0.1513ms | 87.8316μs | 11.3854 KOps/s | 11.3594 KOps/s | |
test_keys_stack_nested_leaf | 0.1176ms | 78.8555μs | 12.6814 KOps/s | 12.6611 KOps/s | |
test_keys_stack_nested_locked | 0.1291ms | 93.1290μs | 10.7378 KOps/s | 10.6756 KOps/s | |
test_values | 5.8633μs | 0.8541μs | 1.1708 MOps/s | 1.1784 MOps/s | |
test_values_nested | 67.2910μs | 37.9280μs | 26.3658 KOps/s | 26.9034 KOps/s | |
test_values_nested_locked | 65.8210μs | 39.1954μs | 25.5132 KOps/s | 25.8360 KOps/s | |
test_values_nested_leaf | 88.8920μs | 41.7198μs | 23.9694 KOps/s | 23.9851 KOps/s | |
test_values_stack_nested | 0.1086ms | 38.5644μs | 25.9307 KOps/s | 26.4775 KOps/s | |
test_values_stack_nested_leaf | 75.6710μs | 42.3412μs | 23.6176 KOps/s | 23.5387 KOps/s | |
test_values_stack_nested_locked | 87.4610μs | 39.8939μs | 25.0665 KOps/s | 25.4017 KOps/s | |
test_membership | 2.6170μs | 0.5002μs | 1.9993 MOps/s | 1.9568 MOps/s | |
test_membership_nested | 16.1300μs | 1.9998μs | 500.0458 KOps/s | 493.2413 KOps/s | |
test_membership_nested_leaf | 17.4300μs | 2.0234μs | 494.2097 KOps/s | 493.2794 KOps/s | |
test_membership_stacked_nested | 29.6400μs | 2.1269μs | 470.1635 KOps/s | 481.1856 KOps/s | |
test_membership_stacked_nested_leaf | 32.3010μs | 2.0879μs | 478.9574 KOps/s | 481.1007 KOps/s | |
test_membership_nested_last | 36.4510μs | 3.0844μs | 324.2097 KOps/s | 317.5868 KOps/s | |
test_membership_nested_leaf_last | 39.7400μs | 3.0835μs | 324.3050 KOps/s | 315.8241 KOps/s | |
test_membership_stacked_nested_last | 54.7310μs | 8.1967μs | 122.0005 KOps/s | 243.0257 KOps/s | |
test_membership_stacked_nested_leaf_last | 35.0000μs | 8.2478μs | 121.2447 KOps/s | 242.1743 KOps/s | |
test_nested_getleaf | 37.0610μs | 6.0865μs | 164.2979 KOps/s | 162.2235 KOps/s | |
test_nested_get | 34.6510μs | 5.8415μs | 171.1877 KOps/s | 171.2673 KOps/s | |
test_stacked_getleaf | 38.9610μs | 6.1337μs | 163.0344 KOps/s | 163.8438 KOps/s | |
test_stacked_get | 33.1410μs | 5.8095μs | 172.1327 KOps/s | 173.0299 KOps/s | |
test_nested_getitemleaf | 32.4410μs | 6.4002μs | 156.2440 KOps/s | 155.5115 KOps/s | |
test_nested_getitem | 32.9410μs | 6.1759μs | 161.9202 KOps/s | 162.9436 KOps/s | |
test_stacked_getitemleaf | 37.5010μs | 6.4437μs | 155.1910 KOps/s | 155.5388 KOps/s | |
test_stacked_getitem | 27.5900μs | 6.0949μs | 164.0707 KOps/s | 163.9280 KOps/s | |
test_lock_nested | 8.8157ms | 0.3514ms | 2.8455 KOps/s | 2.8012 KOps/s | |
test_lock_stack_nested | 0.3911ms | 0.3377ms | 2.9616 KOps/s | 2.8578 KOps/s | |
test_unlock_nested | 0.3503ms | 0.2818ms | 3.5485 KOps/s | 3.4481 KOps/s | |
test_unlock_stack_nested | 0.3184ms | 0.2753ms | 3.6321 KOps/s | 3.4994 KOps/s | |
test_flatten_speed | 0.1247ms | 75.5420μs | 13.2377 KOps/s | 13.1493 KOps/s | |
test_unflatten_speed | 0.4502ms | 0.3225ms | 3.1012 KOps/s | 3.0708 KOps/s | |
test_common_ops | 0.7388ms | 0.5941ms | 1.6832 KOps/s | 1.4615 KOps/s | |
test_creation | 0.1003ms | 1.7405μs | 574.5598 KOps/s | 573.9256 KOps/s | |
test_creation_empty | 32.3810μs | 6.9871μs | 143.1211 KOps/s | 92.6450 KOps/s | |
test_creation_nested_1 | 42.1000μs | 8.7205μs | 114.6720 KOps/s | 79.6560 KOps/s | |
test_creation_nested_2 | 33.5000μs | 11.5260μs | 86.7602 KOps/s | 65.6566 KOps/s | |
test_clone | 45.3510μs | 10.7415μs | 93.0971 KOps/s | 87.7606 KOps/s | |
test_getitem[int] | 1.1478ms | 10.8619μs | 92.0647 KOps/s | 89.6719 KOps/s | |
test_getitem[slice_int] | 0.1064ms | 21.0626μs | 47.4775 KOps/s | 45.6785 KOps/s | |
test_getitem[range] | 0.1245ms | 38.8388μs | 25.7474 KOps/s | 25.0841 KOps/s | |
test_getitem[tuple] | 0.1067ms | 18.6403μs | 53.6473 KOps/s | 52.5165 KOps/s | |
test_getitem[list] | 0.1315ms | 33.2375μs | 30.0865 KOps/s | 28.2871 KOps/s | |
test_setitem_dim[int] | 41.2010μs | 19.8500μs | 50.3779 KOps/s | 47.5882 KOps/s | |
test_setitem_dim[slice_int] | 60.5610μs | 39.3330μs | 25.4239 KOps/s | 25.0482 KOps/s | |
test_setitem_dim[range] | 88.6320μs | 54.8144μs | 18.2434 KOps/s | 18.0270 KOps/s | |
test_setitem_dim[tuple] | 61.3010μs | 33.2495μs | 30.0756 KOps/s | 29.2012 KOps/s | |
test_setitem | 49.0810μs | 14.7614μs | 67.7442 KOps/s | 58.2152 KOps/s | |
test_set | 55.5510μs | 14.2883μs | 69.9874 KOps/s | 58.8965 KOps/s | |
test_set_shared | 0.5076ms | 0.1621ms | 6.1709 KOps/s | 6.1777 KOps/s | |
test_update | 0.3693ms | 16.3153μs | 61.2921 KOps/s | 47.6036 KOps/s | |
test_update_nested | 49.5710μs | 21.4561μs | 46.6068 KOps/s | 37.5994 KOps/s | |
test_update__nested | 0.4961ms | 25.8945μs | 38.6183 KOps/s | 36.6641 KOps/s | |
test_set_nested | 64.3110μs | 15.3908μs | 64.9738 KOps/s | 55.0176 KOps/s | |
test_set_nested_new | 52.2500μs | 17.6452μs | 56.6728 KOps/s | 48.5175 KOps/s | |
test_select | 65.5910μs | 29.3658μs | 34.0532 KOps/s | 30.6089 KOps/s | |
test_select_nested | 88.5810μs | 44.5962μs | 22.4234 KOps/s | 22.6758 KOps/s | |
test_exclude_nested | 0.1064ms | 64.3675μs | 15.5358 KOps/s | 15.6699 KOps/s | |
test_empty[True] | 0.3448ms | 0.2949ms | 3.3905 KOps/s | 3.3825 KOps/s | |
test_empty[False] | 4.3211μs | 0.8316μs | 1.2024 MOps/s | 1.2063 MOps/s | |
test_to | 84.9410μs | 56.8937μs | 17.5766 KOps/s | 17.5322 KOps/s | |
test_to_nonblocking | 88.6210μs | 49.0766μs | 20.3763 KOps/s | 20.4592 KOps/s | |
test_unbind_speed | 0.2666ms | 0.2413ms | 4.1448 KOps/s | 4.0604 KOps/s | |
test_unbind_speed_stack0 | 0.2771ms | 0.2352ms | 4.2509 KOps/s | 4.0693 KOps/s | |
test_unbind_speed_stack1 | 92.3105ms | 0.7264ms | 1.3766 KOps/s | 1.3700 KOps/s | |
test_split | 93.4549ms | 1.5934ms | 627.6045 Ops/s | 611.5255 Ops/s | |
test_chunk | 95.8648ms | 1.6076ms | 622.0548 Ops/s | 606.7582 Ops/s | |
test_consolidate[False-None] | 3.4107ms | 2.7565ms | 362.7823 Ops/s | 363.8128 Ops/s | |
test_consolidate[default-None] | 1.8169ms | 1.7404ms | 574.5760 Ops/s | 579.7707 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8297ms | 1.7605ms | 568.0225 Ops/s | 560.1837 Ops/s | |
test_consolidate_njt[False-None] | 7.1986ms | 6.9121ms | 144.6743 Ops/s | 145.6552 Ops/s | |
test_to[False-False-None] | 1.8789ms | 1.7674ms | 565.7933 Ops/s | 559.1862 Ops/s | |
test_to[True-False-None] | 1.7290ms | 1.4065ms | 710.9699 Ops/s | 705.5864 Ops/s | |
test_to[within-False-None] | 4.3574ms | 4.1823ms | 239.1013 Ops/s | 229.9538 Ops/s | |
test_to[True-default-None] | 5.9665ms | 5.5067ms | 181.5984 Ops/s | 185.4693 Ops/s | |
test_to_njt[False-False-None] | 7.1762ms | 7.0643ms | 141.5559 Ops/s | 139.1086 Ops/s | |
test_to_njt[True-False-None] | 6.0786ms | 5.6110ms | 178.2222 Ops/s | 170.5579 Ops/s | |
test_to_njt[within-False-None] | 12.4929ms | 12.3914ms | 80.7011 Ops/s | 78.3343 Ops/s | |
test_creation[device0] | 0.2837ms | 85.5903μs | 11.6836 KOps/s | 12.1382 KOps/s | |
test_creation_from_tensor | 0.5462ms | 89.4112μs | 11.1843 KOps/s | 11.2308 KOps/s | |
test_add_one[memmap_tensor0] | 0.5776ms | 6.9077μs | 144.7661 KOps/s | 136.3970 KOps/s | |
test_contiguous[memmap_tensor0] | 2.4685μs | 0.4203μs | 2.3792 MOps/s | 2.3489 MOps/s | |
test_stack[memmap_tensor0] | 36.3310μs | 4.4449μs | 224.9782 KOps/s | 215.1177 KOps/s | |
test_memmaptd_index | 1.4640ms | 0.2446ms | 4.0883 KOps/s | 3.9479 KOps/s | |
test_memmaptd_index_astensor | 0.4545ms | 0.3081ms | 3.2458 KOps/s | 3.1869 KOps/s | |
test_memmaptd_index_op | 0.7049ms | 0.5610ms | 1.7826 KOps/s | 1.5404 KOps/s | |
test_serialize_model | 0.4291s | 0.1731s | 5.7778 Ops/s | 7.6324 Ops/s | |
test_serialize_model_pickle | 1.3843s | 1.2157s | 0.8226 Ops/s | 0.8248 Ops/s | |
test_serialize_weights | 0.1321s | 0.1304s | 7.6684 Ops/s | 7.6814 Ops/s | |
test_serialize_weights_returnearly | 0.3295s | 54.9902ms | 18.1851 Ops/s | 14.9987 Ops/s | |
test_serialize_weights_pickle | 2.9377s | 1.7565s | 0.5693 Ops/s | 0.8254 Ops/s | |
test_reshape_pytree | 52.5400μs | 22.4919μs | 44.4604 KOps/s | 44.0636 KOps/s | |
test_reshape_td | 69.6310μs | 27.8680μs | 35.8835 KOps/s | 35.4466 KOps/s | |
test_view_pytree | 61.4110μs | 22.8158μs | 43.8294 KOps/s | 44.8365 KOps/s | |
test_view_td | 70.4810μs | 32.9571μs | 30.3425 KOps/s | 28.7999 KOps/s | |
test_unbind_pytree | 54.3700μs | 28.3846μs | 35.2304 KOps/s | 34.7553 KOps/s | |
test_unbind_td | 0.8051ms | 37.2263μs | 26.8627 KOps/s | 26.5785 KOps/s | |
test_split_pytree | 58.2610μs | 30.8017μs | 32.4658 KOps/s | 31.8726 KOps/s | |
test_split_td | 0.6490ms | 39.1240μs | 25.5598 KOps/s | 25.0535 KOps/s | |
test_add_pytree | 76.9410μs | 35.3173μs | 28.3147 KOps/s | 27.0926 KOps/s | |
test_add_td | 87.8320μs | 47.5066μs | 21.0497 KOps/s | 18.4998 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1787ms | 0.1253ms | 7.9838 KOps/s | 7.5456 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2222ms | 0.1324ms | 7.5555 KOps/s | 7.4313 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1449ms | 97.5140μs | 10.2549 KOps/s | 10.0130 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.3615ms | 0.1485ms | 6.7329 KOps/s | 6.3580 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 65.3110μs | 25.0195μs | 39.9688 KOps/s | 42.6738 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 61.2510μs | 29.6712μs | 33.7027 KOps/s | 33.8460 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3915ms | 66.5363μs | 15.0294 KOps/s | 15.2793 KOps/s | |
test_compile_copy_nested[pytree-eager] | 78.4210μs | 49.2607μs | 20.3002 KOps/s | 20.6826 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1839ms | 0.1431ms | 6.9857 KOps/s | 7.0047 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3195ms | 0.2163ms | 4.6229 KOps/s | 4.6126 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1610ms | 98.9461μs | 10.1065 KOps/s | 10.0614 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1175ms | 55.8030μs | 17.9202 KOps/s | 16.5397 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2075ms | 0.1358ms | 7.3646 KOps/s | 7.3332 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5297ms | 0.4787ms | 2.0891 KOps/s | 1.9659 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3935ms | 0.2606ms | 3.8368 KOps/s | 3.8060 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1829ms | 0.1450ms | 6.8960 KOps/s | 6.8498 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1570ms | 70.6810μs | 14.1481 KOps/s | 13.6654 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1487ms | 0.1012ms | 9.8856 KOps/s | 9.6094 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4548ms | 0.4090ms | 2.4448 KOps/s | 2.3568 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1745ms | 0.1368ms | 7.3086 KOps/s | 7.2717 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 58.4810μs | 19.5170μs | 51.2375 KOps/s | 55.2720 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 69.5110μs | 30.6016μs | 32.6781 KOps/s | 32.1756 KOps/s | |
test_compile_copy_flat[pytree-compile] | 96.8810μs | 70.3945μs | 14.2056 KOps/s | 14.2803 KOps/s | |
test_compile_copy_flat[pytree-eager] | 87.9210μs | 51.3182μs | 19.4863 KOps/s | 19.3500 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6100ms | 0.3899ms | 2.5647 KOps/s | 2.1873 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8665ms | 2.6705ms | 374.4641 Ops/s | 364.4989 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5841ms | 0.4323ms | 2.3131 KOps/s | 2.2295 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7823ms | 2.6738ms | 374.0030 Ops/s | 359.1933 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1733ms | 0.1153ms | 8.6756 KOps/s | 8.6053 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5594ms | 79.8443μs | 12.5244 KOps/s | 11.7084 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2200ms | 0.1128ms | 8.8624 KOps/s | 8.5911 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1403ms | 72.4982μs | 13.7935 KOps/s | 14.0748 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1661ms | 0.1136ms | 8.8046 KOps/s | 8.8039 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1438ms | 72.3465μs | 13.8224 KOps/s | 14.2002 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1561ms | 0.1063ms | 9.4094 KOps/s | 9.7431 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1473ms | 19.5044μs | 51.2705 KOps/s | 53.2010 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1504ms | 0.1022ms | 9.7856 KOps/s | 10.1775 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 70.9010μs | 17.0202μs | 58.7536 KOps/s | 59.6066 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1779ms | 0.1026ms | 9.7508 KOps/s | 10.1511 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 93.6710μs | 17.0603μs | 58.6156 KOps/s | 59.9884 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1794ms | 0.1072ms | 9.3297 KOps/s | 9.7285 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5578ms | 19.0980μs | 52.3615 KOps/s | 53.8502 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1942ms | 99.3713μs | 10.0633 KOps/s | 10.1648 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 59.8210μs | 17.1065μs | 58.4575 KOps/s | 60.3485 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1543ms | 0.1027ms | 9.7403 KOps/s | 10.1657 KOps/s | |
test_compile_indexing[int-pytree-eager] | 90.3710μs | 16.6330μs | 60.1215 KOps/s | 60.0080 KOps/s | |
test_mod_add[eager] | 92.6510μs | 37.7048μs | 26.5218 KOps/s | 23.0083 KOps/s | |
test_mod_add[compile] | 0.1140ms | 82.0765μs | 12.1838 KOps/s | 12.0710 KOps/s | |
test_mod_add[compile-overhead] | 0.3270ms | 0.1680ms | 5.9529 KOps/s | 5.6156 KOps/s | |
test_mod_wrap[eager] | 0.3288ms | 0.2578ms | 3.8785 KOps/s | 3.7979 KOps/s | |
test_mod_wrap[compile] | 0.3590ms | 0.2877ms | 3.4755 KOps/s | 3.3833 KOps/s | |
test_mod_wrap[compile-overhead] | 6.4095ms | 3.5546ms | 281.3288 Ops/s | 272.9331 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4979ms | 1.3573ms | 736.7450 Ops/s | 661.1178 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3621ms | 1.2759ms | 783.7872 Ops/s | 711.4294 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3861ms | 0.9246ms | 1.0815 KOps/s | 955.2525 Ops/s | |
test_seq_add[eager] | 0.1605ms | 0.1149ms | 8.7036 KOps/s | 8.1337 KOps/s | |
test_seq_add[compile] | 0.1230ms | 89.0996μs | 11.2234 KOps/s | 11.0938 KOps/s | |
test_seq_add[compile-overhead] | 0.1820ms | 0.1287ms | 7.7673 KOps/s | 7.5578 KOps/s | |
test_seq_wrap[eager] | 0.4919ms | 0.4180ms | 2.3923 KOps/s | 2.2376 KOps/s | |
test_seq_wrap[compile] | 0.3453ms | 0.3041ms | 3.2887 KOps/s | 3.0459 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2735ms | 0.2262ms | 4.4216 KOps/s | 4.3617 KOps/s | |
test_func_call_runtime[False-eager] | 0.8537ms | 0.7445ms | 1.3432 KOps/s | 1.2971 KOps/s | |
test_func_call_runtime[False-compile] | 0.8185ms | 0.7536ms | 1.3270 KOps/s | 1.2962 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4110ms | 0.3669ms | 2.7256 KOps/s | 2.7034 KOps/s | |
test_func_call_runtime[True-eager] | 1.0553ms | 0.9103ms | 1.0985 KOps/s | 1.0683 KOps/s | |
test_func_call_runtime[True-compile] | 0.8752ms | 0.7917ms | 1.2631 KOps/s | 1.2742 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4365ms | 0.3854ms | 2.5945 KOps/s | 2.5570 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.1580ms | 0.7418ms | 1.3481 KOps/s | 1.3048 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.1744ms | 0.7562ms | 1.3224 KOps/s | 1.2963 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4353ms | 0.3705ms | 2.6991 KOps/s | 2.6738 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4199ms | 1.0179ms | 982.3745 Ops/s | 959.6950 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.4144ms | 1.0267ms | 973.9678 Ops/s | 919.1223 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1355ms | 0.9974ms | 1.0026 KOps/s | 976.0718 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5527ms | 2.1151ms | 472.7818 Ops/s | 464.7163 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9353ms | 0.8241ms | 1.2135 KOps/s | 1.1900 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.8349ms | 0.4172ms | 2.3972 KOps/s | 2.3501 KOps/s | |
test_distributed | 3.0278ms | 0.2085ms | 4.7958 KOps/s | 8.4933 KOps/s | |
test_tdmodule | 79.4310μs | 19.4327μs | 51.4597 KOps/s | 46.8217 KOps/s | |
test_tdmodule_dispatch | 0.3019ms | 34.9475μs | 28.6143 KOps/s | 25.8064 KOps/s | |
test_tdseq | 41.9310μs | 19.9860μs | 50.0349 KOps/s | 43.9210 KOps/s | |
test_tdseq_dispatch | 61.1110μs | 36.8402μs | 27.1442 KOps/s | 23.8873 KOps/s | |
test_instantiation_functorch | 1.6883ms | 1.5761ms | 634.4784 Ops/s | 623.3938 Ops/s | |
test_exec_functorch | 0.1806ms | 0.1460ms | 6.8476 KOps/s | 6.6980 KOps/s | |
test_exec_functional_call | 0.1840ms | 0.1400ms | 7.1426 KOps/s | 6.9520 KOps/s | |
test_exec_td_decorator | 0.3888ms | 0.1899ms | 5.2660 KOps/s | 5.2187 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8112ms | 0.6874ms | 1.4548 KOps/s | 1.4223 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8274ms | 0.6893ms | 1.4508 KOps/s | 1.3903 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7236ms | 0.6025ms | 1.6598 KOps/s | 1.5701 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7337ms | 0.6044ms | 1.6546 KOps/s | 1.6046 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.9721ms | 19.3445ms | 51.6944 Ops/s | 51.1123 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.4172ms | 19.3114ms | 51.7828 Ops/s | 51.0983 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.9622ms | 19.6400ms | 50.9164 Ops/s | 51.1447 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.9242ms | 19.6367ms | 50.9250 Ops/s | 51.6029 Ops/s | |
test_to_module_speed[True] | 1.4901ms | 0.9831ms | 1.0172 KOps/s | 1.0390 KOps/s | |
test_to_module_speed[False] | 1.0384ms | 0.9650ms | 1.0362 KOps/s | 1.0561 KOps/s | |
test_tc_init | 72.6510μs | 35.2020μs | 28.4075 KOps/s | 24.8831 KOps/s | |
test_tc_init_nested | 0.1033ms | 68.8254μs | 14.5295 KOps/s | 12.4574 KOps/s | |
test_tc_first_layer_tensor | 30.5610μs | 0.8179μs | 1.2227 MOps/s | 1.3976 MOps/s | |
test_tc_first_layer_nontensor | 27.9510μs | 2.2619μs | 442.1042 KOps/s | 445.0371 KOps/s | |
test_tc_second_layer_tensor | 8.5953μs | 1.4425μs | 693.2215 KOps/s | 694.6681 KOps/s | |
test_tc_second_layer_nontensor | 33.5510μs | 2.9688μs | 336.8318 KOps/s | 329.7965 KOps/s | |
test_unbind | 0.2215s | 12.2836ms | 81.4091 Ops/s | 140.7518 Ops/s | |
test_full_like | 9.5256ms | 9.2295ms | 108.3482 Ops/s | 108.1443 Ops/s | |
test_zeros_like | 9.1963ms | 7.2763ms | 137.4317 Ops/s | 231.2600 Ops/s | |
test_ones_like | 4.9693ms | 4.3335ms | 230.7620 Ops/s | 230.4945 Ops/s | |
test_clone | 11.6548ms | 9.2550ms | 108.0494 Ops/s | 153.7866 Ops/s | |
test_squeeze | 67.1310μs | 9.9210μs | 100.7967 KOps/s | 89.8484 KOps/s | |
test_unsqueeze | 0.1237ms | 74.3121μs | 13.4568 KOps/s | 12.8219 KOps/s | |
test_split | 0.3825ms | 0.1628ms | 6.1424 KOps/s | 6.0134 KOps/s | |
test_permute | 0.2290ms | 0.1780ms | 5.6188 KOps/s | 5.2783 KOps/s | |
test_stack | 51.3315ms | 50.7410ms | 19.7079 Ops/s | 19.5761 Ops/s | |
test_cat | 51.3484ms | 50.7894ms | 19.6892 Ops/s | 19.7650 Ops/s |
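
The Change column in the table above is empty, so regressions and speedups have to be read off by comparing the two throughput columns by hand. The following is a minimal sketch of that comparison, not part of the PR: it assumes that "Ops" is the measurement for this pull request and "Ops on Repo HEAD" is the baseline, and the helper names `parse_ops` and `relative_change` are hypothetical. It handles the three throughput units that appear in the table (Ops/s, KOps/s, MOps/s), which matters for rows where the two columns use different units.

```python
# Multipliers for the throughput units that appear in the benchmark table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '87.5314 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, fractional change of Ops relative to Ops on Repo HEAD).

    Assumption: column 4 ("Ops") is the PR run and column 5 is the baseline.
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops_pr, ops_head = cells[0], cells[3], cells[4]
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return name, (pr - head) / head


# Three rows copied verbatim from the table above.
rows = [
    "test_plain_set_nested | 34.2710μs | 11.4245μs | 87.5314 KOps/s | 74.0584 KOps/s | |",
    "test_membership_stacked_nested_last | 54.7310μs | 8.1967μs | 122.0005 KOps/s | 243.0257 KOps/s | |",
    "test_mod_wrap_and_backward[compile-overhead] | 1.3861ms | 0.9246ms | 1.0815 KOps/s | 955.2525 Ops/s | |",
]

for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.1%}")
```

A positive value means the PR run was faster than the baseline under the assumption stated above. Single-run microbenchmarks like these can be noisy, so small differences are probably not meaningful on their own.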
Labels: CLA Signed, enhancement (New feature or request)
Stack from ghstack (oldest at bottom):