[BugFix] fix inline TDParams kwargs for nontensordata #1094
Merged
vmoens added a commit that referenced this pull request on Nov 20, 2024
ghstack-source-id: afd50385b6b1e8bd8ccfaabfa387ca5611ca07e2
Pull Request resolved: #1094
facebook-github-bot added the CLA Signed label on Nov 20, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 37.0090μs | 17.7520μs | 56.3317 KOps/s | 56.6728 KOps/s | |
test_plain_set_stack_nested | 38.1610μs | 17.9027μs | 55.8576 KOps/s | 56.0605 KOps/s | |
test_plain_set_nested_inplace | 53.9710μs | 19.6773μs | 50.8199 KOps/s | 51.4315 KOps/s | |
test_plain_set_stack_nested_inplace | 46.7370μs | 19.5239μs | 51.2194 KOps/s | 51.2376 KOps/s | |
test_items | 27.7320μs | 4.1555μs | 240.6442 KOps/s | 242.0542 KOps/s | |
test_items_nested | 0.6553ms | 0.3462ms | 2.8887 KOps/s | 2.9366 KOps/s | |
test_items_nested_locked | 0.7224ms | 0.3424ms | 2.9207 KOps/s | 2.9353 KOps/s | |
test_items_nested_leaf | 0.1336ms | 72.9782μs | 13.7027 KOps/s | 13.9955 KOps/s | |
test_items_stack_nested | 0.6515ms | 0.3451ms | 2.8975 KOps/s | 2.8984 KOps/s | |
test_items_stack_nested_leaf | 0.1348ms | 75.1140μs | 13.3131 KOps/s | 13.4566 KOps/s | |
test_items_stack_nested_locked | 0.7787ms | 0.3445ms | 2.9028 KOps/s | 2.9139 KOps/s | |
test_keys | 25.8590μs | 3.4889μs | 286.6248 KOps/s | 286.4449 KOps/s | |
test_keys_nested | 0.2405ms | 0.1383ms | 7.2301 KOps/s | 7.2799 KOps/s | |
test_keys_nested_locked | 1.7114ms | 0.1451ms | 6.8909 KOps/s | 7.0037 KOps/s | |
test_keys_nested_leaf | 0.2110ms | 0.1196ms | 8.3581 KOps/s | 8.5132 KOps/s | |
test_keys_stack_nested | 0.2597ms | 0.1379ms | 7.2491 KOps/s | 7.2232 KOps/s | |
test_keys_stack_nested_leaf | 0.1871ms | 0.1185ms | 8.4413 KOps/s | 8.5027 KOps/s | |
test_keys_stack_nested_locked | 0.2597ms | 0.1438ms | 6.9523 KOps/s | 7.0041 KOps/s | |
test_values | 6.3798μs | 1.0263μs | 974.3459 KOps/s | 937.3199 KOps/s | |
test_values_nested | 0.1111ms | 55.4638μs | 18.0298 KOps/s | 17.8859 KOps/s | |
test_values_nested_locked | 0.1047ms | 55.2826μs | 18.0889 KOps/s | 17.8541 KOps/s | |
test_values_nested_leaf | 0.1169ms | 60.7958μs | 16.4485 KOps/s | 16.4375 KOps/s | |
test_values_stack_nested | 0.1032ms | 56.4163μs | 17.7254 KOps/s | 17.3877 KOps/s | |
test_values_stack_nested_leaf | 0.1052ms | 60.1247μs | 16.6321 KOps/s | 15.4188 KOps/s | |
test_values_stack_nested_locked | 0.1032ms | 56.8328μs | 17.5955 KOps/s | 17.4397 KOps/s | |
test_membership | 4.1249μs | 0.7183μs | 1.3921 MOps/s | 1.1267 MOps/s | |
test_membership_nested | 34.3040μs | 2.7870μs | 358.8136 KOps/s | 371.3407 KOps/s | |
test_membership_nested_leaf | 30.5280μs | 2.7878μs | 358.7104 KOps/s | 370.0529 KOps/s | |
test_membership_stacked_nested | 24.5560μs | 2.7432μs | 364.5338 KOps/s | 364.6517 KOps/s | |
test_membership_stacked_nested_leaf | 24.5460μs | 2.7414μs | 364.7750 KOps/s | 363.4391 KOps/s | |
test_membership_nested_last | 32.3000μs | 4.1245μs | 242.4516 KOps/s | 243.6994 KOps/s | |
test_membership_nested_leaf_last | 27.0400μs | 4.1569μs | 240.5625 KOps/s | 242.2147 KOps/s | |
test_membership_stacked_nested_last | 39.4130μs | 12.8748μs | 77.6712 KOps/s | 245.5783 KOps/s | |
test_membership_stacked_nested_leaf_last | 34.9860μs | 13.0322μs | 76.7331 KOps/s | 243.7735 KOps/s | |
test_nested_getleaf | 34.0240μs | 10.6295μs | 94.0779 KOps/s | 93.8108 KOps/s | |
test_nested_get | 36.8090μs | 10.0393μs | 99.6089 KOps/s | 99.4074 KOps/s | |
test_stacked_getleaf | 35.4960μs | 10.3345μs | 96.7631 KOps/s | 95.3637 KOps/s | |
test_stacked_get | 33.4420μs | 9.9298μs | 100.7069 KOps/s | 101.0526 KOps/s | |
test_nested_getitemleaf | 34.8650μs | 11.1168μs | 89.9543 KOps/s | 90.9887 KOps/s | |
test_nested_getitem | 35.5160μs | 10.3914μs | 96.2334 KOps/s | 97.6253 KOps/s | |
test_stacked_getitemleaf | 48.3310μs | 10.9273μs | 91.5141 KOps/s | 92.5101 KOps/s | |
test_stacked_getitem | 44.2520μs | 10.1602μs | 98.4231 KOps/s | 97.6442 KOps/s | |
test_lock_nested | 2.9651ms | 0.4515ms | 2.2147 KOps/s | 1.8672 KOps/s | |
test_lock_stack_nested | 0.6109ms | 0.4035ms | 2.4781 KOps/s | 2.4462 KOps/s | |
test_unlock_nested | 0.6626ms | 0.3614ms | 2.7670 KOps/s | 2.7622 KOps/s | |
test_unlock_stack_nested | 0.4618ms | 0.3213ms | 3.1120 KOps/s | 3.0575 KOps/s | |
test_flatten_speed | 0.1635ms | 94.4022μs | 10.5930 KOps/s | 10.8739 KOps/s | |
test_unflatten_speed | 0.8478ms | 0.4754ms | 2.1036 KOps/s | 2.1476 KOps/s | |
test_common_ops | 1.4862ms | 0.7544ms | 1.3256 KOps/s | 1.3152 KOps/s | |
test_creation | 0.3065ms | 2.2547μs | 443.5157 KOps/s | 487.0012 KOps/s | |
test_creation_empty | 37.4900μs | 10.5392μs | 94.8836 KOps/s | 94.2695 KOps/s | |
test_creation_nested_1 | 52.1880μs | 13.2683μs | 75.3674 KOps/s | 73.4275 KOps/s | |
test_creation_nested_2 | 56.6150μs | 17.4336μs | 57.3605 KOps/s | 57.6704 KOps/s | |
test_clone | 61.2050μs | 13.5815μs | 73.6298 KOps/s | 74.6728 KOps/s | |
test_getitem[int] | 1.4190ms | 13.0664μs | 76.5319 KOps/s | 79.5977 KOps/s | |
test_getitem[slice_int] | 0.1464ms | 25.9475μs | 38.5394 KOps/s | 42.3380 KOps/s | |
test_getitem[range] | 0.1675ms | 48.4966μs | 20.6200 KOps/s | 20.9092 KOps/s | |
test_getitem[tuple] | 0.1323ms | 20.4365μs | 48.9320 KOps/s | 51.1198 KOps/s | |
test_getitem[list] | 0.1772ms | 43.4034μs | 23.0397 KOps/s | 23.5433 KOps/s | |
test_setitem_dim[int] | 46.9580μs | 26.3973μs | 37.8826 KOps/s | 40.9015 KOps/s | |
test_setitem_dim[slice_int] | 91.4800μs | 52.2595μs | 19.1353 KOps/s | 20.1846 KOps/s | |
test_setitem_dim[range] | 0.1238ms | 74.3541μs | 13.4492 KOps/s | 13.6913 KOps/s | |
test_setitem_dim[tuple] | 61.8760μs | 41.7388μs | 23.9585 KOps/s | 25.4084 KOps/s | |
test_setitem | 64.1910μs | 20.8273μs | 48.0140 KOps/s | 50.2506 KOps/s | |
test_set | 79.0780μs | 20.3530μs | 49.1329 KOps/s | 51.5550 KOps/s | |
test_set_shared | 1.1704ms | 0.1658ms | 6.0300 KOps/s | 5.9534 KOps/s | |
test_update | 0.1717ms | 22.7938μs | 43.8715 KOps/s | 46.0754 KOps/s | |
test_update_nested | 0.1341ms | 33.0968μs | 30.2144 KOps/s | 31.9383 KOps/s | |
test_update__nested | 0.5145ms | 34.0952μs | 29.3296 KOps/s | 31.5758 KOps/s | |
test_set_nested | 78.3270μs | 22.1161μs | 45.2160 KOps/s | 47.1209 KOps/s | |
test_set_nested_new | 77.7450μs | 26.5875μs | 37.6117 KOps/s | 38.6688 KOps/s | |
test_select | 0.1035ms | 43.0398μs | 23.2343 KOps/s | 24.3109 KOps/s | |
test_select_nested | 0.1306ms | 60.2383μs | 16.6007 KOps/s | 16.9008 KOps/s | |
test_exclude_nested | 0.1481ms | 75.9512μs | 13.1663 KOps/s | 13.4875 KOps/s | |
test_empty[True] | 0.6671ms | 0.3580ms | 2.7934 KOps/s | 2.8680 KOps/s | |
test_empty[False] | 6.8678μs | 1.2547μs | 796.9910 KOps/s | 751.9772 KOps/s | |
test_unbind_speed | 0.4536ms | 0.2657ms | 3.7638 KOps/s | 3.8766 KOps/s | |
test_unbind_speed_stack0 | 0.4743ms | 0.2534ms | 3.9464 KOps/s | 3.8842 KOps/s | |
test_unbind_speed_stack1 | 0.1018s | 0.7415ms | 1.3486 KOps/s | 1.4292 KOps/s | |
test_split | 97.2564ms | 1.7557ms | 569.5803 Ops/s | 585.0566 Ops/s | |
test_chunk | 97.6962ms | 1.7811ms | 561.4586 Ops/s | 584.3735 Ops/s | |
test_consolidate_njt[False-None] | 8.8027ms | 8.3389ms | 119.9203 Ops/s | 122.8384 Ops/s | |
test_creation[device0] | 0.2325ms | 90.9110μs | 10.9998 KOps/s | 11.0203 KOps/s | |
test_creation_from_tensor | 4.1418ms | 93.8475μs | 10.6556 KOps/s | 10.4600 KOps/s | |
test_add_one[memmap_tensor0] | 0.1794ms | 4.8428μs | 206.4930 KOps/s | 201.7856 KOps/s | |
test_contiguous[memmap_tensor0] | 16.6110μs | 0.5236μs | 1.9100 MOps/s | 1.8413 MOps/s | |
test_stack[memmap_tensor0] | 34.7950μs | 3.3854μs | 295.3837 KOps/s | 295.3992 KOps/s | |
test_memmaptd_index | 1.1420ms | 0.2406ms | 4.1561 KOps/s | 4.2184 KOps/s | |
test_memmaptd_index_astensor | 0.5697ms | 0.3106ms | 3.2191 KOps/s | 3.1817 KOps/s | |
test_memmaptd_index_op | 1.1017ms | 0.5860ms | 1.7065 KOps/s | 1.7120 KOps/s | |
test_serialize_model | 0.1263s | 0.1133s | 8.8286 Ops/s | 7.6138 Ops/s | |
test_serialize_model_pickle | 0.4485s | 0.3850s | 2.5972 Ops/s | 2.6046 Ops/s | |
test_serialize_weights | 0.2206s | 0.1260s | 7.9381 Ops/s | 8.9509 Ops/s | |
test_serialize_weights_returnearly | 0.1708s | 0.1558s | 6.4188 Ops/s | 6.4509 Ops/s | |
test_serialize_weights_pickle | 0.4639s | 0.3932s | 2.5431 Ops/s | 2.3790 Ops/s | |
test_serialize_weights_filesystem | 0.1457s | 0.1390s | 7.1953 Ops/s | 7.0681 Ops/s | |
test_serialize_model_filesystem | 0.1590s | 0.1485s | 6.7344 Ops/s | 6.6834 Ops/s | |
test_reshape_pytree | 59.5620μs | 26.8048μs | 37.3067 KOps/s | 38.2093 KOps/s | |
test_reshape_td | 71.8240μs | 33.1685μs | 30.1491 KOps/s | 31.7874 KOps/s | |
test_view_pytree | 77.8750μs | 27.2072μs | 36.7550 KOps/s | 38.0645 KOps/s | |
test_view_td | 93.2040μs | 39.1165μs | 25.5646 KOps/s | 26.8961 KOps/s | |
test_unbind_pytree | 71.7640μs | 30.0847μs | 33.2395 KOps/s | 34.1562 KOps/s | |
test_unbind_td | 0.3371ms | 38.8122μs | 25.7651 KOps/s | 26.6111 KOps/s | |
test_split_pytree | 64.5300μs | 30.0912μs | 33.2324 KOps/s | 34.0020 KOps/s | |
test_split_td | 98.8497ms | 53.8630μs | 18.5656 KOps/s | 23.0271 KOps/s | |
test_add_pytree | 0.1096ms | 35.6266μs | 28.0689 KOps/s | 28.2344 KOps/s | |
test_add_td | 0.1158ms | 57.1013μs | 17.5127 KOps/s | 18.2684 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1238ms | 60.6255μs | 16.4947 KOps/s | 16.0904 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3498ms | 0.1616ms | 6.1900 KOps/s | 6.2146 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1101ms | 44.9390μs | 22.2524 KOps/s | 21.7970 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2552ms | 0.1206ms | 8.2910 KOps/s | 8.4459 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 77.3650μs | 25.7046μs | 38.9036 KOps/s | 38.2603 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1390ms | 54.1371μs | 18.4716 KOps/s | 18.7861 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1625ms | 78.5990μs | 12.7228 KOps/s | 12.9707 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1866ms | 67.7762μs | 14.7544 KOps/s | 14.8381 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.6675ms | 0.1171ms | 8.5419 KOps/s | 9.5116 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3156ms | 0.2002ms | 4.9952 KOps/s | 5.0237 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1139ms | 44.9107μs | 22.2664 KOps/s | 22.2148 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4625ms | 63.0909μs | 15.8501 KOps/s | 16.0873 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1747ms | 0.1029ms | 9.7218 KOps/s | 9.8336 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3695ms | 0.2013ms | 4.9668 KOps/s | 4.9718 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4119ms | 0.2111ms | 4.7371 KOps/s | 4.7699 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1742ms | 0.1043ms | 9.5916 KOps/s | 9.5945 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1253ms | 56.1689μs | 17.8034 KOps/s | 18.4749 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1006ms | 47.0802μs | 21.2403 KOps/s | 21.7492 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 5.2487ms | 0.1615ms | 6.1934 KOps/s | 6.2746 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1560ms | 0.1033ms | 9.6841 KOps/s | 9.7020 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 61.5550μs | 21.1617μs | 47.2553 KOps/s | 47.2371 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1169ms | 59.0804μs | 16.9261 KOps/s | 17.2456 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1868ms | 81.4539μs | 12.2769 KOps/s | 12.4224 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1598ms | 68.8312μs | 14.5283 KOps/s | 14.7390 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4090ms | 0.2084ms | 4.7979 KOps/s | 4.8527 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.5040ms | 1.2917ms | 774.1574 Ops/s | 790.0747 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2676ms | 0.2025ms | 4.9383 KOps/s | 5.0028 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.2210ms | 0.7779ms | 1.2854 KOps/s | 1.2783 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.7915ms | 0.4591ms | 2.1780 KOps/s | 2.2224 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.1473ms | 2.6895ms | 371.8224 Ops/s | 398.0245 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 81.2720μs | 36.9325μs | 27.0764 KOps/s | 27.3716 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5251ms | 32.3253μs | 30.9355 KOps/s | 30.2889 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 66.6340μs | 29.0325μs | 34.4441 KOps/s | 33.5857 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 67.7160μs | 23.4552μs | 42.6345 KOps/s | 42.0748 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 85.3900μs | 30.3061μs | 32.9966 KOps/s | 32.7479 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 61.9360μs | 23.2403μs | 43.0286 KOps/s | 43.2489 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1147ms | 51.3574μs | 19.4714 KOps/s | 19.6150 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5542ms | 19.6532μs | 50.8823 KOps/s | 51.1257 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1093ms | 43.9207μs | 22.7683 KOps/s | 22.9335 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 47.1880μs | 19.2276μs | 52.0086 KOps/s | 53.3233 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1061ms | 45.1567μs | 22.1451 KOps/s | 22.4159 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 52.7980μs | 19.1991μs | 52.0859 KOps/s | 53.3309 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1105ms | 53.4919μs | 18.6944 KOps/s | 19.0999 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9614ms | 19.8197μs | 50.4548 KOps/s | 51.7014 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 96.0800μs | 45.0097μs | 22.2174 KOps/s | 22.4351 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 58.4700μs | 19.2800μs | 51.8671 KOps/s | 53.7606 KOps/s | |
test_compile_indexing[int-pytree-compile] | 94.5570μs | 45.3578μs | 22.0469 KOps/s | 22.6201 KOps/s | |
test_compile_indexing[int-pytree-eager] | 58.3490μs | 19.0509μs | 52.4910 KOps/s | 53.8578 KOps/s | |
test_mod_add[eager] | 75.5920μs | 27.3978μs | 36.4992 KOps/s | 38.3044 KOps/s | |
test_mod_add[compile] | 90.5990μs | 44.9624μs | 22.2408 KOps/s | 22.3350 KOps/s | |
test_mod_add[compile-overhead] | 87.6640μs | 45.0833μs | 22.1811 KOps/s | 22.4350 KOps/s | |
test_mod_wrap[eager] | 0.6719ms | 0.2157ms | 4.6369 KOps/s | 4.6332 KOps/s | |
test_mod_wrap[compile] | 1.2886ms | 0.2037ms | 4.9103 KOps/s | 4.8632 KOps/s | |
test_mod_wrap[compile-overhead] | 1.6626ms | 0.2030ms | 4.9269 KOps/s | 4.8199 KOps/s | |
test_mod_wrap_and_backward[eager] | 11.8140ms | 10.8343ms | 92.2996 Ops/s | 91.7444 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.4797ms | 10.7247ms | 93.2427 Ops/s | 94.3835 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 11.9697ms | 10.6441ms | 93.9487 Ops/s | 95.0709 Ops/s | |
test_seq_add[eager] | 0.2001ms | 94.0605μs | 10.6315 KOps/s | 10.9676 KOps/s | |
test_seq_add[compile] | 0.1199ms | 60.2075μs | 16.6092 KOps/s | 16.7392 KOps/s | |
test_seq_add[compile-overhead] | 0.1115ms | 58.2277μs | 17.1740 KOps/s | 16.9321 KOps/s | |
test_seq_wrap[eager] | 0.5259ms | 0.3873ms | 2.5820 KOps/s | 2.5365 KOps/s | |
test_seq_wrap[compile] | 0.3517ms | 0.2282ms | 4.3819 KOps/s | 4.3120 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4314ms | 0.2242ms | 4.4598 KOps/s | 4.3856 KOps/s | |
test_func_call_runtime[False-eager] | 0.9538ms | 0.5491ms | 1.8213 KOps/s | 1.7847 KOps/s | |
test_func_call_runtime[False-compile] | 0.5682ms | 0.4245ms | 2.3556 KOps/s | 2.3282 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.7560ms | 0.4268ms | 2.3432 KOps/s | 2.3184 KOps/s | |
test_func_call_runtime[True-eager] | 0.8700ms | 0.7499ms | 1.3335 KOps/s | 1.2705 KOps/s | |
test_func_call_runtime[True-compile] | 0.9526ms | 0.4657ms | 2.1473 KOps/s | 2.1399 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8449ms | 0.4674ms | 2.1393 KOps/s | 2.1493 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8949ms | 0.5436ms | 1.8397 KOps/s | 1.7506 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8564ms | 0.4240ms | 2.3586 KOps/s | 2.3458 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5842ms | 0.4241ms | 2.3580 KOps/s | 2.3408 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1145ms | 0.8948ms | 1.1176 KOps/s | 1.1175 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6285ms | 0.4949ms | 2.0205 KOps/s | 2.0253 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8938ms | 0.4957ms | 2.0174 KOps/s | 2.0300 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.3532ms | 1.8544ms | 539.2523 Ops/s | 530.3236 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.7101ms | 0.5214ms | 1.9179 KOps/s | 1.9052 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7017ms | 0.5204ms | 1.9217 KOps/s | 1.9209 KOps/s | |
test_distributed | 0.9613ms | 0.1326ms | 7.5422 KOps/s | 7.8528 KOps/s | |
test_tdmodule | 77.7450μs | 18.2473μs | 54.8026 KOps/s | 54.9165 KOps/s | |
test_tdmodule_dispatch | 51.7170μs | 36.0248μs | 27.7586 KOps/s | 28.1750 KOps/s | |
test_tdseq | 46.3170μs | 21.0472μs | 47.5123 KOps/s | 48.0856 KOps/s | |
test_tdseq_dispatch | 82.9550μs | 44.5254μs | 22.4591 KOps/s | 24.4141 KOps/s | |
test_instantiation_functorch | 2.2950ms | 1.5506ms | 644.9249 Ops/s | 662.5899 Ops/s | |
test_exec_functorch | 0.2864ms | 0.1822ms | 5.4883 KOps/s | 5.5071 KOps/s | |
test_exec_functional_call | 0.3257ms | 0.1717ms | 5.8246 KOps/s | 5.5751 KOps/s | |
test_exec_td_decorator | 0.5915ms | 0.2296ms | 4.3546 KOps/s | 4.3362 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.6151ms | 0.6292ms | 1.5893 KOps/s | 1.5013 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9672ms | 0.6295ms | 1.5885 KOps/s | 1.5620 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9593ms | 0.5154ms | 1.9401 KOps/s | 1.8995 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8241ms | 0.5134ms | 1.9478 KOps/s | 1.9004 KOps/s | |
test_to_module_speed[True] | 1.5041ms | 1.2925ms | 773.7018 Ops/s | 776.4188 Ops/s | |
test_to_module_speed[False] | 2.0059ms | 1.2697ms | 787.5990 Ops/s | 798.3712 Ops/s | |
test_tc_init | 79.4280μs | 43.7530μs | 22.8556 KOps/s | 22.9349 KOps/s | |
test_tc_init_nested | 0.1518ms | 88.8465μs | 11.2554 KOps/s | 11.3727 KOps/s | |
test_tc_first_layer_tensor | 24.5160μs | 1.5596μs | 641.1780 KOps/s | 665.0485 KOps/s | |
test_tc_first_layer_nontensor | 50.4140μs | 4.8054μs | 208.0988 KOps/s | 214.1273 KOps/s | |
test_tc_second_layer_tensor | 47.0580μs | 2.8449μs | 351.5053 KOps/s | 359.0215 KOps/s | |
test_tc_second_layer_nontensor | 61.4250μs | 6.1283μs | 163.1780 KOps/s | 166.1986 KOps/s | |
test_unbind | 0.2249s | 12.4908ms | 80.0591 Ops/s | 87.3822 Ops/s | |
test_full_like | 7.9489ms | 7.1517ms | 139.8260 Ops/s | 88.4933 Ops/s | |
test_zeros_like | 12.9053ms | 6.3844ms | 156.6308 Ops/s | 125.2382 Ops/s | |
test_ones_like | 13.0506ms | 7.5905ms | 131.7440 Ops/s | 125.8271 Ops/s | |
test_clone | 13.8977ms | 9.2874ms | 107.6732 Ops/s | 103.5734 Ops/s | |
test_squeeze | 58.5190μs | 12.3699μs | 80.8416 KOps/s | 85.1683 KOps/s | |
test_unsqueeze | 0.1625ms | 89.4930μs | 11.1741 KOps/s | 11.6061 KOps/s | |
test_split | 0.5166ms | 0.1929ms | 5.1842 KOps/s | 5.4756 KOps/s | |
test_permute | 0.4109ms | 0.2179ms | 4.5895 KOps/s | 4.6264 KOps/s | |
test_stack | 30.4619ms | 25.2680ms | 39.5757 Ops/s | 39.2565 Ops/s | |
test_cat | 33.3582ms | 25.5577ms | 39.1271 Ops/s | 40.6098 Ops/s |
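
The Change column is empty in the table as captured here, so regressions have to be read off by comparing the Ops and Ops on Repo HEAD columns by hand. The sketch below shows one way to do that in Python. The helper names (`parse_time`, `parse_ops`, `parse_row`, `relative_change`) are hypothetical, and the definition of change as `(ops - ops_head) / ops_head` is an assumption, not necessarily the formula the benchmark bot uses. It also relies on Ops being the reciprocal of Mean, which holds for the rows above (for example, 1 / 17.7520μs is about 56.33 KOps/s).

```python
import re

# Multipliers that convert the table's throughput units to operations per second.
OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}
# Multipliers that convert the table's timing units to seconds.
TIME_UNITS = {"s": 1.0, "ms": 1e-3, "μs": 1e-6}


def parse_time(text):
    """Convert a string such as '12.8748μs' or '0.1018s' to seconds."""
    match = re.fullmatch(r"([0-9.]+)(s|ms|μs)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    return float(match.group(1)) * TIME_UNITS[match.group(2)]


def parse_ops(text):
    """Convert a string such as '77.6712 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * OPS_UNITS[unit]


def relative_change(ops, ops_head):
    """Assumed definition: fractional change of this PR relative to HEAD."""
    return (ops - ops_head) / ops_head


def parse_row(line):
    """Parse one pipe-separated benchmark row into a dict of numbers."""
    cells = [cell.strip() for cell in line.split("|")]
    name, max_time, mean_time, ops, ops_head = cells[:5]
    ops_value = parse_ops(ops)
    head_value = parse_ops(ops_head)
    return {
        "name": name,
        "max_s": parse_time(max_time),
        "mean_s": parse_time(mean_time),
        "ops": ops_value,
        "ops_head": head_value,
        "change": relative_change(ops_value, head_value),
    }


# Two rows copied verbatim from the table above.
rows = [
    "test_membership_stacked_nested_last | 39.4130μs | 12.8748μs | 77.6712 KOps/s | 245.5783 KOps/s | |",
    "test_full_like | 7.9489ms | 7.1517ms | 139.8260 Ops/s | 88.4933 Ops/s | |",
]

for parsed in map(parse_row, rows):
    print(f"{parsed['name']}: {parsed['change']:+.1%}")
```

Under this definition a negative value means the PR branch is slower than HEAD. Single-run microbenchmarks on shared CI hardware are noisy, so a large swing on one row is a prompt to rerun the benchmark, not proof of a regression.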
Stack from ghstack (oldest at bottom):