[Refactor] Improve functional call efficiency #567
Merged
facebook-github-bot added the CLA Signed label on Nov 22, 2023.
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 49.3920μs | 16.3499μs | 61.1625 KOps/s | 60.8033 KOps/s | |
test_plain_set_stack_nested | 0.2911ms | 0.1485ms | 6.7335 KOps/s | 6.7729 KOps/s | |
test_plain_set_nested_inplace | 69.2480μs | 19.3836μs | 51.5900 KOps/s | 51.4667 KOps/s | |
test_plain_set_stack_nested_inplace | 0.4063ms | 0.1757ms | 5.6900 KOps/s | 5.7172 KOps/s | |
test_items | 0.1627ms | 2.6808μs | 373.0212 KOps/s | 405.0031 KOps/s | |
test_items_nested | 1.3161ms | 0.2731ms | 3.6617 KOps/s | 3.6793 KOps/s | |
test_items_nested_locked | 0.3928ms | 0.2697ms | 3.7077 KOps/s | 3.7163 KOps/s | |
test_items_nested_leaf | 0.8663ms | 0.1736ms | 5.7593 KOps/s | 6.0094 KOps/s | |
test_items_stack_nested | 2.6182ms | 1.5238ms | 656.2486 Ops/s | 684.7777 Ops/s | |
test_items_stack_nested_leaf | 2.5099ms | 1.4151ms | 706.6753 Ops/s | 754.9729 Ops/s | |
test_items_stack_nested_locked | 1.1996ms | 0.7626ms | 1.3113 KOps/s | 1.2988 KOps/s | |
test_keys | 41.6180μs | 4.8424μs | 206.5080 KOps/s | 255.3977 KOps/s | |
test_keys_nested | 1.5238ms | 0.1509ms | 6.6252 KOps/s | 6.7517 KOps/s | |
test_keys_nested_locked | 0.2670ms | 0.1411ms | 7.0877 KOps/s | 7.0900 KOps/s | |
test_keys_nested_leaf | 0.3867ms | 0.1407ms | 7.1048 KOps/s | 7.2393 KOps/s | |
test_keys_stack_nested | 1.8532ms | 1.4088ms | 709.8094 Ops/s | 727.3771 Ops/s | |
test_keys_stack_nested_leaf | 1.5072ms | 1.4035ms | 712.4796 Ops/s | 723.6928 Ops/s | |
test_keys_stack_nested_locked | 1.3501ms | 0.6610ms | 1.5128 KOps/s | 1.4748 KOps/s | |
test_values | 21.9283μs | 1.1628μs | 859.9575 KOps/s | 859.4670 KOps/s | |
test_values_nested | 95.8180μs | 50.2259μs | 19.9100 KOps/s | 20.2167 KOps/s | |
test_values_nested_locked | 97.2900μs | 49.9500μs | 20.0200 KOps/s | 20.0410 KOps/s | |
test_values_nested_leaf | 75.4600μs | 44.4162μs | 22.5143 KOps/s | 22.3275 KOps/s | |
test_values_stack_nested | 1.3409ms | 1.2076ms | 828.0931 Ops/s | 851.6675 Ops/s | |
test_values_stack_nested_leaf | 1.4246ms | 1.1900ms | 840.3581 Ops/s | 859.6559 Ops/s | |
test_values_stack_nested_locked | 0.6945ms | 0.5060ms | 1.9765 KOps/s | 1.9328 KOps/s | |
test_membership | 22.2610μs | 1.3507μs | 740.3711 KOps/s | 726.3895 KOps/s | |
test_membership_nested | 42.8900μs | 2.8028μs | 356.7881 KOps/s | 355.1947 KOps/s | |
test_membership_nested_leaf | 20.4080μs | 2.7785μs | 359.9057 KOps/s | 353.0000 KOps/s | |
test_membership_stacked_nested | 56.5750μs | 12.1121μs | 82.5619 KOps/s | 84.1272 KOps/s | |
test_membership_stacked_nested_leaf | 0.1187ms | 12.9604μs | 77.1581 KOps/s | 84.8124 KOps/s | |
test_membership_nested_last | 49.1010μs | 5.9223μs | 168.8527 KOps/s | 168.9552 KOps/s | |
test_membership_nested_leaf_last | 32.5100μs | 5.8923μs | 169.7128 KOps/s | 167.6926 KOps/s | |
test_membership_stacked_nested_last | 0.2853ms | 0.1695ms | 5.9002 KOps/s | 5.8767 KOps/s | |
test_membership_stacked_nested_leaf_last | 65.5690μs | 14.1912μs | 70.4660 KOps/s | 71.9069 KOps/s | |
test_nested_getleaf | 30.8170μs | 10.8293μs | 92.3425 KOps/s | 93.9048 KOps/s | |
test_nested_get | 34.1130μs | 10.2198μs | 97.8494 KOps/s | 98.7061 KOps/s | |
test_stacked_getleaf | 1.1213ms | 0.6495ms | 1.5396 KOps/s | 1.6099 KOps/s | |
test_stacked_get | 1.2159ms | 0.6131ms | 1.6310 KOps/s | 1.6942 KOps/s | |
test_nested_getitemleaf | 0.1270ms | 11.2172μs | 89.1491 KOps/s | 93.1218 KOps/s | |
test_nested_getitem | 42.8290μs | 10.0364μs | 99.6374 KOps/s | 98.1196 KOps/s | |
test_stacked_getitemleaf | 1.2130ms | 0.6500ms | 1.5385 KOps/s | 1.5472 KOps/s | |
test_stacked_getitem | 1.0322ms | 0.6122ms | 1.6333 KOps/s | 1.6941 KOps/s | |
test_lock_nested | 61.6082ms | 0.5543ms | 1.8041 KOps/s | 2.0121 KOps/s | |
test_lock_stack_nested | 91.3765ms | 8.9422ms | 111.8289 Ops/s | 117.1341 Ops/s | |
test_unlock_nested | 71.0665ms | 0.5176ms | 1.9321 KOps/s | 1.9421 KOps/s | |
test_unlock_stack_nested | 79.0098ms | 8.4626ms | 118.1669 Ops/s | 202.9720 Ops/s | |
test_flatten_speed | 0.7222ms | 0.2705ms | 3.6973 KOps/s | 3.6769 KOps/s | |
test_unflatten_speed | 1.2667ms | 0.4728ms | 2.1149 KOps/s | 2.1077 KOps/s | |
test_common_ops | 1.4755ms | 0.6937ms | 1.4416 KOps/s | 1.4068 KOps/s | |
test_creation | 26.2480μs | 2.4419μs | 409.5184 KOps/s | 405.5477 KOps/s | |
test_creation_empty | 38.2920μs | 8.5579μs | 116.8509 KOps/s | 112.0649 KOps/s | |
test_creation_nested_1 | 25.9380μs | 12.1243μs | 82.4792 KOps/s | 77.3122 KOps/s | |
test_creation_nested_2 | 65.8220μs | 15.3840μs | 65.0024 KOps/s | 61.4068 KOps/s | |
test_clone | 0.2143ms | 13.4427μs | 74.3899 KOps/s | 76.0741 KOps/s | |
test_getitem[int] | 58.8100μs | 12.6932μs | 78.7822 KOps/s | 77.9121 KOps/s | |
test_getitem[slice_int] | 69.6500μs | 24.6366μs | 40.5901 KOps/s | 40.6617 KOps/s | |
test_getitem[range] | 96.6090μs | 43.1414μs | 23.1796 KOps/s | 22.5144 KOps/s | |
test_getitem[tuple] | 53.2680μs | 20.4100μs | 48.9957 KOps/s | 49.8519 KOps/s | |
test_getitem[list] | 80.3980μs | 38.7185μs | 25.8275 KOps/s | 24.9051 KOps/s | |
test_setitem_dim[int] | 47.3470μs | 27.4519μs | 36.4274 KOps/s | 36.0508 KOps/s | |
test_setitem_dim[slice_int] | 84.3760μs | 51.8672μs | 19.2800 KOps/s | 19.3474 KOps/s | |
test_setitem_dim[range] | 0.1212ms | 72.0116μs | 13.8867 KOps/s | 13.7889 KOps/s | |
test_setitem_dim[tuple] | 0.1135ms | 40.9736μs | 24.4060 KOps/s | 24.3050 KOps/s | |
test_setitem | 0.2502ms | 18.7232μs | 53.4097 KOps/s | 53.4992 KOps/s | |
test_set | 0.2319ms | 18.1232μs | 55.1778 KOps/s | 55.1898 KOps/s | |
test_set_shared | 3.3573ms | 0.1408ms | 7.1033 KOps/s | 7.2263 KOps/s | |
test_update | 0.2095ms | 24.4040μs | 40.9769 KOps/s | 41.4602 KOps/s | |
test_update_nested | 0.3807ms | 36.6963μs | 27.2507 KOps/s | 27.8001 KOps/s | |
test_set_nested | 0.2146ms | 19.8791μs | 50.3041 KOps/s | 50.0880 KOps/s | |
test_set_nested_new | 0.2211ms | 25.3880μs | 39.3887 KOps/s | 37.9763 KOps/s | |
test_select | 98.5830μs | 49.7152μs | 20.1146 KOps/s | 19.2303 KOps/s | |
test_unbind_speed | 0.4633ms | 0.3685ms | 2.7138 KOps/s | 2.6979 KOps/s | |
test_unbind_speed_stack0 | 69.7567ms | 5.6275ms | 177.6997 Ops/s | 175.6044 Ops/s | |
test_unbind_speed_stack1 | 1.6030μs | 0.6445μs | 1.5517 MOps/s | 1.5998 MOps/s | |
test_split | 1.7025ms | 1.6314ms | 612.9627 Ops/s | 611.4281 Ops/s | |
test_chunk | 62.3735ms | 1.7439ms | 573.4190 Ops/s | 569.1813 Ops/s | |
test_creation[device0] | 3.6752ms | 0.3027ms | 3.3034 KOps/s | 3.1244 KOps/s | |
test_creation_from_tensor | 60.0502ms | 0.3677ms | 2.7198 KOps/s | 2.6764 KOps/s | |
test_add_one[memmap_tensor0] | 70.4610μs | 24.4884μs | 40.8357 KOps/s | 39.8896 KOps/s | |
test_contiguous[memmap_tensor0] | 33.1810μs | 5.6485μs | 177.0374 KOps/s | 175.4126 KOps/s | |
test_stack[memmap_tensor0] | 67.1740μs | 18.6293μs | 53.6789 KOps/s | 51.6743 KOps/s | |
test_memmaptd_index | 0.4305ms | 0.1879ms | 5.3210 KOps/s | 5.2488 KOps/s | |
test_memmaptd_index_astensor | 0.4081ms | 0.2499ms | 4.0015 KOps/s | 3.9370 KOps/s | |
test_memmaptd_index_op | 0.9349ms | 0.4864ms | 2.0560 KOps/s | 2.0120 KOps/s | |
test_reshape_pytree | 52.0570μs | 23.2417μs | 43.0260 KOps/s | 42.9671 KOps/s | |
test_reshape_td | 73.2360μs | 31.2186μs | 32.0322 KOps/s | 30.9588 KOps/s | |
test_view_pytree | 52.5280μs | 23.3549μs | 42.8176 KOps/s | 43.4072 KOps/s | |
test_view_td | 22.6920μs | 4.8195μs | 207.4895 KOps/s | 206.8551 KOps/s | |
test_unbind_pytree | 0.2067ms | 28.6962μs | 34.8478 KOps/s | 38.0687 KOps/s | |
test_unbind_td | 0.1092ms | 58.5145μs | 17.0898 KOps/s | 16.8906 KOps/s | |
test_split_pytree | 56.4650μs | 26.3504μs | 37.9501 KOps/s | 37.6899 KOps/s | |
test_split_td | 0.1323ms | 45.5219μs | 21.9674 KOps/s | 21.5815 KOps/s | |
test_add_pytree | 79.1470μs | 31.7549μs | 31.4912 KOps/s | 31.4766 KOps/s | |
test_add_td | 0.1422ms | 44.1417μs | 22.6543 KOps/s | 21.8671 KOps/s | |
test_distributed | 17.8530μs | 6.3633μs | 157.1500 KOps/s | 168.0954 KOps/s | |
test_tdmodule | 0.1066ms | 21.1744μs | 47.2267 KOps/s | 46.4830 KOps/s | |
test_tdmodule_dispatch | 0.1800ms | 39.0708μs | 25.5946 KOps/s | 25.2814 KOps/s | |
test_tdseq | 50.1830μs | 24.5409μs | 40.7483 KOps/s | 40.9599 KOps/s | |
test_tdseq_dispatch | 0.1347ms | 43.4939μs | 22.9917 KOps/s | 22.7276 KOps/s | |
test_instantiation_functorch | 1.3813ms | 1.2806ms | 780.9079 Ops/s | 777.5176 Ops/s | |
test_instantiation_td | 1.4325ms | 0.9992ms | 1.0008 KOps/s | 995.5540 Ops/s | |
test_exec_functorch | 0.3485ms | 0.1617ms | 6.1855 KOps/s | 6.3716 KOps/s | |
test_exec_functional_call | 0.2255ms | 0.1439ms | 6.9490 KOps/s | 6.8448 KOps/s | |
test_exec_td | 0.4636ms | 0.1435ms | 6.9669 KOps/s | 7.0820 KOps/s | |
test_exec_td_decorator | 1.0311ms | 0.2171ms | 4.6054 KOps/s | 4.0619 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.2534ms | 0.8935ms | 1.1192 KOps/s | 1.1242 KOps/s | |
test_vmap_mlp_speed[True-False] | 1.3172ms | 0.4848ms | 2.0628 KOps/s | 2.1430 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.2179ms | 0.7777ms | 1.2859 KOps/s | 1.2859 KOps/s | |
test_vmap_mlp_speed[False-False] | 1.0457ms | 0.4056ms | 2.4656 KOps/s | 2.5991 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.2594ms | 1.5668ms | 638.2491 Ops/s | 540.0427 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9945ms | 0.5492ms | 1.8208 KOps/s | 1.7247 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.8058ms | 1.3554ms | 737.7642 Ops/s | 619.8881 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9009ms | 0.4251ms | 2.3526 KOps/s | 2.1978 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.5600ms | 12.4435μs | 80.3634 KOps/s | 78.8103 KOps/s | |
test_plain_set_stack_nested | 0.1368ms | 0.1140ms | 8.7732 KOps/s | 8.2374 KOps/s | |
test_plain_set_nested_inplace | 31.9910μs | 14.9392μs | 66.9382 KOps/s | 66.7472 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1737ms | 0.1398ms | 7.1525 KOps/s | 7.1543 KOps/s | |
test_items | 27.1900μs | 4.7468μs | 210.6672 KOps/s | 211.1098 KOps/s | |
test_items_nested | 0.3738ms | 0.3383ms | 2.9557 KOps/s | 2.9742 KOps/s | |
test_items_nested_locked | 0.3587ms | 0.3368ms | 2.9691 KOps/s | 2.9601 KOps/s | |
test_items_nested_leaf | 0.2221ms | 0.1988ms | 5.0293 KOps/s | 5.0367 KOps/s | |
test_items_stack_nested | 1.5322ms | 1.4929ms | 669.8484 Ops/s | 687.0648 Ops/s | |
test_items_stack_nested_leaf | 1.3610ms | 1.3096ms | 763.5654 Ops/s | 779.9794 Ops/s | |
test_items_stack_nested_locked | 0.8597ms | 0.8185ms | 1.2217 KOps/s | 1.2188 KOps/s | |
test_keys | 25.4400μs | 4.7912μs | 208.7149 KOps/s | 216.1124 KOps/s | |
test_keys_nested | 1.2051ms | 89.8260μs | 11.1326 KOps/s | 11.0851 KOps/s | |
test_keys_nested_locked | 0.1201ms | 89.2809μs | 11.2006 KOps/s | 11.1176 KOps/s | |
test_keys_nested_leaf | 42.3150ms | 85.9087μs | 11.6403 KOps/s | 12.2449 KOps/s | |
test_keys_stack_nested | 1.3565ms | 1.3065ms | 765.4025 Ops/s | 786.6514 Ops/s | |
test_keys_stack_nested_leaf | 1.3327ms | 1.3013ms | 768.4577 Ops/s | 783.5534 Ops/s | |
test_keys_stack_nested_locked | 0.6776ms | 0.6233ms | 1.6043 KOps/s | 1.6142 KOps/s | |
test_values | 9.2600μs | 1.9002μs | 526.2615 KOps/s | 524.0092 KOps/s | |
test_values_nested | 73.3410μs | 43.0276μs | 23.2409 KOps/s | 23.1851 KOps/s | |
test_values_nested_locked | 66.2310μs | 43.1296μs | 23.1859 KOps/s | 22.9951 KOps/s | |
test_values_nested_leaf | 64.2710μs | 37.2400μs | 26.8528 KOps/s | 26.6395 KOps/s | |
test_values_stack_nested | 1.1891ms | 1.1482ms | 870.9081 Ops/s | 903.7586 Ops/s | |
test_values_stack_nested_leaf | 1.1615ms | 1.1255ms | 888.4834 Ops/s | 906.3536 Ops/s | |
test_values_stack_nested_locked | 0.5307ms | 0.4962ms | 2.0153 KOps/s | 2.0166 KOps/s | |
test_membership | 5.4380μs | 0.9341μs | 1.0705 MOps/s | 1.0505 MOps/s | |
test_membership_nested | 13.4455μs | 2.1409μs | 467.0866 KOps/s | 472.6314 KOps/s | |
test_membership_nested_leaf | 16.8500μs | 2.1365μs | 468.0498 KOps/s | 472.9976 KOps/s | |
test_membership_stacked_nested | 45.8510μs | 10.8318μs | 92.3208 KOps/s | 92.2587 KOps/s | |
test_membership_stacked_nested_leaf | 33.1900μs | 10.9361μs | 91.4400 KOps/s | 92.1949 KOps/s | |
test_membership_nested_last | 23.4800μs | 4.6464μs | 215.2195 KOps/s | 215.9088 KOps/s | |
test_membership_nested_leaf_last | 43.5610μs | 4.6473μs | 215.1765 KOps/s | 216.8417 KOps/s | |
test_membership_stacked_nested_last | 0.1699ms | 0.1348ms | 7.4157 KOps/s | 7.5515 KOps/s | |
test_membership_stacked_nested_leaf_last | 43.6600μs | 12.6286μs | 79.1851 KOps/s | 78.0133 KOps/s | |
test_nested_getleaf | 28.9900μs | 8.4037μs | 118.9951 KOps/s | 119.2803 KOps/s | |
test_nested_get | 33.1700μs | 7.9474μs | 125.8273 KOps/s | 125.7483 KOps/s | |
test_stacked_getleaf | 0.6280ms | 0.5681ms | 1.7602 KOps/s | 1.8373 KOps/s | |
test_stacked_get | 0.5863ms | 0.5419ms | 1.8453 KOps/s | 1.9479 KOps/s | |
test_nested_getitemleaf | 30.0010μs | 8.4310μs | 118.6103 KOps/s | 117.9235 KOps/s | |
test_nested_getitem | 32.2210μs | 7.9841μs | 125.2487 KOps/s | 125.0657 KOps/s | |
test_stacked_getitemleaf | 0.6516ms | 0.5762ms | 1.7354 KOps/s | 1.8302 KOps/s | |
test_stacked_getitem | 0.5665ms | 0.5385ms | 1.8569 KOps/s | 1.9454 KOps/s | |
test_lock_nested | 4.3898ms | 0.4591ms | 2.1780 KOps/s | 2.1683 KOps/s | |
test_lock_stack_nested | 70.2399ms | 6.6020ms | 151.4682 Ops/s | 149.0229 Ops/s | |
test_unlock_nested | 1.2830ms | 0.4360ms | 2.2937 KOps/s | 2.0005 KOps/s | |
test_unlock_stack_nested | 67.9921ms | 7.3462ms | 136.1256 Ops/s | 135.6508 Ops/s | |
test_flatten_speed | 0.5160ms | 0.1860ms | 5.3753 KOps/s | 5.2410 KOps/s | |
test_unflatten_speed | 0.4136ms | 0.3617ms | 2.7649 KOps/s | 2.6809 KOps/s | |
test_common_ops | 1.0144ms | 0.6037ms | 1.6563 KOps/s | 1.5813 KOps/s | |
test_creation | 18.8900μs | 1.9256μs | 519.3206 KOps/s | 526.2617 KOps/s | |
test_creation_empty | 21.3010μs | 6.5755μs | 152.0803 KOps/s | 144.4321 KOps/s | |
test_creation_nested_1 | 24.0000μs | 8.9291μs | 111.9935 KOps/s | 104.8845 KOps/s | |
test_creation_nested_2 | 30.4200μs | 11.4306μs | 87.4844 KOps/s | 82.6993 KOps/s | |
test_clone | 31.0900μs | 14.3546μs | 69.6643 KOps/s | 70.2266 KOps/s | |
test_getitem[int] | 56.3310μs | 12.1110μs | 82.5697 KOps/s | 82.0592 KOps/s | |
test_getitem[slice_int] | 49.1710μs | 23.7430μs | 42.1177 KOps/s | 43.2061 KOps/s | |
test_getitem[range] | 63.8910μs | 40.2647μs | 24.8356 KOps/s | 25.3155 KOps/s | |
test_getitem[tuple] | 97.9810μs | 20.1114μs | 49.7231 KOps/s | 49.8081 KOps/s | |
test_getitem[list] | 0.3321ms | 37.2574μs | 26.8403 KOps/s | 27.0289 KOps/s | |
test_setitem_dim[int] | 51.1710μs | 26.5841μs | 37.6165 KOps/s | 38.2038 KOps/s | |
test_setitem_dim[slice_int] | 64.4110μs | 46.3562μs | 21.5721 KOps/s | 21.8220 KOps/s | |
test_setitem_dim[range] | 82.9820μs | 63.9276μs | 15.6427 KOps/s | 15.9622 KOps/s | |
test_setitem_dim[tuple] | 57.0300μs | 40.0798μs | 24.9502 KOps/s | 25.4376 KOps/s | |
test_setitem | 0.1348ms | 18.2273μs | 54.8627 KOps/s | 55.9126 KOps/s | |
test_set | 0.1130ms | 17.7122μs | 56.4584 KOps/s | 57.5634 KOps/s | |
test_set_shared | 3.2710ms | 0.1024ms | 9.7676 KOps/s | 9.2968 KOps/s | |
test_update | 79.4810μs | 21.4295μs | 46.6646 KOps/s | 43.1235 KOps/s | |
test_update_nested | 0.1344ms | 31.1642μs | 32.0881 KOps/s | 31.8821 KOps/s | |
test_set_nested | 0.1117ms | 18.7856μs | 53.2323 KOps/s | 52.8417 KOps/s | |
test_set_nested_new | 0.1297ms | 23.0662μs | 43.3536 KOps/s | 41.9758 KOps/s | |
test_select | 69.4810μs | 45.4799μs | 21.9877 KOps/s | 20.8797 KOps/s | |
test_to | 74.5910μs | 53.2820μs | 18.7681 KOps/s | 18.7561 KOps/s | |
test_to_nonblocking | 64.8010μs | 34.8789μs | 28.6706 KOps/s | 28.5955 KOps/s | |
test_unbind_speed | 0.3724ms | 0.3568ms | 2.8027 KOps/s | 2.8332 KOps/s | |
test_unbind_speed_stack0 | 66.4712ms | 5.2004ms | 192.2934 Ops/s | 192.3873 Ops/s | |
test_unbind_speed_stack1 | 3.4940μs | 0.5216μs | 1.9170 MOps/s | 1.8938 MOps/s | |
test_split | 53.9143ms | 1.8135ms | 551.4224 Ops/s | 552.2271 Ops/s | |
test_chunk | 53.4272ms | 1.7995ms | 555.6988 Ops/s | 556.8926 Ops/s | |
test_creation[device0] | 0.4964ms | 0.3079ms | 3.2480 KOps/s | 3.2562 KOps/s | |
test_creation[device1] | 0.7881ms | 0.3114ms | 3.2111 KOps/s | 3.2239 KOps/s | |
test_creation_from_tensor | 0.6377ms | 0.3372ms | 2.9655 KOps/s | 2.7469 KOps/s | |
test_add_one[memmap_tensor0] | 60.3610μs | 24.7806μs | 40.3541 KOps/s | 40.8849 KOps/s | |
test_add_one[memmap_tensor1] | 0.2126ms | 74.9794μs | 13.3370 KOps/s | 13.4492 KOps/s | |
test_contiguous[memmap_tensor0] | 22.7800μs | 6.1070μs | 163.7462 KOps/s | 169.7714 KOps/s | |
test_contiguous[memmap_tensor1] | 47.8400μs | 22.7212μs | 44.0118 KOps/s | 44.3417 KOps/s | |
test_stack[memmap_tensor0] | 50.0310μs | 20.2392μs | 49.4090 KOps/s | 49.4225 KOps/s | |
test_stack[memmap_tensor1] | 0.1516ms | 74.8658μs | 13.3572 KOps/s | 13.5224 KOps/s | |
test_memmaptd_index | 0.2687ms | 0.2243ms | 4.4578 KOps/s | 4.5213 KOps/s | |
test_memmaptd_index_astensor | 0.3170ms | 0.2781ms | 3.5953 KOps/s | 3.4951 KOps/s | |
test_memmaptd_index_op | 0.5958ms | 0.5401ms | 1.8515 KOps/s | 1.8346 KOps/s | |
test_reshape_pytree | 37.5410μs | 20.9622μs | 47.7050 KOps/s | 47.9799 KOps/s | |
test_reshape_td | 56.3910μs | 30.3328μs | 32.9676 KOps/s | 32.8760 KOps/s | |
test_view_pytree | 40.1910μs | 20.6539μs | 48.4170 KOps/s | 48.6524 KOps/s | |
test_view_td | 21.2210μs | 4.0757μs | 245.3572 KOps/s | 244.2347 KOps/s | |
test_unbind_pytree | 44.9410μs | 25.9160μs | 38.5862 KOps/s | 38.0345 KOps/s | |
test_unbind_td | 75.3020μs | 55.5101μs | 18.0147 KOps/s | 17.8492 KOps/s | |
test_split_pytree | 51.6300μs | 24.7577μs | 40.3915 KOps/s | 40.6124 KOps/s | |
test_split_td | 67.0910μs | 45.2003μs | 22.1238 KOps/s | 22.4545 KOps/s | |
test_add_pytree | 60.2510μs | 32.6394μs | 30.6378 KOps/s | 30.8746 KOps/s | |
test_add_td | 66.0310μs | 43.1110μs | 23.1960 KOps/s | 22.7016 KOps/s | |
test_distributed | 23.8800μs | 5.4491μs | 183.5165 KOps/s | 181.1962 KOps/s | |
test_tdmodule | 31.8910μs | 16.5140μs | 60.5546 KOps/s | 59.1035 KOps/s | |
test_tdmodule_dispatch | 0.1308ms | 32.0093μs | 31.2409 KOps/s | 30.6072 KOps/s | |
test_tdseq | 36.3200μs | 19.9486μs | 50.1288 KOps/s | 49.4903 KOps/s | |
test_tdseq_dispatch | 0.1318ms | 35.1524μs | 28.4475 KOps/s | 27.7594 KOps/s | |
test_instantiation_functorch | 2.0295ms | 1.6961ms | 589.5974 Ops/s | 597.7524 Ops/s | |
test_instantiation_td | 1.6707ms | 1.1903ms | 840.0944 Ops/s | 852.0547 Ops/s | |
test_exec_functorch | 0.2032ms | 0.1628ms | 6.1434 KOps/s | 6.2042 KOps/s | |
test_exec_functional_call | 0.2278ms | 0.1639ms | 6.0996 KOps/s | 6.2171 KOps/s | |
test_exec_td | 0.1965ms | 0.1545ms | 6.4714 KOps/s | 6.5071 KOps/s | |
test_exec_td_decorator | 0.7872ms | 0.2264ms | 4.4165 KOps/s | 3.9988 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1511ms | 1.0792ms | 926.6249 Ops/s | 939.5392 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.8070ms | 0.6171ms | 1.6204 KOps/s | 1.6241 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.1154ms | 1.0402ms | 961.3480 Ops/s | 1.0270 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6150ms | 0.5460ms | 1.8316 KOps/s | 1.8320 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.6243ms | 1.7956ms | 556.9136 Ops/s | 482.6189 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1706ms | 0.6905ms | 1.4482 KOps/s | 1.4070 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.1119ms | 1.6157ms | 618.9403 Ops/s | 534.6432 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0170ms | 0.5879ms | 1.7008 KOps/s | 1.6464 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.7199ms | 12.6224ms | 79.2242 Ops/s | 79.1637 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.7169ms | 8.2975ms | 120.5176 Ops/s | 120.7391 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.6758ms | 12.5791ms | 79.4968 Ops/s | 80.1661 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.3318ms | 8.2351ms | 121.4308 Ops/s | 121.7175 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 44.5421ms | 43.0429ms | 23.2326 Ops/s | 19.0166 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 0.1010s | 22.0845ms | 45.2805 Ops/s | 47.8972 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 43.7874ms | 42.6218ms | 23.4622 Ops/s | 19.0503 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 0.1002s | 21.6044ms | 46.2869 Ops/s | 48.7028 Ops/s |
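
The Change column is empty in both tables as captured here. A minimal sketch of how it could be recomputed is below. It assumes that "Ops" is the throughput measured on this PR, that "Ops on Repo HEAD" is the baseline, and that the change is the percent difference between the two. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
# Unit multipliers as they appear in the "Ops" columns of the tables.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '61.1625 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNITS[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD (assumed definition)."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


# Rows copied from the first table: (name, Ops, Ops on Repo HEAD).
rows = [
    ("test_exec_td_decorator", "4.6054 KOps/s", "4.0619 KOps/s"),
    ("test_vmap_mlp_speed_decorator[True-True]", "638.2491 Ops/s", "540.0427 Ops/s"),
    ("test_instantiation_td", "1.0008 KOps/s", "995.5540 Ops/s"),
]

for name, ops_pr, ops_head in rows:
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

Normalizing to operations per second before comparing matters because some rows mix units, for example `test_instantiation_td`, which reports KOps/s in one column and Ops/s in the other.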
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Refactor: Refactoring code - not a new feature
No description provided.