[Refactor] MemoryMappedTensor #541
Merged
facebook-github-bot added the CLA Signed label on Oct 10, 2023
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 30.8990μs | 19.8620μs | 50.3475 KOps/s | 49.5428 KOps/s | |
test_plain_set_stack_nested | 0.2031ms | 0.1816ms | 5.5068 KOps/s | 5.3938 KOps/s | |
test_plain_set_nested_inplace | 49.1990μs | 23.6753μs | 42.2380 KOps/s | 42.2575 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2499ms | 0.2213ms | 4.5179 KOps/s | 4.5202 KOps/s | |
test_items | 0.1365ms | 3.0734μs | 325.3702 KOps/s | 329.0883 KOps/s | |
test_items_nested | 0.4348ms | 0.3967ms | 2.5211 KOps/s | 2.6157 KOps/s | |
test_items_nested_locked | 0.4206ms | 0.3966ms | 2.5212 KOps/s | 2.7483 KOps/s | |
test_items_nested_leaf | 1.2236ms | 0.2370ms | 4.2191 KOps/s | 4.5363 KOps/s | |
test_items_stack_nested | 1.8610ms | 1.7989ms | 555.8982 Ops/s | 548.8150 Ops/s | |
test_items_stack_nested_leaf | 1.6357ms | 1.6015ms | 624.4084 Ops/s | 605.7387 Ops/s | |
test_items_stack_nested_locked | 1.0711ms | 0.9864ms | 1.0138 KOps/s | 1.0435 KOps/s | |
test_keys | 37.8990μs | 4.6576μs | 214.7023 KOps/s | 204.3353 KOps/s | |
test_keys_nested | 1.5733ms | 0.1737ms | 5.7584 KOps/s | 5.3897 KOps/s | |
test_keys_nested_locked | 0.2180ms | 0.1727ms | 5.7888 KOps/s | 5.7669 KOps/s | |
test_keys_nested_leaf | 1.5901ms | 0.1703ms | 5.8728 KOps/s | 5.8245 KOps/s | |
test_keys_stack_nested | 1.6947ms | 1.6046ms | 623.1981 Ops/s | 597.0795 Ops/s | |
test_keys_stack_nested_leaf | 1.7368ms | 1.6032ms | 623.7371 Ops/s | 597.8020 Ops/s | |
test_keys_stack_nested_locked | 1.0023ms | 0.7785ms | 1.2846 KOps/s | 1.2710 KOps/s | |
test_values | 20.1000μs | 1.3236μs | 755.5340 KOps/s | 767.0575 KOps/s | |
test_values_nested | 88.4990μs | 64.4041μs | 15.5270 KOps/s | 14.8769 KOps/s | |
test_values_nested_locked | 85.7000μs | 64.8049μs | 15.4309 KOps/s | 14.8543 KOps/s | |
test_values_nested_leaf | 77.4990μs | 56.3876μs | 17.7344 KOps/s | 16.7470 KOps/s | |
test_values_stack_nested | 1.4905ms | 1.4025ms | 712.9937 Ops/s | 654.9123 Ops/s | |
test_values_stack_nested_leaf | 1.4241ms | 1.3926ms | 718.0730 Ops/s | 680.9733 Ops/s | |
test_values_stack_nested_locked | 0.7160ms | 0.6306ms | 1.5858 KOps/s | 1.5678 KOps/s | |
test_membership | 54.3000μs | 1.8802μs | 531.8626 KOps/s | 496.1601 KOps/s | |
test_membership_nested | 25.4000μs | 3.7995μs | 263.1918 KOps/s | 260.2904 KOps/s | |
test_membership_nested_leaf | 24.3000μs | 3.7869μs | 264.0692 KOps/s | 259.8244 KOps/s | |
test_membership_stacked_nested | 40.5000μs | 14.9136μs | 67.0530 KOps/s | 61.5273 KOps/s | |
test_membership_stacked_nested_leaf | 36.1000μs | 14.8780μs | 67.2135 KOps/s | 61.2329 KOps/s | |
test_membership_nested_last | 31.1000μs | 7.6781μs | 130.2397 KOps/s | 127.9960 KOps/s | |
test_membership_nested_leaf_last | 28.2000μs | 7.6658μs | 130.4503 KOps/s | 128.1257 KOps/s | |
test_membership_stacked_nested_last | 0.2507ms | 0.2271ms | 4.4026 KOps/s | 4.3793 KOps/s | |
test_membership_stacked_nested_leaf_last | 42.4000μs | 17.1021μs | 58.4723 KOps/s | 54.0353 KOps/s | |
test_nested_getleaf | 38.6000μs | 15.7113μs | 63.6486 KOps/s | 63.7431 KOps/s | |
test_nested_get | 37.8000μs | 14.9417μs | 66.9267 KOps/s | 66.7999 KOps/s | |
test_stacked_getleaf | 0.8172ms | 0.7248ms | 1.3797 KOps/s | 1.3307 KOps/s | |
test_stacked_get | 3.0511ms | 0.7057ms | 1.4171 KOps/s | 1.3817 KOps/s | |
test_nested_getitemleaf | 0.1785ms | 15.7278μs | 63.5819 KOps/s | 63.0654 KOps/s | |
test_nested_getitem | 39.3000μs | 14.9090μs | 67.0738 KOps/s | 66.7099 KOps/s | |
test_stacked_getitemleaf | 0.8108ms | 0.7239ms | 1.3813 KOps/s | 1.3315 KOps/s | |
test_stacked_getitem | 0.7621ms | 0.6932ms | 1.4427 KOps/s | 1.3894 KOps/s | |
test_lock_nested | 56.2092ms | 1.1674ms | 856.5861 Ops/s | 898.2715 Ops/s | |
test_lock_stack_nested | 75.9773ms | 15.4593ms | 64.6858 Ops/s | 64.7886 Ops/s | |
test_unlock_nested | 52.2380ms | 1.1721ms | 853.1495 Ops/s | 857.0762 Ops/s | |
test_unlock_stack_nested | 73.9057ms | 15.9683ms | 62.6241 Ops/s | 62.5045 Ops/s | |
test_flatten_speed | 0.9157ms | 0.8464ms | 1.1814 KOps/s | 1.1403 KOps/s | |
test_unflatten_speed | 1.5477ms | 1.4581ms | 685.8198 Ops/s | 679.7259 Ops/s | |
test_common_ops | 3.0637ms | 0.7620ms | 1.3124 KOps/s | 1.2933 KOps/s | |
test_creation | 58.7990μs | 3.0033μs | 332.9638 KOps/s | 338.7230 KOps/s | |
test_creation_empty | 30.4000μs | 9.4507μs | 105.8122 KOps/s | 103.1488 KOps/s | |
test_creation_nested_1 | 38.2000μs | 14.0648μs | 71.0993 KOps/s | 68.7925 KOps/s | |
test_creation_nested_2 | 77.3990μs | 17.5322μs | 57.0378 KOps/s | 56.2737 KOps/s | |
test_clone | 62.3000μs | 14.7359μs | 67.8616 KOps/s | 67.1602 KOps/s | |
test_getitem[int] | 42.6000μs | 17.6245μs | 56.7392 KOps/s | 56.5610 KOps/s | |
test_getitem[slice_int] | 80.4990μs | 37.9480μs | 26.3518 KOps/s | 26.8056 KOps/s | |
test_getitem[range] | 0.1490ms | 61.5293μs | 16.2524 KOps/s | 16.4511 KOps/s | |
test_getitem[tuple] | 65.1990μs | 31.9996μs | 31.2504 KOps/s | 31.3216 KOps/s | |
test_getitem[list] | 0.3077ms | 56.6766μs | 17.6440 KOps/s | 17.7100 KOps/s | |
test_setitem_dim[int] | 42.9000μs | 32.4730μs | 30.7948 KOps/s | 30.3910 KOps/s | |
test_setitem_dim[slice_int] | 69.7000μs | 58.6333μs | 17.0552 KOps/s | 17.0206 KOps/s | |
test_setitem_dim[range] | 94.9000μs | 77.5515μs | 12.8947 KOps/s | 12.8177 KOps/s | |
test_setitem_dim[tuple] | 64.6000μs | 49.5589μs | 20.1780 KOps/s | 20.4398 KOps/s | |
test_setitem | 97.1990μs | 19.5600μs | 51.1247 KOps/s | 50.2497 KOps/s | |
test_set | 95.0000μs | 18.9636μs | 52.7326 KOps/s | 51.7950 KOps/s | |
test_set_shared | 2.5620ms | 0.1613ms | 6.1990 KOps/s | 6.2644 KOps/s | |
test_update | 0.1102ms | 24.5620μs | 40.7133 KOps/s | 40.2141 KOps/s | |
test_update_nested | 0.1092ms | 34.5724μs | 28.9248 KOps/s | 28.8636 KOps/s | |
test_set_nested | 86.1000μs | 20.6430μs | 48.4427 KOps/s | 47.6956 KOps/s | |
test_set_nested_new | 87.5990μs | 28.1364μs | 35.5411 KOps/s | 34.5961 KOps/s | |
test_select | 0.1231ms | 59.4096μs | 16.8323 KOps/s | 17.0197 KOps/s | |
test_unbind_speed | 0.4211ms | 0.3689ms | 2.7106 KOps/s | 2.5821 KOps/s | |
test_unbind_speed_stack0 | 59.6449ms | 5.2735ms | 189.6275 Ops/s | 180.9227 Ops/s | |
test_unbind_speed_stack1 | 13.7998μs | 0.9610μs | 1.0406 MOps/s | 872.9610 KOps/s | |
test_creation[device0] | 2.0331ms | 0.3531ms | 2.8324 KOps/s | 2.9005 KOps/s | |
test_creation_from_tensor | 57.4586ms | 0.4327ms | 2.3113 KOps/s | 2.6088 KOps/s | |
test_add_one[memmap_tensor0] | 0.1614ms | 30.4083μs | 32.8858 KOps/s | 32.3060 KOps/s | |
test_contiguous[memmap_tensor0] | 30.9000μs | 8.3491μs | 119.7740 KOps/s | 115.7872 KOps/s | |
test_stack[memmap_tensor0] | 62.1990μs | 25.6047μs | 39.0553 KOps/s | 38.7801 KOps/s | |
test_memmaptd_index | 0.3390ms | 0.2737ms | 3.6534 KOps/s | 3.4439 KOps/s | |
test_memmaptd_index_astensor | 0.4017ms | 0.3438ms | 2.9085 KOps/s | 920.5795 Ops/s | |
test_memmaptd_index_op | 0.7338ms | 0.6646ms | 1.5047 KOps/s | 426.7540 Ops/s | |
test_reshape_pytree | 0.1094ms | 31.8367μs | 31.4103 KOps/s | 31.0341 KOps/s | |
test_reshape_td | 54.8000μs | 28.1369μs | 35.5406 KOps/s | 34.6788 KOps/s | |
test_view_pytree | 77.8000μs | 31.2610μs | 31.9887 KOps/s | 31.6496 KOps/s | |
test_view_td | 17.0000μs | 5.5344μs | 180.6868 KOps/s | 182.5779 KOps/s | |
test_unbind_pytree | 96.2990μs | 37.3767μs | 26.7547 KOps/s | 26.8546 KOps/s | |
test_unbind_td | 98.6990μs | 53.4983μs | 18.6922 KOps/s | 18.6957 KOps/s | |
test_split_pytree | 62.9000μs | 36.7675μs | 27.1979 KOps/s | 27.4156 KOps/s | |
test_split_td | 0.1783ms | 96.6925μs | 10.3421 KOps/s | 10.2860 KOps/s | |
test_add_pytree | 78.8000μs | 45.0505μs | 22.1973 KOps/s | 22.2857 KOps/s | |
test_add_td | 81.0000μs | 55.9716μs | 17.8662 KOps/s | 17.7740 KOps/s | |
test_distributed | 54.0000μs | 8.2250μs | 121.5802 KOps/s | 122.6607 KOps/s | |
test_tdmodule | 0.1141ms | 24.4962μs | 40.8227 KOps/s | 40.2242 KOps/s | |
test_tdmodule_dispatch | 0.2154ms | 44.5125μs | 22.4656 KOps/s | 22.6101 KOps/s | |
test_tdseq | 0.1390ms | 26.4022μs | 37.8757 KOps/s | 37.8803 KOps/s | |
test_tdseq_dispatch | 0.5242ms | 46.5668μs | 21.4745 KOps/s | 21.6141 KOps/s | |
test_instantiation_functorch | 1.6548ms | 1.5387ms | 649.8968 Ops/s | 648.0128 Ops/s | |
test_instantiation_td | 1.8319ms | 1.2300ms | 813.0208 Ops/s | 756.6224 Ops/s | |
test_exec_functorch | 0.2683ms | 0.1860ms | 5.3768 KOps/s | 5.4536 KOps/s | |
test_exec_td | 0.2141ms | 0.1773ms | 5.6397 KOps/s | 5.7691 KOps/s | |
test_vmap_mlp_speed[True-True] | 6.8421ms | 0.9894ms | 1.0107 KOps/s | 1.0174 KOps/s | |
test_vmap_mlp_speed[True-False] | 6.3454ms | 0.5304ms | 1.8853 KOps/s | 1.9025 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.1983ms | 0.8495ms | 1.1772 KOps/s | 1.1630 KOps/s | |
test_vmap_mlp_speed[False-False] | 6.2969ms | 0.4366ms | 2.2906 KOps/s | 2.3035 KOps/s | |
vmoens added the enhancement and Refactor labels on Oct 25, 2023
vmoens added a commit to pytorch/rl that referenced this pull request on Nov 14, 2023
Labels: BC-breaking, CLA Signed, enhancement, Refactor
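This PR refactors MemmapTensor into a new Tensor-based class, MemoryMappedTensor.
This should be considerably faster: in the benchmark table above, test_memmaptd_index_astensor and test_memmaptd_index_op run roughly 3x faster than on repo HEAD.
MemmapTensor is kept in the library as a separate class, which raises a deprecation warning when instantiated.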
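For readers unfamiliar with the pattern, keeping an old class around while warning on creation can be sketched as below. This is a generic illustration using only the standard library, not the code from this PR: `LegacyMemmap` is a hypothetical name, and the choice of `DeprecationWarning` as the category is an assumption.

```python
import warnings


class LegacyMemmap:
    """Hypothetical deprecated class; a stand-in, not the library's MemmapTensor."""

    def __init__(self, data):
        # Warn at construction time; stacklevel=2 points the warning at the caller.
        warnings.warn(
            "LegacyMemmap is deprecated; use the new class instead.",
            category=DeprecationWarning,  # assumed category
            stacklevel=2,
        )
        self.data = data


def create_and_capture():
    """Instantiate the deprecated class and collect the warnings it emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure the warning is not filtered out
        obj = LegacyMemmap([1, 2, 3])
    return obj, [w.category for w in caught]
```

The object is still created and usable; the warning only signals that callers should migrate.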
This change is BC-breaking in a subtle way:
when creating a memmap tensordict, the backend will now be MemoryMappedTensor.
When indexed, a MemoryMappedTensor returns an object of the same class only if the indexed result shares its storage with the original (whereas MemmapTensor always returned a MemmapTensor with a lazy index).
As a consequence, indexing a tensordict with, say, a tensor now returns a tensordict of regular tensors rather than memmap-backed tensors. For slices and other indices that do not change the storage, nothing changes.
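The indexing rule described above can be illustrated with a small standard-library analogy. This is a sketch only: `MappedBuffer` is a hypothetical class built on `mmap` and `memoryview`, not the tensordict API. It assumes that a slice is a view on the same mapped storage, while indexing with a list of positions gathers elements into new memory.

```python
import mmap
import tempfile


class MappedBuffer:
    """Hypothetical stand-in for a memory-mapped tensor (not the tensordict API)."""

    def __init__(self, view):
        self._view = view  # memoryview over the mapped file

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice is a view on the same mapping: keep the mapped class.
            return MappedBuffer(self._view[index])
        if isinstance(index, int):
            return self._view[index]
        # A list of positions copies elements into new memory: return plain bytes.
        return bytes(self._view[i] for i in index)

    def tolist(self):
        return self._view.tolist()

    def release(self):
        self._view.release()


def demo():
    with tempfile.TemporaryFile() as f:
        f.write(bytes(range(8)))
        f.flush()
        mm = mmap.mmap(f.fileno(), 8)
        buf = MappedBuffer(memoryview(mm))
        sliced = buf[2:5]          # same storage: still a MappedBuffer
        gathered = buf[[0, 3, 6]]  # new storage: plain bytes
        mm[3] = 99                 # write through the mapping
        result = (
            type(sliced).__name__, sliced.tolist(),
            type(gathered).__name__, list(gathered),
        )
        # Release the views before closing the mapping.
        sliced.release()
        buf.release()
        mm.close()
        return result
```

In this sketch the sliced object observes the later write because it still points at the mapped file, whereas the gathered copy does not. Code that relied on the indexed result remaining memory-mapped would notice the same kind of difference.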