Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make several improvements to binary page allocator #299

Merged
merged 1 commit into from
Oct 28, 2024

Conversation

stephenswat
Copy link
Member

This commit makes two main changes to improve the performance of the binary page memory resource.

Firstly, it makes the allocator more eager to unsplit (i.e. merge) pages so that they can be reused for new allocation, which should reduce the memory footprint of when using very large allocations.

Secondly, it sets a minimum size of allocations per superpage, which depends on the size of the superpage. This prevents very small allocations from ending up in massive blocks (increasing fragmentation) and also decreases the total number of pages (increasing performance).

@stephenswat stephenswat added the improvement Improve an existing feature label Oct 20, 2024
@stephenswat stephenswat requested a review from krasznaa October 20, 2024 16:24
@stephenswat
Copy link
Member Author

Performance drops a fair bit for low allocation sizes and increases for larger ones:

Before:

BenchmarkBinaryPage/1                         12.6 ns         12.5 ns     54232577
BenchmarkBinaryPage/2                         12.4 ns         12.4 ns     56375264
BenchmarkBinaryPage/4                         12.8 ns         12.8 ns     54276274
BenchmarkBinaryPage/8                         13.2 ns         13.2 ns     53022963
BenchmarkBinaryPage/16                        13.1 ns         13.1 ns     52296277
BenchmarkBinaryPage/32                        12.8 ns         12.8 ns     54326859
BenchmarkBinaryPage/64                        13.5 ns         13.5 ns     51564964
BenchmarkBinaryPage/128                       13.7 ns         13.7 ns     50968182
BenchmarkBinaryPage/256                       14.1 ns         14.1 ns     49778321
BenchmarkBinaryPage/512                       14.5 ns         14.5 ns     48230900
BenchmarkBinaryPage/1024                      14.9 ns         14.9 ns     46971943
BenchmarkBinaryPage/2048                      14.9 ns         14.9 ns     46973955
BenchmarkBinaryPage/4096                      15.0 ns         15.0 ns     46759439
BenchmarkBinaryPage/8192                      15.0 ns         15.0 ns     46865474
BenchmarkBinaryPage/16384                     14.9 ns         14.9 ns     47017818
BenchmarkBinaryPage/32768                     14.9 ns         14.9 ns     47075989
BenchmarkBinaryPage/65536                     13.4 ns         13.4 ns     46950947
BenchmarkBinaryPage/131072                    12.8 ns         12.8 ns     54544558
BenchmarkBinaryPage/262144                    12.8 ns         12.8 ns     54600546
BenchmarkBinaryPage/524288                    12.8 ns         12.8 ns     54622645
BenchmarkBinaryPage/1048576                   13.2 ns         13.2 ns     53119672
BenchmarkBinaryPage/2097152                   13.6 ns         13.6 ns     51571160
BenchmarkBinaryPage/4194304                   13.6 ns         13.6 ns     51437233
BenchmarkBinaryPage/8388608                   14.0 ns         14.0 ns     50019719
BenchmarkBinaryPage/16777216                  13.9 ns         13.9 ns     50611850
BenchmarkBinaryPage/33554432                  13.6 ns         13.6 ns     52128892
BenchmarkBinaryPage/67108864                  11.4 ns         11.4 ns     50334402
BenchmarkBinaryPage/134217728                 11.1 ns         11.1 ns     63165162
BenchmarkBinaryPage/268435456                 11.5 ns         11.5 ns     60714333
BenchmarkBinaryPage/536870912                 11.5 ns         11.5 ns     60770764
BenchmarkBinaryPage/1073741824                11.9 ns         11.9 ns     58579526
BenchmarkBinaryPage/2147483648                12.2 ns         12.2 ns     55850644
BenchmarkBinaryPage/4294967296                12.4 ns         12.4 ns     54779783

After:

-------------------------------------------------------------------------
Benchmark                               Time             CPU   Iterations
-------------------------------------------------------------------------
BenchmarkBinaryPage/1                 167 ns          167 ns      4104743
BenchmarkBinaryPage/2                 169 ns          169 ns      4009150
BenchmarkBinaryPage/4                 170 ns          170 ns      3971936
BenchmarkBinaryPage/8                 169 ns          169 ns      4094620
BenchmarkBinaryPage/16                171 ns          171 ns      3965085
BenchmarkBinaryPage/32                170 ns          170 ns      4066688
BenchmarkBinaryPage/64                170 ns          170 ns      3978674
BenchmarkBinaryPage/128               176 ns          176 ns      4061772
BenchmarkBinaryPage/256               178 ns          178 ns      4004071
BenchmarkBinaryPage/512               186 ns          186 ns      3645495
BenchmarkBinaryPage/1024              181 ns          181 ns      3873270
BenchmarkBinaryPage/2048              180 ns          180 ns      3828041
BenchmarkBinaryPage/4096              185 ns          185 ns      3845810
BenchmarkBinaryPage/8192              111 ns          110 ns      6328662
BenchmarkBinaryPage/16384            66.1 ns         66.1 ns     10167867
BenchmarkBinaryPage/32768            45.7 ns         45.7 ns     15481878
BenchmarkBinaryPage/65536            33.4 ns         33.4 ns     20748067
BenchmarkBinaryPage/131072           24.8 ns         24.8 ns     28547759
BenchmarkBinaryPage/262144           18.4 ns         18.4 ns     38280071
BenchmarkBinaryPage/524288           14.2 ns         14.2 ns     48774345
BenchmarkBinaryPage/1048576          10.2 ns         10.2 ns     69163315
BenchmarkBinaryPage/2097152          10.4 ns         10.4 ns     67143905
BenchmarkBinaryPage/4194304          10.5 ns         10.5 ns     66245786
BenchmarkBinaryPage/8388608          10.7 ns         10.7 ns     64700648
BenchmarkBinaryPage/16777216         11.1 ns         11.1 ns     64294237
BenchmarkBinaryPage/33554432         11.3 ns         11.3 ns     61304237
BenchmarkBinaryPage/67108864         12.2 ns         12.2 ns     49836666

Note that this benchmark is very antithetical to how normal allocations go, because it constantly allocates a piece of memory and then deallocates it; i.e. no more than 1 allocation ever lives at the same time. Real world performance should be much less impacted.

Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to admit, don't fully understand the code. 😦 But let's try it...

@stephenswat stephenswat force-pushed the feat/bpmr_improvements branch from 1cab9b3 to 45a48bf Compare October 28, 2024 12:37
This commit makes two main changes to improve the performance of the
binary page memory resource.

Firstly, it makes the allocator more eager to unsplit (i.e. merge) pages
so that they can be reused for new allocation, which should reduce the
memory footprint of when using very large allocations.

Secondly, it sets a minimum size of allocations per superpage, which
depends on the size of the superpage. This prevents very small
allocations from ending up in massive blocks (increasing fragmentation)
and also decreases the total number of pages (increasing performance).
@krasznaa krasznaa merged commit f1cfad7 into acts-project:main Oct 28, 2024
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improve an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants