Added possibility to set cache batch size #1034

Merged
merged 3 commits into from
Apr 2, 2024

norberttech
Member

Change Log

Added

  • cache batch size configuration

Fixed

Changed

  • Replaced CompressingSerializer with NativeSerializer

Removed

Deprecated

Security


Description

I recently noticed a drastic performance degradation in the sorting operation. Since sort switches to file-based sorting only after reaching a specific memory consumption threshold, it wasn't easy to notice in the first place.

The problem is essentially a missed regression: after I changed how extractors/loaders work (by default one row at a time), I missed the fact that this also hits the caching pipeline, which affects sorting.

To resolve this, a new config entry was added: cache batch size, which defaults to 2000. This means the caching pipeline processes 2000 rows at once, reducing the number of I/O operations.
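To illustrate why batching reduces I/O, here is a minimal, language-agnostic sketch in Python. It is not Flow's implementation: `batched`, `CountingCache`, and `cache_rows` are hypothetical names invented for this example, and the "cache" only counts how many write calls it receives.

```python
from typing import Iterable, Iterator, List


def batched(rows: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group rows into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    batch: List[dict] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


class CountingCache:
    """Hypothetical stand-in for a cache backend; counts write calls."""

    def __init__(self) -> None:
        self.writes = 0
        self.stored: List[dict] = []

    def add(self, rows: List[dict]) -> None:
        # Each call represents one I/O operation against the cache.
        self.writes += 1
        self.stored.extend(rows)


def cache_rows(rows: Iterable[dict], cache: CountingCache, batch_size: int) -> int:
    """Write rows to the cache in batches and return the number of writes."""
    for batch in batched(rows, batch_size):
        cache.add(batch)
    return cache.writes
```

Under these assumptions, caching 10,000 rows one row at a time costs 10,000 writes, while a batch size of 2000 costs 5.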

On top of that, I also changed the default serializer from CompressingSerializer to NativeSerializer, which does not do any compression; that also gives us a noticeable performance boost.
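The trade-off between the two serializers can be sketched roughly as follows. This is an analogy in Python, not Flow's PHP code: the class names `NativeLikeSerializer` and `CompressingLikeSerializer` are hypothetical, and `pickle` and `zlib` stand in for whatever the real serializers use internally.

```python
import pickle
import zlib
from typing import Any


class NativeLikeSerializer:
    """Hypothetical serializer that does no compression."""

    def serialize(self, value: Any) -> bytes:
        return pickle.dumps(value)

    def unserialize(self, payload: bytes) -> Any:
        return pickle.loads(payload)


class CompressingLikeSerializer:
    """Hypothetical decorator that compresses another serializer's output."""

    def __init__(self, inner: NativeLikeSerializer) -> None:
        self.inner = inner

    def serialize(self, value: Any) -> bytes:
        # Extra CPU work on every write.
        return zlib.compress(self.inner.serialize(value))

    def unserialize(self, payload: bytes) -> Any:
        # Extra CPU work on every read.
        return self.inner.unserialize(zlib.decompress(payload))
```

Both variants round-trip the same data. The compressing one typically produces smaller payloads at the cost of extra CPU time on every read and write, and skipping that cost is the kind of saving described above.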

Additionally, during the investigation I noticed that the PSRSimpleCache implementation is not the most optimal way of using PSR16Cache. I will create a dedicated issue for that.

Contributor

github-actions bot commented Apr 2, 2024

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak         | mode             | rstdev          |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| AvroExtractorBench    | bench_extract_10k | 1    | 3   | 35.287mb -0.01%  | 848.097ms -1.64% | ±0.65% -64.15%  |
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 5.007mb -0.04%   | 345.182ms +0.21% | ±0.79% +274.66% |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 5.161mb -0.04%   | 1.067s -0.37%    | ±0.28% -69.73%  |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 135.828mb -0.00% | 935.629ms +2.68% | ±0.94% +424.08% |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.917mb -0.04%   | 35.501ms -2.62%  | ±0.57% -66.99%  |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.923mb -0.04%   | 433.433ms -0.27% | ±0.39% +20.29%  |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
Transformers
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 116.228mb -0.00% | 61.715ms +1.62% | ±0.61% -39.79% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev          |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| AvroLoaderBench    | bench_load_10k | 1    | 3   | 96.672mb -0.00%  | 462.610ms +0.61% | ±0.37% +18.73%  |
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 55.148mb -0.00%  | 70.891ms +2.13%  | ±0.44% -76.53%  |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 107.581mb -0.00% | 52.443ms +2.87%  | ±0.27% +16.95%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 226.996mb -0.00% | 1.441s +2.19%    | ±0.56% +899.26% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.964mb -0.01%  | 40.470ms -1.21%  | ±0.17% -85.51%  |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
Building Blocks
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 87.050mb +0.00%  | 3.750ms +10.25%  | ±1.00% -59.52%  |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 102.648mb +0.00% | 192.634ms +0.25% | ±0.96% +12.32%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 85.368mb +0.00%  | 19.266ms -0.87%  | ±1.64% +268.90% |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 88.290mb +0.00%  | 2.310ms +28.20%  | ±3.29% +127.38% |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 88.290mb +0.00%  | 1.999ms +13.79%  | ±1.74% -10.13%  |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 85.402mb +0.00%  | 2.782ms +3.97%   | ±3.31% +58.02%  |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 85.931mb +0.00%  | 17.452ms +6.93%  | ±1.32% -0.74%   |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 85.931mb +0.00%  | 17.139ms +3.80%  | ±0.39% -69.32%  |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 83.835mb +0.00%  | 2.100μs +10.17%  | ±0.00% -100.00% |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 83.835mb +0.00%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 93.185mb +0.00%  | 13.097ms +4.13%  | ±1.21% +559.95% |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 122.556mb +0.00% | 64.426ms +3.32%  | ±0.78% -49.91%  |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 86.451mb +0.00%  | 1.677ms +15.94%  | ±2.59% +30.13%  |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 89.797mb +0.00%  | 68.161ms +6.72%  | ±0.76% +13.39%  |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 88.552mb +0.00%  | 4.128ms +8.32%   | ±0.29% -87.56%  |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 83.913mb +0.00%  | 41.564ms +2.86%  | ±2.02% +23.37%  |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 83.914mb +0.00%  | 41.802ms +2.80%  | ±1.68% -0.12%   |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 83.913mb +0.00%  | 42.898ms +7.59%  | ±1.06% +338.05% |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 86.276mb +0.00%  | 7.594ms +2.00%   | ±3.21% +123.95% |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 83.835mb +0.00%  | 30.367ms +4.03%  | ±2.03% +165.67% |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 83.835mb +0.00%  | 14.342μs +7.93%  | ±2.07% +6.11%   |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 83.835mb +0.00%  | 17.444μs +8.35%  | ±2.12% +39.35%  |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 102.649mb +0.00% | 195.050ms +0.31% | ±0.63% -38.60%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 116.727mb +0.00% | 529.626ms +3.89% | ±1.55% +48.03%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 60.205mb +0.00%  | 259.248ms +0.77% | ±0.31% -20.15%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 15.140mb +0.00%  | 56.686ms +4.58%  | ±0.14% -83.26%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 59.960mb +0.00%  | 440.040ms +1.51% | ±0.08% -57.07%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 14.499mb +0.00%  | 87.494ms -0.22%  | ±0.26% -84.31%  |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+

@norberttech norberttech merged commit 3580718 into flow-php:1.x Apr 2, 2024
17 checks passed
@norberttech norberttech deleted the feature/cache-batch-size branch May 9, 2024 08:55