-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add UniqueFactory for creating random string or int from given range #1128
Conversation
Flow PHP - BenchmarksResults of the benchmarks from this PR are compared with the results from 1.x branch. Extractors+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
| benchmark | subject | revs | its | mem_peak | mode | rstdev |
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
| CSVExtractorBench | bench_extract_10k | 1 | 3 | 3.937mb +0.21% | 510.985ms +0.64% | ±2.89% +482.09% |
| JsonExtractorBench | bench_extract_10k | 1 | 3 | 3.969mb +0.21% | 1.058s -1.14% | ±0.80% +16.37% |
| ParquetExtractorBench | bench_extract_10k | 1 | 3 | 28.530mb +0.03% | 422.100ms -3.32% | ±0.57% +11.36% |
| TextExtractorBench | bench_extract_10k | 1 | 3 | 3.696mb +0.22% | 34.270ms +1.36% | ±2.14% +137.70% |
| XmlExtractorBench | bench_extract_10k | 1 | 3 | 3.643mb +0.22% | 430.277ms -0.67% | ±0.29% -67.77% |
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
Transformers+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark | subject | revs | its | mem_peak | mode | rstdev |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1 | 3 | 116.038mb +0.01% | 59.390ms -0.97% | ±0.37% -78.93% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders+--------------------+----------------+------+-----+------------------+-----------------+-----------------+
| benchmark | subject | revs | its | mem_peak | mode | rstdev |
+--------------------+----------------+------+-----+------------------+-----------------+-----------------+
| CSVLoaderBench | bench_load_10k | 1 | 3 | 54.159mb +0.02% | 83.817ms -4.08% | ±3.62% +59.10% |
| JsonLoaderBench | bench_load_10k | 1 | 3 | 106.526mb +0.01% | 51.253ms -2.13% | ±0.54% +2.92% |
| ParquetLoaderBench | bench_load_10k | 1 | 3 | 123.808mb +0.01% | 1.230s -0.32% | ±0.61% -79.90% |
| TextLoaderBench | bench_load_10k | 1 | 3 | 16.953mb +0.05% | 43.739ms -2.65% | ±1.10% +452.90% |
+--------------------+----------------+------+-----+------------------+-----------------+-----------------+
Building Blocks+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark | subject | revs | its | mem_peak | mode | rstdev |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| TypeDetectorBench | bench_type_detector | 1 | 3 | 59.711mb +0.01% | 426.753ms -0.44% | ±0.28% -50.46% |
| TypeDetectorBench | bench_type_detector | 1 | 3 | 14.250mb +0.03% | 85.273ms +0.65% | ±1.05% +64.91% |
| NativeEntryFactoryBench | bench_entry_factory | 1 | 3 | 116.529mb +0.00% | 498.090ms -5.32% | ±0.91% -70.48% |
| NativeEntryFactoryBench | bench_entry_factory | 1 | 3 | 60.007mb +0.01% | 248.786ms -1.16% | ±2.15% +466.60% |
| NativeEntryFactoryBench | bench_entry_factory | 1 | 3 | 14.941mb +0.03% | 54.709ms +0.60% | ±1.50% +859.45% |
| RowsBench | bench_chunk_10_on_10k | 2 | 3 | 86.800mb +0.00% | 3.226ms -6.76% | ±3.19% +27.24% |
| RowsBench | bench_diff_left_1k_on_10k | 2 | 3 | 102.397mb +0.00% | 190.432ms +0.20% | ±1.01% -0.81% |
| RowsBench | bench_diff_right_1k_on_10k | 2 | 3 | 85.117mb +0.00% | 19.307ms +3.46% | ±1.69% +8.97% |
| RowsBench | bench_drop_1k_on_10k | 2 | 3 | 88.040mb +0.00% | 1.578ms -4.87% | ±1.15% -63.34% |
| RowsBench | bench_drop_right_1k_on_10k | 2 | 3 | 88.040mb +0.00% | 1.635ms -9.12% | ±1.52% +60.95% |
| RowsBench | bench_entries_on_10k | 2 | 3 | 85.152mb +0.00% | 2.497ms -4.79% | ±0.43% -50.92% |
| RowsBench | bench_filter_on_10k | 2 | 3 | 85.681mb +0.00% | 14.853ms -0.19% | ±2.46% +121.71% |
| RowsBench | bench_find_on_10k | 2 | 3 | 85.681mb +0.00% | 14.606ms -2.27% | ±1.11% -4.98% |
| RowsBench | bench_find_one_on_10k | 10 | 3 | 83.584mb +0.00% | 1.606μs -11.07% | ±2.89% +12.24% |
| RowsBench | bench_first_on_10k | 10 | 3 | 83.584mb +0.00% | 0.300μs -25.00% | ±0.00% -100.00% |
| RowsBench | bench_flat_map_on_1k | 2 | 3 | 92.935mb +0.00% | 12.098ms -0.69% | ±1.19% +186.43% |
| RowsBench | bench_map_on_10k | 2 | 3 | 122.305mb +0.00% | 60.230ms -2.24% | ±0.35% -79.31% |
| RowsBench | bench_merge_1k_on_10k | 2 | 3 | 86.200mb +0.00% | 1.231ms +0.60% | ±1.89% +6.74% |
| RowsBench | bench_partition_by_on_10k | 2 | 3 | 89.546mb +0.00% | 62.001ms -4.43% | ±1.21% +92.18% |
| RowsBench | bench_remove_on_10k | 2 | 3 | 88.302mb +0.00% | 3.825ms -1.45% | ±0.64% -59.05% |
| RowsBench | bench_sort_asc_on_1k | 2 | 3 | 83.728mb +0.00% | 39.437ms -3.62% | ±0.53% -74.37% |
| RowsBench | bench_sort_by_on_1k | 2 | 3 | 83.729mb +0.00% | 38.985ms -2.36% | ±1.76% +57.43% |
| RowsBench | bench_sort_desc_on_1k | 2 | 3 | 83.728mb +0.00% | 40.850ms -0.26% | ±1.75% +31.82% |
| RowsBench | bench_sort_entries_on_1k | 2 | 3 | 86.026mb +0.00% | 7.344ms -0.05% | ±0.58% -44.39% |
| RowsBench | bench_sort_on_1k | 2 | 3 | 83.584mb +0.00% | 29.063ms -1.90% | ±0.55% -56.78% |
| RowsBench | bench_take_1k_on_10k | 10 | 3 | 83.584mb +0.00% | 13.288μs -1.37% | ±1.95% -43.91% |
| RowsBench | bench_take_right_1k_on_10k | 10 | 3 | 83.584mb +0.00% | 16.010μs +0.44% | ±2.76% -12.17% |
| RowsBench | bench_unique_on_1k | 2 | 3 | 102.399mb +0.00% | 194.288ms +0.24% | ±0.40% -49.17% |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
|
@@ -163,13 +163,13 @@ public function test_partition_by_partitions_order() : void | |||
function (int $i) : array { | |||
$data = []; | |||
|
|||
$maxItems = \random_int(2, 10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More I look at this, more I think I would add new function for unique int/string to "hide" object call & improve DX perspective, WDYT?
The PR look good and we could merge it like that (beside few small things @stloyd pointed out) but I think we might be able to still improve even more. Here are my thoughts. Naming - now when I look at it something is missing in the name, like Dependency Injection - I think I would turn Scalar Functions - Additionally (can go to a standalone PR) we could also create Flow scalar functions that could be used during data processing, like: df()
->read(...)
->withEntry('random_string', random_string(length: 16))
->write(to_output())
->run()
df()
->read(...)
->withEntry('random_string', random_string(length: ref('some_int_column'))))
->write(to_output())
->run() This function would implement DSL Functions - as pointed by @stloyd we probably should also add Thoughts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some changes described above are required to merge this PR :)
Co-authored-by: Joseph Bielawski <[email protected]>
Change Log
Added
Fixed
Changed
Removed
Deprecated
Security
Description
Solves #1124