[improve](agg)support push down min/max on unique table #29242

Merged 8 commits on Jan 2, 2024

Conversation

zhangstar333
Contributor

Proposed changes

Add a session variable, enable_pushdown_minmax_on_unique (default: false),
which controls whether min/max is pushed down to unique tables.
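
The description does not say why the switch defaults to false. Purely as a toy illustration, and assuming a unique table keeps only the most recently written row per key (last write wins), the sketch below shows that min/max computed over every stored row version can differ from min/max over the merged result. That is one reason a pushdown like this might need a guard. The data and the merged_rows helper are hypothetical and are not Doris code.

```python
def merged_rows(versions):
    """Keep only the last written value per key (last write wins)."""
    latest = {}
    for key, value in versions:
        latest[key] = value  # a later write replaces the earlier one
    return latest


# Hypothetical write history: key 2 was written as 50, then overwritten with 20.
versions = [(1, 10), (2, 50), (2, 20)]

# Max over every stored version, ignoring the unique-key merge.
raw_max = max(value for _, value in versions)

# Max over what a query on the merged table should actually see.
merged_max = max(merged_rows(versions).values())

print("max over raw versions:", raw_max)
print("max over merged rows:", merged_max)
```

In this toy model the two results disagree because the overwritten value 50 is still counted in the raw scan.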

Issue Number: close #xxx

@zhangstar333
Contributor Author

run buildall

@xiaokang added the usercase (Important user case type label) and dev/2.0.4 labels on Dec 28, 2023
@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 0fb732e05dc99bc67f7859997f90b420b0b5e007, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4990	4647	4619	4619
q2	364	149	140	140
q3	1468	1247	1159	1159
q4	1131	933	881	881
q5	3149	3137	3144	3137
q6	254	125	126	125
q7	1010	500	485	485
q8	2271	2264	2237	2237
q9	6735	6679	6694	6679
q10	3160	3268	3264	3264
q11	333	214	201	201
q12	350	208	214	208
q13	4162	3440	3428	3428
q14	245	211	210	210
q15	572	522	531	522
q16	442	392	386	386
q17	1048	821	661	661
q18	7015	6713	6793	6713
q19	1629	1636	1633	1633
q20	527	314	327	314
q21	3164	2700	2731	2700
q22	372	308	315	308
Total cold run time: 44391 ms
Total hot run time: 40010 ms
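
The bot does not label the columns. As a sketch under the assumption that each row is query, cold run, two hot runs, and the better of the two hot runs (all in ms), the following check recomputes both totals from the rows above. The column names and the summarize helper are illustrative, not part of the bot's tooling.

```python
ROWS = """\
q1 4990 4647 4619 4619
q2 364 149 140 140
q3 1468 1247 1159 1159
q4 1131 933 881 881
q5 3149 3137 3144 3137
q6 254 125 126 125
q7 1010 500 485 485
q8 2271 2264 2237 2237
q9 6735 6679 6694 6679
q10 3160 3268 3264 3264
q11 333 214 201 201
q12 350 208 214 208
q13 4162 3440 3428 3428
q14 245 211 210 210
q15 572 522 531 522
q16 442 392 386 386
q17 1048 821 661 661
q18 7015 6713 6793 6713
q19 1629 1636 1633 1633
q20 527 314 327 314
q21 3164 2700 2731 2700
q22 372 308 315 308
"""


def summarize(text):
    """Return (total cold ms, total best-hot ms) for a block of result rows."""
    cold_total = 0
    hot_total = 0
    for line in text.splitlines():
        if not line.strip():
            continue
        name, *numbers = line.split()
        cold, hot1, hot2, best = map(int, numbers)
        # Assumed column meaning: the last column is the faster hot run.
        assert best == min(hot1, hot2), name
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


cold_total, hot_total = summarize(ROWS)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under that reading, the sums reproduce the reported 44391 ms cold and 40010 ms hot totals.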

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4664	4567	4564	4564
q2	273	167	172	167
q3	3392	3373	3363	3363
q4	2226	2203	2199	2199
q5	5726	5732	5741	5732
q6	242	120	120	120
q7	2373	1867	1855	1855
q8	3610	3610	3610	3610
q9	9009	8986	8991	8986
q10	3799	3901	3910	3901
q11	489	369	349	349
q12	772	598	600	598
q13	3908	3196	3199	3196
q14	290	249	261	249
q15	563	521	524	521
q16	496	479	461	461
q17	1968	1946	1958	1946
q18	8673	8211	8146	8146
q19	1758	1752	1760	1752
q20	2253	1925	1940	1925
q21	6176	5772	5793	5772
q22	547	468	451	451
Total cold run time: 63207 ms
Total hot run time: 59863 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.7 seconds
stream load tsv: 577 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.1 seconds inserted 10000000 Rows, about 355K ops/s
storage size: 17184031990 Bytes
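
The throughput figures can be rechecked from the byte counts and durations. The reported values are consistent with "MB" meaning 2^20 bytes and with rounding down. That is an inference from the numbers, not something the bot states, and the helper names below are made up for illustration.

```python
def mib_per_second(num_bytes, seconds):
    """Load throughput in MiB/s, truncated to a whole number."""
    return int(num_bytes / seconds / 2**20)


def k_rows_per_second(rows, seconds):
    """Insert rate in thousands of rows per second, truncated."""
    return int(rows / seconds / 1000)


# (format, bytes loaded, seconds) taken from the report above.
loads = [
    ("tsv", 74807831229, 577),
    ("json", 2358488459, 19),
    ("orc", 1101869774, 66),
    ("parquet", 861443392, 32),
]

for fmt, num_bytes, seconds in loads:
    print(f"stream load {fmt}: about {mib_per_second(num_bytes, seconds)} MB/s")

print(f"insert into select: about {k_rows_per_second(10_000_000, 28.1)}K ops/s")
```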

@morrySnow
Contributor

Could we also support agg tables?

@zhangstar333
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 635616ee6908404a397fb94aa6e41ea4ed8721fc, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5479	5093	5130	5093
q2	407	165	168	165
q3	1492	1256	1128	1128
q4	1106	859	811	811
q5	3145	3143	3061	3061
q6	233	142	141	141
q7	957	579	537	537
q8	2177	2301	2239	2239
q9	6764	6715	6645	6645
q10	3220	3167	3183	3167
q11	349	227	240	227
q12	390	247	246	246
q13	4409	3699	3652	3652
q14	255	232	219	219
q15	635	564	570	564
q16	453	406	404	404
q17	1050	584	643	584
q18	7088	6735	6800	6735
q19	1638	1487	1600	1487
q20	1278	347	357	347
q21	2932	2483	2535	2483
q22	398	335	324	324
Total cold run time: 45855 ms
Total hot run time: 40259 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5091	5090	5104	5090
q2	342	231	259	231
q3	3386	3325	3312	3312
q4	2169	1995	1989	1989
q5	5969	5951	5935	5935
q6	236	138	138	138
q7	2399	1956	1964	1956
q8	3544	3661	3712	3661
q9	9051	9010	9008	9008
q10	3882	3940	3947	3940
q11	581	487	512	487
q12	816	675	638	638
q13	3904	3225	3174	3174
q14	318	276	278	276
q15	608	573	564	564
q16	555	503	515	503
q17	2046	1830	1858	1830
q18	8772	8342	8720	8342
q19	1782	1740	1699	1699
q20	2306	2001	1989	1989
q21	5672	5382	5370	5370
q22	551	508	483	483
Total cold run time: 63980 ms
Total hot run time: 60615 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.2 seconds
stream load tsv: 579 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17183911691 Bytes

@zhangstar333
Contributor Author

run buildall

@zhangstar333
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.73 seconds
stream load tsv: 578 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 27.7 seconds inserted 10000000 Rows, about 361K ops/s
storage size: 17184062250 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit c88a81bce2fc93d43eebb88e5a5dc41c6deffb6d, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5455	5162	5105	5105
q2	395	161	158	158
q3	1473	1174	1197	1174
q4	1111	830	790	790
q5	3127	3174	3171	3171
q6	231	148	137	137
q7	955	588	526	526
q8	2159	2310	2235	2235
q9	6727	6654	6676	6654
q10	3213	3180	3129	3129
q11	348	228	229	228
q12	397	247	249	247
q13	4423	3662	3652	3652
q14	258	217	228	217
q15	625	576	580	576
q16	456	406	413	406
q17	1053	605	558	558
q18	7082	6728	6800	6728
q19	1657	1573	1507	1507
q20	614	361	345	345
q21	2890	2499	2478	2478
q22	404	334	321	321
Total cold run time: 45053 ms
Total hot run time: 40342 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5092	5042	5120	5042
q2	339	236	247	236
q3	3370	3326	3283	3283
q4	2146	2026	2007	2007
q5	5960	5922	5910	5910
q6	237	135	134	134
q7	2397	1972	1958	1958
q8	3571	3652	3692	3652
q9	9115	9037	9080	9037
q10	3887	3922	3914	3914
q11	585	475	480	475
q12	809	640	645	640
q13	3892	3191	3173	3173
q14	296	260	269	260
q15	608	569	558	558
q16	571	508	521	508
q17	2018	1847	1783	1783
q18	8768	8352	8354	8352
q19	1769	1695	1692	1692
q20	2294	2014	1988	1988
q21	5780	5369	5358	5358
q22	560	506	469	469
Total cold run time: 64064 ms
Total hot run time: 60429 ms

yiguolei
yiguolei previously approved these changes Dec 29, 2023
Contributor

@yiguolei yiguolei left a comment

LGTM

@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) on Dec 29, 2023
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@zhangstar333
Contributor Author

run buildall

@github-actions bot removed the approved label (Indicates a PR has been approved by one committer.) on Dec 29, 2023
@zhangstar333
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit c548ee1501a6ed9f64188a7d8189c6237f0ff9cd, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5513	5207	5162	5162
q2	405	195	158	158
q3	1461	1127	1215	1127
q4	1091	845	804	804
q5	3163	3087	3139	3087
q6	230	151	138	138
q7	989	574	529	529
q8	2182	2288	2236	2236
q9	6722	6682	6650	6650
q10	3184	3150	3138	3138
q11	335	233	215	215
q12	388	241	241	241
q13	4411	3627	3620	3620
q14	257	220	219	219
q15	628	568	587	568
q16	460	407	396	396
q17	1056	519	525	519
q18	7145	6781	6779	6779
q19	1646	1500	1529	1500
q20	551	367	355	355
q21	2918	2474	2492	2474
q22	402	328	323	323
Total cold run time: 45137 ms
Total hot run time: 40238 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5144	5147	5125	5125
q2	350	265	236	236
q3	3381	3332	3338	3332
q4	2158	1987	1989	1987
q5	5940	5901	5928	5901
q6	236	133	133	133
q7	2393	1930	1931	1930
q8	3565	3668	3666	3666
q9	9008	8930	8962	8930
q10	3878	3908	3905	3905
q11	587	495	491	491
q12	793	650	675	650
q13	3872	3158	3168	3158
q14	309	282	274	274
q15	616	554	561	554
q16	558	512	572	512
q17	2015	1844	1793	1793
q18	8892	8405	8277	8277
q19	1758	1751	1745	1745
q20	2287	2019	1970	1970
q21	5765	5407	5326	5326
q22	563	517	504	504
Total cold run time: 64068 ms
Total hot run time: 60399 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.52 seconds
stream load tsv: 564 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.3 seconds inserted 10000000 Rows, about 353K ops/s
storage size: 17187840068 Bytes

@zhangstar333
Contributor Author

run buildall

@zhangstar333
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit a2045c0ec3583228d59923de6b118d260f04ed9e, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5445	5148	5151	5148
q2	405	170	157	157
q3	1467	1198	1287	1198
q4	1094	851	775	775
q5	3138	3159	3115	3115
q6	231	141	137	137
q7	982	577	529	529
q8	2141	2232	2238	2232
q9	6704	6618	6666	6618
q10	3202	3107	3111	3107
q11	337	225	210	210
q12	386	243	237	237
q13	4433	3654	3658	3654
q14	255	216	219	216
q15	600	539	547	539
q16	459	394	411	394
q17	1046	613	506	506
q18	7061	6794	6744	6744
q19	1643	1489	1465	1465
q20	602	350	366	350
q21	2932	2416	2528	2416
q22	397	327	321	321
Total cold run time: 44960 ms
Total hot run time: 40068 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5057	5135	5042	5042
q2	344	253	246	246
q3	3356	3280	3290	3280
q4	2134	2017	1965	1965
q5	5936	5915	5944	5915
q6	231	131	131	131
q7	2400	1894	1963	1894
q8	3585	3658	3656	3656
q9	9050	8994	8966	8966
q10	3871	3998	3922	3922
q11	578	471	483	471
q12	807	646	662	646
q13	3868	3224	3209	3209
q14	295	267	267	267
q15	602	547	541	541
q16	554	523	527	523
q17	2057	1791	1824	1791
q18	8858	8343	8397	8343
q19	1737	1746	1721	1721
q20	2298	2008	1978	1978
q21	5811	5401	5343	5343
q22	557	522	538	522
Total cold run time: 63986 ms
Total hot run time: 60372 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.44 seconds
stream load tsv: 574 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.2 seconds inserted 10000000 Rows, about 354K ops/s
storage size: 17184130610 Bytes

@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) on Jan 2, 2024
Contributor

github-actions bot commented Jan 2, 2024

PR approved by at least one committer and no changes requested.

@yiguolei yiguolei merged commit af39217 into apache:master Jan 2, 2024
26 of 28 checks passed
seawinde pushed a commit to seawinde/doris that referenced this pull request Jan 3, 2024
zhangstar333 added a commit to zhangstar333/incubator-doris that referenced this pull request Jan 4, 2024
yiguolei pushed a commit that referenced this pull request Jan 4, 2024
…29507)

* [improve](agg)support push down min/max on unique table (#29242)

* update
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024
Labels: approved, dev/2.0.4-merged, dev/3.0.0-merged, reviewed, usercase
6 participants