[feature](nereids) in predicate extract non constant expressions #46794

Merged 4 commits on Jan 13, 2025

yujun777
Collaborator

@yujun777 yujun777 commented Jan 10, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

If an IN predicate contains non-constant options, processing it on the backend reduces performance, so the non-constant options need to be extracted from the IN predicate.

This PR adds an expression rewrite rule, InPredicateExtractNonConstant, which extracts every non-constant option out of the IN predicate and turns it into a separate equality comparison. Constant options, including constant expressions such as 3 + 3, stay in the IN list. For example:

k1 in (k2, k3 + 3, 1, 2, 3 + 3)  =>  k1 in (1, 2, 3 + 3) or k1 = k2 or k1 = k3 + 3
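The rewrite can be illustrated with a small standalone sketch. This is not the Nereids implementation (which is Java and works on Doris expression trees). The `Literal`, `Column`, and `Add` classes and the `rewrite_in_predicate` function are hypothetical names made up for this illustration, and the sketch assumes an option counts as constant when it references no column.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, Column, Add]


def is_constant(expr: Expr) -> bool:
    """Assumption: an expression is constant if it references no column."""
    if isinstance(expr, Literal):
        return True
    if isinstance(expr, Column):
        return False
    return is_constant(expr.left) and is_constant(expr.right)


def show(expr: Expr) -> str:
    """Render an expression as SQL-like text."""
    if isinstance(expr, Literal):
        return str(expr.value)
    if isinstance(expr, Column):
        return expr.name
    return f"{show(expr.left)} + {show(expr.right)}"


def rewrite_in_predicate(compare: Expr, options: List[Expr]) -> str:
    """Keep constant options in the IN list; turn the rest into equalities."""
    constants = [o for o in options if is_constant(o)]
    non_constants = [o for o in options if not is_constant(o)]
    parts = []
    if constants:
        in_list = ", ".join(show(c) for c in constants)
        parts.append(f"{show(compare)} in ({in_list})")
    parts.extend(f"{show(compare)} = {show(n)}" for n in non_constants)
    return " or ".join(parts)


k1, k2, k3 = Column("k1"), Column("k2"), Column("k3")
options = [k2, Add(k3, Literal(3)), Literal(1), Literal(2), Add(Literal(3), Literal(3))]
print(rewrite_in_predicate(k1, options))
```

Because SQL defines IN as a disjunction of equality comparisons, splitting the list this way should not change the result of the predicate, including its behavior with NULL values.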

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Jan 10, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777 yujun777 marked this pull request as ready for review January 10, 2025 11:50
@yujun777
Collaborator Author

run buildall

@yujun777
Collaborator Author

run buildall

@yujun777
Collaborator Author

run performance

@doris-robot

TPC-H: Total hot run time: 32331 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 497c48e6d0f726dd3e6e98abd87fb6b069bf2685, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17584	6026	6009	6009
q2	2053	295	167	167
q3	10528	1232	694	694
q4	10213	849	423	423
q5	7616	2127	1973	1973
q6	204	177	144	144
q7	876	738	607	607
q8	9219	1304	1129	1129
q9	5121	4871	4932	4871
q10	6737	2301	1867	1867
q11	481	276	255	255
q12	334	352	223	223
q13	17763	3704	3080	3080
q14	240	243	217	217
q15	564	517	498	498
q16	635	616	572	572
q17	553	844	327	327
q18	7388	6304	6504	6304
q19	2736	948	537	537
q20	304	306	180	180
q21	2804	2165	1947	1947
q22	364	329	307	307
Total cold run time: 104317 ms
Total hot run time: 32331 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	6384	6258	6204	6204
q2	251	327	230	230
q3	2242	2645	2314	2314
q4	1364	1773	1322	1322
q5	4312	4703	4798	4703
q6	191	179	142	142
q7	2029	1972	1813	1813
q8	2569	2740	2673	2673
q9	7258	7285	7141	7141
q10	2997	3314	2798	2798
q11	592	510	497	497
q12	682	787	647	647
q13	3352	3842	3196	3196
q14	272	296	267	267
q15	565	512	510	510
q16	642	702	628	628
q17	1194	1735	1232	1232
q18	7649	7519	7250	7250
q19	719	1111	986	986
q20	1949	1955	1812	1812
q21	5460	4916	4762	4762
q22	632	619	554	554
Total cold run time: 53305 ms
Total hot run time: 51681 ms
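
The totals in the Round 1 table of this report can be reproduced from the per-query rows. The sketch below assumes the columns are the cold run, two hot runs, and the faster of the two hot runs. The bot output does not label them, so that reading is an inference from the reported totals. The `totals` helper is a hypothetical name used only here.

```python
# Round 1 rows from the report above: (query, cold, hot run 1, hot run 2), in ms.
# The column meaning is an assumption inferred from the reported totals.
ROUND1 = [
    ("q1", 17584, 6026, 6009),
    ("q2", 2053, 295, 167),
    ("q3", 10528, 1232, 694),
    ("q4", 10213, 849, 423),
    ("q5", 7616, 2127, 1973),
    ("q6", 204, 177, 144),
    ("q7", 876, 738, 607),
    ("q8", 9219, 1304, 1129),
    ("q9", 5121, 4871, 4932),
    ("q10", 6737, 2301, 1867),
    ("q11", 481, 276, 255),
    ("q12", 334, 352, 223),
    ("q13", 17763, 3704, 3080),
    ("q14", 240, 243, 217),
    ("q15", 564, 517, 498),
    ("q16", 635, 616, 572),
    ("q17", 553, 844, 327),
    ("q18", 7388, 6304, 6504),
    ("q19", 2736, 948, 537),
    ("q20", 304, 306, 180),
    ("q21", 2804, 2165, 1947),
    ("q22", 364, 329, 307),
]


def totals(rows):
    """Return (total cold time, total of the faster hot run per query)."""
    cold = sum(row[1] for row in rows)
    hot = sum(min(row[2], row[3]) for row in rows)
    return cold, hot


cold_total, hot_total = totals(ROUND1)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under that reading, the hot total counts only the faster of the two hot runs for each query, which is why it is much lower than the sum of either hot column taken with the cold column.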

@doris-robot

TPC-DS: Total hot run time: 187843 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 497c48e6d0f726dd3e6e98abd87fb6b069bf2685, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	971	375	377	375
query2	6517	2239	2337	2239
query3	6705	210	212	210
query4	33524	23808	23830	23808
query5	4373	609	477	477
query6	286	186	178	178
query7	4628	484	301	301
query8	285	240	218	218
query9	9556	2665	2651	2651
query10	467	306	255	255
query11	18319	15222	14958	14958
query12	150	110	104	104
query13	1639	505	367	367
query14	10668	7163	6393	6393
query15	220	197	185	185
query16	8028	608	429	429
query17	1585	704	533	533
query18	2073	388	306	306
query19	227	173	150	150
query20	126	112	107	107
query21	208	124	102	102
query22	4493	4218	4311	4218
query23	33878	33112	33273	33112
query24	6598	2281	2195	2195
query25	453	435	379	379
query26	1212	269	155	155
query27	2007	451	324	324
query28	5204	2420	2368	2368
query29	661	536	411	411
query30	230	183	153	153
query31	945	860	769	769
query32	79	60	64	60
query33	508	353	294	294
query34	731	849	509	509
query35	790	792	714	714
query36	999	1024	940	940
query37	131	99	75	75
query38	3986	4034	4044	4034
query39	1475	1412	1416	1412
query40	202	118	105	105
query41	57	56	59	56
query42	123	106	115	106
query43	501	518	482	482
query44	1418	804	798	798
query45	178	177	169	169
query46	850	1027	629	629
query47	1856	1846	1776	1776
query48	376	407	315	315
query49	803	537	389	389
query50	612	641	384	384
query51	6781	7005	6825	6825
query52	106	102	88	88
query53	219	242	189	189
query54	469	477	408	408
query55	79	81	83	81
query56	254	265	242	242
query57	1191	1203	1109	1109
query58	246	232	234	232
query59	2910	3076	3101	3076
query60	288	308	253	253
query61	130	114	125	114
query62	825	771	723	723
query63	226	191	185	185
query64	4012	1016	623	623
query65	3232	3133	3172	3133
query66	1083	411	319	319
query67	15973	15785	15442	15442
query68	8127	737	522	522
query69	462	289	268	268
query70	1202	1125	1111	1111
query71	445	279	252	252
query72	6132	3888	3868	3868
query73	654	739	361	361
query74	10057	8710	8784	8710
query75	4300	3134	2606	2606
query76	3797	1184	764	764
query77	781	365	282	282
query78	9907	10007	9296	9296
query79	2904	784	595	595
query80	625	529	505	505
query81	476	285	238	238
query82	652	151	123	123
query83	181	173	149	149
query84	251	87	76	76
query85	767	353	289	289
query86	360	316	292	292
query87	4348	4589	4401	4401
query88	4127	2128	2125	2125
query89	395	328	282	282
query90	1912	185	186	185
query91	137	142	108	108
query92	70	56	53	53
query93	1008	904	530	530
query94	653	403	288	288
query95	337	265	245	245
query96	476	595	275	275
query97	2794	2908	2767	2767
query98	217	204	194	194
query99	1594	1476	1364	1364
Total cold run time: 290083 ms
Total hot run time: 187843 ms

@doris-robot

ClickBench: Total hot run time: 31.61 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 497c48e6d0f726dd3e6e98abd87fb6b069bf2685, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.06
query2	0.07	0.04	0.03
query3	0.24	0.07	0.08
query4	1.61	0.11	0.11
query5	0.40	0.41	0.42
query6	1.16	0.65	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.56	0.52	0.51
query10	0.57	0.57	0.56
query11	0.15	0.10	0.09
query12	0.14	0.11	0.11
query13	0.61	0.62	0.61
query14	2.85	2.83	2.77
query15	0.89	0.82	0.84
query16	0.39	0.36	0.38
query17	1.05	0.98	0.98
query18	0.24	0.21	0.21
query19	1.96	1.79	1.97
query20	0.01	0.02	0.01
query21	15.35	0.91	0.56
query22	0.76	0.81	0.72
query23	15.27	1.35	0.60
query24	3.05	1.24	1.74
query25	0.15	0.13	0.11
query26	0.41	0.16	0.13
query27	0.08	0.06	0.05
query28	13.80	1.49	1.04
query29	12.57	3.98	3.25
query30	0.25	0.09	0.07
query31	2.82	0.59	0.38
query32	3.22	0.55	0.46
query33	3.13	3.22	3.01
query34	16.71	5.16	4.56
query35	4.53	4.47	4.51
query36	0.63	0.49	0.48
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.03
query40	0.16	0.13	0.14
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.03	0.04	0.03
Total cold run time: 106.2 s
Total hot run time: 31.61 s

@yujun777
Collaborator Author

run buildall

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 13, 2025
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 32356 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9a9124fe0cfc5500cbc5fbc27ef093da414cde32, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17576	6129	6037	6037
q2	2064	295	168	168
q3	10891	1210	765	765
q4	10317	911	464	464
q5	7705	2185	1984	1984
q6	208	179	150	150
q7	918	777	611	611
q8	9228	1333	1161	1161
q9	5094	4804	4843	4804
q10	6746	2260	1832	1832
q11	504	291	259	259
q12	337	372	219	219
q13	17776	3604	3035	3035
q14	244	233	211	211
q15	566	505	490	490
q16	626	612	581	581
q17	576	845	337	337
q18	6835	6397	6179	6179
q19	1717	938	552	552
q20	324	326	192	192
q21	3000	2180	2004	2004
q22	371	339	321	321
Total cold run time: 103623 ms
Total hot run time: 32356 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	6248	6945	6236	6236
q2	237	325	232	232
q3	2197	2645	2283	2283
q4	1428	1826	1384	1384
q5	4282	4742	4802	4742
q6	185	177	147	147
q7	2059	1975	1775	1775
q8	2563	2781	2717	2717
q9	7259	7200	7219	7200
q10	3020	3330	2694	2694
q11	573	543	505	505
q12	690	744	599	599
q13	3447	4046	3240	3240
q14	293	318	281	281
q15	562	517	504	504
q16	668	669	651	651
q17	1244	1745	1258	1258
q18	7696	7531	7345	7345
q19	811	1183	1033	1033
q20	1979	2040	1925	1925
q21	5713	5105	5077	5077
q22	629	613	613	613
Total cold run time: 53783 ms
Total hot run time: 52441 ms

@doris-robot

TPC-DS: Total hot run time: 195278 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9a9124fe0cfc5500cbc5fbc27ef093da414cde32, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1305	957	911	911
query2	6381	2332	2264	2264
query3	10984	4714	4674	4674
query4	33170	23316	23570	23316
query5	4531	623	461	461
query6	289	216	198	198
query7	3991	490	320	320
query8	311	273	243	243
query9	9273	2783	2773	2773
query10	482	307	253	253
query11	18459	15292	15167	15167
query12	158	114	108	108
query13	1736	528	414	414
query14	11841	7462	7273	7273
query15	223	213	181	181
query16	7644	602	497	497
query17	1583	765	622	622
query18	1654	399	321	321
query19	223	198	170	170
query20	120	115	118	115
query21	206	141	102	102
query22	4538	4645	4430	4430
query23	33928	32922	33384	32922
query24	6032	2362	2325	2325
query25	466	459	408	408
query26	712	273	160	160
query27	1951	467	326	326
query28	5265	2522	2483	2483
query29	555	561	418	418
query30	227	189	154	154
query31	967	875	830	830
query32	72	60	97	60
query33	504	351	313	313
query34	764	871	510	510
query35	802	830	756	756
query36	1039	1015	945	945
query37	144	99	80	80
query38	4039	4090	4169	4090
query39	1532	1487	1454	1454
query40	210	119	106	106
query41	52	53	50	50
query42	120	107	114	107
query43	534	523	489	489
query44	1364	834	835	834
query45	181	174	164	164
query46	886	1053	656	656
query47	1896	1936	1831	1831
query48	387	413	318	318
query49	714	512	416	416
query50	667	661	387	387
query51	7062	7153	7012	7012
query52	109	102	89	89
query53	225	252	177	177
query54	497	521	425	425
query55	82	85	83	83
query56	258	256	261	256
query57	1176	1195	1123	1123
query58	251	246	238	238
query59	3211	3330	3162	3162
query60	276	276	256	256
query61	120	118	116	116
query62	854	788	712	712
query63	229	195	198	195
query64	2823	1055	670	670
query65	3328	3352	3267	3267
query66	1004	413	309	309
query67	16217	15836	15374	15374
query68	8322	703	511	511
query69	485	294	256	256
query70	1224	1155	1145	1145
query71	447	284	312	284
query72	6499	3840	3837	3837
query73	678	749	365	365
query74	10612	8830	8854	8830
query75	4579	3126	2640	2640
query76	4101	1165	790	790
query77	787	365	289	289
query78	10013	9970	9342	9342
query79	3797	890	584	584
query80	738	514	453	453
query81	496	270	233	233
query82	624	148	116	116
query83	190	183	156	156
query84	288	90	68	68
query85	773	371	315	315
query86	351	324	350	324
query87	4477	4477	4319	4319
query88	4219	2218	2169	2169
query89	410	320	286	286
query90	1906	189	187	187
query91	146	146	108	108
query92	68	55	52	52
query93	1774	858	521	521
query94	669	383	290	290
query95	330	264	249	249
query96	493	602	286	286
query97	2894	2940	2832	2832
query98	220	200	194	194
query99	1697	1503	1378	1378
Total cold run time: 297141 ms
Total hot run time: 195278 ms

@doris-robot

ClickBench: Total hot run time: 31.44 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9a9124fe0cfc5500cbc5fbc27ef093da414cde32, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.07	0.04	0.03
query3	0.23	0.07	0.07
query4	1.61	0.11	0.10
query5	0.43	0.41	0.42
query6	1.16	0.63	0.64
query7	0.02	0.01	0.01
query8	0.04	0.03	0.03
query9	0.59	0.50	0.50
query10	0.56	0.57	0.55
query11	0.14	0.10	0.11
query12	0.14	0.11	0.11
query13	0.59	0.60	0.59
query14	2.72	2.75	2.74
query15	0.89	0.83	0.82
query16	0.39	0.38	0.39
query17	1.08	1.01	1.04
query18	0.23	0.20	0.21
query19	1.96	1.87	1.98
query20	0.01	0.01	0.02
query21	15.35	0.93	0.57
query22	0.75	0.72	0.57
query23	15.50	1.38	0.54
query24	3.26	1.49	1.36
query25	0.19	0.24	0.17
query26	0.24	0.13	0.14
query27	0.08	0.05	0.05
query28	14.43	1.46	1.04
query29	12.55	3.99	3.23
query30	0.25	0.09	0.07
query31	2.82	0.61	0.38
query32	3.27	0.58	0.46
query33	3.23	3.02	3.10
query34	16.86	5.10	4.43
query35	4.46	4.44	4.53
query36	0.64	0.48	0.48
query37	0.09	0.07	0.05
query38	0.05	0.04	0.03
query39	0.04	0.02	0.02
query40	0.17	0.14	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 107.28 s
Total hot run time: 31.44 s

@yujun777
Collaborator Author

run external

@starocean999 starocean999 merged commit 565edd9 into apache:master Jan 13, 2025
25 of 26 checks passed
Labels
approved Indicates a PR has been approved by one committer. reviewed
5 participants