[Bug](mark-join) fix wrong result on mark join + other conjunct #29321

Merged: 5 commits into apache:master from fix_1229, Jan 4, 2024

Contributor

@BiteTheDDDDt BiteTheDDDDt commented Dec 29, 2023

Proposed changes

Fix the wrong result returned when a mark join also has other join conjuncts.

Do not merge to 2.0.
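As background for the fix named above, here is a minimal Python sketch of mark-join semantics. It is not the Doris implementation. It assumes the usual definition, in which a mark join keeps every probe row and attaches a boolean mark saying whether some build row matched, and it assumes that "other conjunct" means an extra non-equality predicate that must hold together with the key equality. The names `mark_join`, `key`, and `other_conjunct` are hypothetical, and NULL handling (three-valued marks) is deliberately left out.

```python
def mark_join(probe_rows, build_rows, key, other_conjunct=None):
    """Return every probe row with an added boolean 'mark' column.

    Illustrative only: the mark is True when at least one build row has the
    same key AND satisfies the optional extra predicate (the assumed meaning
    of "other conjunct"). NULL semantics are ignored.
    """
    result = []
    for p in probe_rows:
        mark = False
        for b in build_rows:
            if p[key] != b[key]:
                continue  # equality conjunct failed
            if other_conjunct is None or other_conjunct(p, b):
                mark = True  # one fully matching build row is enough
                break
        # A mark join never drops the probe row; it only records the mark.
        result.append({**p, "mark": mark})
    return result


probe = [{"k": 1, "v": 10}, {"k": 2, "v": 5}]
build = [{"k": 1, "w": 3}, {"k": 2, "w": 9}]

keys_only = mark_join(probe, build, "k")
with_conjunct = mark_join(probe, build, "k", lambda p, b: p["v"] > b["w"])
```

With the extra predicate, the second probe row still appears in the output but its mark is false. Deriving the mark from the key match alone would mark it true, which is the kind of discrepancy a wrong-result bug in this area could produce.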

Further comments

If this is a relatively large or complex change, kick off the discussion on the project's developer mailing list by explaining why you chose the solution you did and what alternatives you considered.

@BiteTheDDDDt
Contributor Author

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 5c1e86f0143635acaf54fd03b120dc89f3ea05bb, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5490	5132	5182	5132
q2	405	159	159	159
q3	1468	1216	1249	1216
q4	1092	855	821	821
q5	3075	3009	3036	3009
q6	227	143	140	140
q7	978	567	542	542
q8	2151	2206	2236	2206
q9	6740	6707	6652	6652
q10	3228	3133	3197	3133
q11	356	219	218	218
q12	396	243	252	243
q13	4349	3626	3625	3625
q14	262	221	215	215
q15	614	570	567	567
q16	442	396	407	396
q17	1058	548	509	509
q18	7088	6770	6757	6757
q19	1636	1550	1591	1550
q20	608	357	355	355
q21	2876	2517	2456	2456
q22	402	318	333	318
Total cold run time: 44941 ms
Total hot run time: 40219 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5128	5031	5084	5031
q2	347	255	259	255
q3	3346	3318	3285	3285
q4	2150	1994	2023	1994
q5	5910	5892	5904	5892
q6	237	133	130	130
q7	2362	1937	1945	1937
q8	3542	3647	3675	3647
q9	9008	8998	8971	8971
q10	3882	3930	3930	3930
q11	593	481	481	481
q12	794	657	622	622
q13	3857	3181	3198	3181
q14	290	265	265	265
q15	640	582	547	547
q16	557	515	516	515
q17	2012	1824	1786	1786
q18	8743	8403	8347	8347
q19	1776	1724	1763	1724
q20	2290	1997	1983	1983
q21	5718	5317	5392	5317
q22	535	508	521	508
Total cold run time: 63717 ms
Total hot run time: 60348 ms
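The column meanings used in the sketch below are an inference from the numbers, not something the bot states: column 1 is the cold run, columns 2 and 3 are two hot runs, column 4 is the faster of the two hot runs, and each total sums one column. The Python check runs over the first table above (default conf and session variables); the helper name `summarize` is hypothetical.

```python
# First TPC-H table from the comment above, copied verbatim.
TABLE = """
q1 5490 5132 5182 5132
q2 405 159 159 159
q3 1468 1216 1249 1216
q4 1092 855 821 821
q5 3075 3009 3036 3009
q6 227 143 140 140
q7 978 567 542 542
q8 2151 2206 2236 2206
q9 6740 6707 6652 6652
q10 3228 3133 3197 3133
q11 356 219 218 218
q12 396 243 252 243
q13 4349 3626 3625 3625
q14 262 221 215 215
q15 614 570 567 567
q16 442 396 407 396
q17 1058 548 509 509
q18 7088 6770 6757 6757
q19 1636 1550 1591 1550
q20 608 357 355 355
q21 2876 2517 2456 2456
q22 402 318 333 318
"""


def summarize(table_text):
    """Return (cold_total_ms, hot_total_ms) under the assumed column layout."""
    cold_total = 0
    hot_total = 0
    for line in table_text.strip().splitlines():
        _query, cold, hot1, hot2, best = line.split()
        cold, hot1, hot2, best = int(cold), int(hot1), int(hot2), int(best)
        # Assumption being checked: column 4 is the faster hot run.
        if best != min(hot1, hot2):
            raise ValueError(f"column 4 is not min(hot1, hot2) in: {line}")
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


cold_total, hot_total = summarize(TABLE)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under this reading the computed totals agree with the two "Total ... run time" lines the bot reports for that table.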

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.62% (8606/23504)
Line Coverage: 28.67% (69957/243993)
Region Coverage: 27.67% (36203/130831)
Branch Coverage: 24.38% (18506/75892)
Coverage Report: http://coverage.selectdb-in.cc/coverage/5c1e86f0143635acaf54fd03b120dc89f3ea05bb_5c1e86f0143635acaf54fd03b120dc89f3ea05bb/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.27 seconds
stream load tsv: 576 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 27.5 seconds inserted 10000000 Rows, about 363K ops/s
storage size: 17183979352 Bytes
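The throughput figures above appear to be truncated values in MiB/s (bytes divided by seconds and by 2^20), and the insert rate appears to be truncated thousands of rows per second. That is inferred from the numbers, not stated by the pipeline, so the Python check below is a sketch; the helper names `mib_per_second` and `k_rows_per_second` are hypothetical.

```python
MIB = 1024 * 1024  # assumption: the reported "MB/s" is really MiB/s


def mib_per_second(total_bytes, seconds):
    """Whole MiB per second, truncated toward zero."""
    return total_bytes // (seconds * MIB)


def k_rows_per_second(rows, seconds):
    """Thousands of rows per second, truncated toward zero."""
    return int(rows / seconds / 1000)


# (bytes loaded, elapsed seconds) taken from the result above.
loads = {
    "tsv": (74807831229, 576),
    "json": (2358488459, 19),
    "orc": (1101869774, 66),
    "parquet": (861443392, 32),
}

rates = {name: mib_per_second(b, s) for name, (b, s) in loads.items()}
insert_rate = k_rows_per_second(10_000_000, 27.5)

for name, rate in rates.items():
    print(f"stream load {name}: about {rate} MB/s")
print(f"insert into select: about {insert_rate}K ops/s")
```

Dividing by 10^6 instead of 2^20 would give about 129 for the tsv load rather than the reported 123, which is why the MiB reading is assumed here.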

@BiteTheDDDDt
Contributor Author

run buildall

Contributor

github-actions bot commented Jan 2, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit dbcb83f024c11bba8c9163f1a46eeea081d9915d, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5415	5126	5126	5126
q2	397	185	159	159
q3	1465	1145	1139	1139
q4	1100	851	862	851
q5	3162	3111	3099	3099
q6	229	136	133	133
q7	972	536	541	536
q8	2186	2219	2210	2210
q9	6672	6695	6642	6642
q10	3177	3136	3156	3136
q11	356	225	220	220
q12	381	227	228	227
q13	4347	3670	3592	3592
q14	250	226	212	212
q15	606	531	538	531
q16	459	400	385	385
q17	1057	525	458	458
q18	7042	6745	6736	6736
q19	1644	1521	1482	1482
q20	560	360	345	345
q21	2892	2441	2506	2441
q22	385	340	344	340
Total cold run time: 44754 ms
Total hot run time: 40000 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5066	4985	4987	4985
q2	337	266	252	252
q3	3365	3302	3320	3302
q4	2143	2021	2037	2021
q5	5925	5899	5891	5891
q6	236	125	125	125
q7	2395	1928	1928	1928
q8	3529	3681	3657	3657
q9	9038	8951	8990	8951
q10	3880	3915	3889	3889
q11	593	489	488	488
q12	795	643	653	643
q13	3853	3204	3176	3176
q14	304	274	270	270
q15	590	551	552	551
q16	560	516	529	516
q17	2028	1828	1806	1806
q18	8760	8324	8613	8324
q19	1757	1680	1707	1680
q20	2264	1991	1961	1961
q21	5688	5293	5391	5293
q22	610	515	463	463
Total cold run time: 63716 ms
Total hot run time: 60172 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.63% (8612/23513)
Line Coverage: 28.68% (69977/244018)
Region Coverage: 27.66% (36220/130940)
Branch Coverage: 24.36% (18504/75976)
Coverage Report: http://coverage.selectdb-in.cc/coverage/dbcb83f024c11bba8c9163f1a46eeea081d9915d_dbcb83f024c11bba8c9163f1a46eeea081d9915d/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.55 seconds
stream load tsv: 565 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.2 seconds inserted 10000000 Rows, about 354K ops/s
storage size: 17183952848 Bytes

@BiteTheDDDDt
Contributor Author

run buildall

Contributor

github-actions bot commented Jan 2, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.64% (8615/23513)
Line Coverage: 28.69% (70004/244024)
Region Coverage: 27.68% (36238/130941)
Branch Coverage: 24.37% (18513/75976)
Coverage Report: http://coverage.selectdb-in.cc/coverage/d4bf96defe9a26570cce329bbe5d965341322716_d4bf96defe9a26570cce329bbe5d965341322716/report/index.html

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit d4bf96defe9a26570cce329bbe5d965341322716, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5523	5169	5171	5169
q2	416	156	198	156
q3	1446	1237	1250	1237
q4	1109	827	840	827
q5	3164	3133	3109	3109
q6	230	144	134	134
q7	990	573	501	501
q8	2174	2277	2299	2277
q9	6744	6688	6695	6688
q10	3169	3078	3140	3078
q11	347	219	234	219
q12	394	240	233	233
q13	4421	3669	3614	3614
q14	251	222	220	220
q15	606	545	545	545
q16	474	405	410	405
q17	1044	502	525	502
q18	7098	6802	6923	6802
q19	1656	1565	1557	1557
q20	557	362	335	335
q21	2968	2549	2501	2501
q22	384	325	344	325
Total cold run time: 45165 ms
Total hot run time: 40434 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5260	5113	5116	5113
q2	343	264	252	252
q3	3383	3323	3298	3298
q4	2171	2012	2049	2012
q5	5944	5958	5913	5913
q6	234	133	127	127
q7	2451	1885	1949	1885
q8	3576	3664	3674	3664
q9	9108	8975	9045	8975
q10	3882	3907	3945	3907
q11	577	492	511	492
q12	811	669	647	647
q13	3878	3185	3220	3185
q14	309	274	259	259
q15	594	544	547	544
q16	535	524	520	520
q17	2048	1839	1768	1768
q18	8778	8310	8310	8310
q19	1781	1740	1717	1717
q20	2250	2012	1979	1979
q21	5807	5369	5500	5369
q22	551	504	504	504
Total cold run time: 64271 ms
Total hot run time: 60440 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.84 seconds
stream load tsv: 562 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 27.4 seconds inserted 10000000 Rows, about 364K ops/s
storage size: 17183446410 Bytes

Contributor

@yiguolei yiguolei left a comment

Please add a regression test.

Contributor

@yiguolei yiguolei left a comment

Does pipelinex have this problem?

@BiteTheDDDDt
Contributor Author

run buildall

@BiteTheDDDDt
Contributor Author

> Does pipelinex have this problem?

It is common code.

Contributor

github-actions bot commented Jan 3, 2024

clang-tidy review says "All clean, LGTM! 👍"

@BiteTheDDDDt
Contributor Author

run buildall

Contributor

github-actions bot commented Jan 3, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.61% (8613/23524)
Line Coverage: 28.66% (69983/244172)
Region Coverage: 27.64% (36221/131043)
Branch Coverage: 24.33% (18498/76024)
Coverage Report: http://coverage.selectdb-in.cc/coverage/dfb1cd9961dabe9237a10f18c7fc87507f127665_dfb1cd9961dabe9237a10f18c7fc87507f127665/report/index.html

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit dfb1cd9961dabe9237a10f18c7fc87507f127665, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5403	5171	5167	5167
q2	409	168	155	155
q3	1443	1227	1238	1227
q4	1089	788	771	771
q5	3151	3165	3131	3131
q6	230	139	130	130
q7	990	571	523	523
q8	2155	2201	2235	2201
q9	6696	6649	6631	6631
q10	3145	3188	3121	3121
q11	338	232	213	213
q12	383	231	230	230
q13	4399	3660	3605	3605
q14	255	215	217	215
q15	603	548	548	548
q16	472	441	405	405
q17	1042	566	533	533
q18	7113	6720	6762	6720
q19	1648	1529	1433	1433
q20	591	354	329	329
q21	2892	2549	2490	2490
q22	389	326	321	321
Total cold run time: 44836 ms
Total hot run time: 40099 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5148	5108	5073	5073
q2	343	267	253	253
q3	3347	3312	3298	3298
q4	2173	1982	1989	1982
q5	5946	5937	5896	5896
q6	228	126	124	124
q7	2370	1997	1911	1911
q8	3567	3666	3716	3666
q9	8990	9010	8969	8969
q10	3868	3922	3914	3914
q11	588	492	495	492
q12	786	624	669	624
q13	3872	3209	3155	3155
q14	302	257	269	257
q15	619	551	539	539
q16	556	505	512	505
q17	2052	1821	1810	1810
q18	8799	8383	8422	8383
q19	1750	1696	1714	1696
q20	2274	2013	1993	1993
q21	5742	5355	5370	5355
q22	588	518	476	476
Total cold run time: 63908 ms
Total hot run time: 60371 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.08 seconds
stream load tsv: 564 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.0 seconds inserted 10000000 Rows, about 357K ops/s
storage size: 17183852959 Bytes

@BiteTheDDDDt BiteTheDDDDt force-pushed the fix_1229 branch 2 times, most recently from db545ec to dfb1cd9, on January 3, 2024 13:41
Contributor

github-actions bot commented Jan 3, 2024

clang-tidy review says "All clean, LGTM! 👍"

Contributor

github-actions bot commented Jan 3, 2024

clang-tidy review says "All clean, LGTM! 👍"

@github-actions github-actions bot added the approved label Jan 4, 2024
Contributor

github-actions bot commented Jan 4, 2024

PR approved by at least one committer and no changes requested.

Contributor

github-actions bot commented Jan 4, 2024

PR approved by anyone and no changes requested.

Member

@mrhhsg mrhhsg left a comment

LGTM

@BiteTheDDDDt BiteTheDDDDt merged commit d8a08da into apache:master Jan 4, 2024
41 of 42 checks passed
@wm1581066 wm1581066 added the not-merge/2.0 label and removed the dev/2.1.0 label Jan 5, 2024
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024
Labels: approved (Indicates a PR has been approved by one committer.), dev/3.0.0-merged, not-merge/2.0 (do not merge into 2.0 branch), p0_w, reviewed
8 participants