[improvement](statistics)Use min row count of all replicas as tablet/table row count. (#41894) #41979

Merged
Jibing-Li merged 1 commit into apache:branch-2.0 from minrow2.0 on Oct 16, 2024


Jibing-Li
Contributor

backport: #41894

…table row count. (apache#41894)

Use the minimum row count among all replicas with the same version as the
tablet/table row count. The replica with the fewest rows has performed more
compaction than the others, so using its count as the tablet row count is
more accurate.
Also use the minimum row count as the tablet row count when choosing tablets
during sample analyze.
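
The selection rule described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the `Replica` class, the `tabletRowCount` method, the `version` parameter, and the fallback of 0 when no replica matches are all hypothetical names and assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

public class MinRowCountSketch {
    /** Hypothetical stand-in for a replica; only version and row count matter here. */
    static final class Replica {
        final long version;
        final long rowCount;

        Replica(long version, long rowCount) {
            this.version = version;
            this.rowCount = rowCount;
        }
    }

    /**
     * Returns the smallest row count among the replicas that are at the given
     * version. Returns 0 if no replica is at that version (an assumed fallback).
     */
    static long tabletRowCount(List<Replica> replicas, long version) {
        boolean found = false;
        long min = 0L;
        for (Replica r : replicas) {
            if (r.version != version) {
                continue; // skip replicas at a different version
            }
            if (!found || r.rowCount < min) {
                min = r.rowCount;
                found = true;
            }
        }
        return min;
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(
                new Replica(5, 100),  // same version, less compacted
                new Replica(5, 90),   // same version, fewest rows
                new Replica(4, 80));  // older version, ignored
        System.out.println(tabletRowCount(replicas, 5)); // prints 90
    }
}
```

Replicas at a different version are skipped because their row counts may reflect a different set of loaded data, so comparing them with the others would not be meaningful.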
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

Jibing-Li marked this pull request as ready for review October 16, 2024 11:38
@Jibing-Li
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 48800 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e5f90d93f3052acd311983f740e6b52dd5905f9a, data reload: false

------ Round 1 ----------------------------------
query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
q1	17677	4432	4321	4321
q2	2052	154	142	142
q3	10461	1899	1918	1899
q4	10306	1237	1305	1237
q5	8784	3857	3907	3857
q6	238	124	147	124
q7	2010	1585	1595	1585
q8	9259	2722	2690	2690
q9	10155	9806	9743	9743
q10	8664	3546	3489	3489
q11	415	241	255	241
q12	464	307	301	301
q13	18364	3936	4036	3936
q14	352	325	313	313
q15	511	453	454	453
q16	550	449	465	449
q17	1121	950	942	942
q18	7261	6798	6880	6798
q19	1683	1559	1540	1540
q20	546	317	284	284
q21	4439	4144	4062	4062
q22	489	405	394	394
Total cold run time: 115801 ms
Total hot run time: 48800 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
q1	4307	4327	4305	4305
q2	321	228	221	221
q3	4153	4115	4156	4115
q4	2757	2734	2727	2727
q5	7170	7106	7046	7046
q6	233	116	119	116
q7	3215	2862	2841	2841
q8	4341	4400	4481	4400
q9	13597	13599	13561	13561
q10	4239	4237	4228	4228
q11	755	684	682	682
q12	1032	855	845	845
q13	7214	3711	3724	3711
q14	471	427	429	427
q15	492	459	449	449
q16	630	580	583	580
q17	3797	3847	3799	3799
q18	8810	8719	8757	8719
q19	1708	1639	1655	1639
q20	2364	2167	2121	2121
q21	8507	8533	8392	8392
q22	1014	927	931	927
Total cold run time: 81127 ms
Total hot run time: 75851 ms

@doris-robot

TPC-DS: Total hot run time: 212101 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e5f90d93f3052acd311983f740e6b52dd5905f9a, data reload: false

query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
query1	942	394	417	394
query2	6530	2291	2157	2157
query3	6918	202	199	199
query4	23367	21978	21575	21575
query5	19753	6565	6527	6527
query6	283	228	244	228
query7	4332	300	307	300
query8	257	266	244	244
query9	3111	2666	2605	2605
query10	463	304	306	304
query11	15890	14914	14937	14914
query12	128	75	74	74
query13	1046	441	428	428
query14	17297	13336	13603	13336
query15	386	219	228	219
query16	6488	280	267	267
query17	1770	940	917	917
query18	881	315	310	310
query19	220	150	142	142
query20	103	91	93	91
query21	190	96	96	96
query22	5139	4934	5003	4934
query23	34405	33582	33587	33582
query24	7792	6380	6375	6375
query25	545	438	431	431
query26	1268	164	161	161
query27	2412	297	292	292
query28	6150	2236	2218	2218
query29	2834	2796	2782	2782
query30	236	168	166	166
query31	949	735	756	735
query32	72	60	59	59
query33	449	264	266	264
query34	859	478	475	475
query35	1092	937	960	937
query36	1316	1084	1251	1084
query37	172	61	60	60
query38	3061	2931	2879	2879
query39	1389	1340	1341	1340
query40	305	94	93	93
query41	38	38	37	37
query42	87	94	84	84
query43	611	704	589	589
query44	1155	718	723	718
query45	243	234	230	230
query46	1238	959	969	959
query47	2050	1900	1753	1753
query48	499	401	427	401
query49	662	378	372	372
query50	880	592	636	592
query51	4828	4689	4845	4689
query52	87	90	100	90
query53	241	200	201	200
query54	2701	2512	2494	2494
query55	91	82	85	82
query56	232	208	202	202
query57	1258	1323	1112	1112
query58	217	198	216	198
query59	3807	3416	3189	3189
query60	243	215	203	203
query61	98	92	95	92
query62	823	499	487	487
query63	204	179	179	179
query64	3469	1582	1349	1349
query65	3651	3580	3584	3580
query66	774	396	386	386
query67	15746	15436	15622	15436
query68	9240	646	643	643
query69	493	254	270	254
query70	1634	1334	1375	1334
query71	416	288	304	288
query72	6901	4793	4710	4710
query73	747	322	316	316
query74	6373	5752	5805	5752
query75	5113	3714	3626	3626
query76	5042	1166	1185	1166
query77	726	251	268	251
query78	12456	11440	27899	11440
query79	3851	613	627	613
query80	1121	390	389	389
query81	494	242	236	236
query82	255	95	93	93
query83	187	132	126	126
query84	252	69	68	68
query85	914	318	318	318
query86	336	304	291	291
query87	3195	3036	2996	2996
query88	3659	2278	2274	2274
query89	349	288	301	288
query90	1850	207	203	203
query91	168	128	124	124
query92	57	52	51	51
query93	931	564	520	520
query94	710	204	207	204
query95	1982	1973	2061	1973
query96	626	316	317	316
query97	6571	6351	6376	6351
query98	224	224	202	202
query99	2565	860	788	788
Total cold run time: 306171 ms
Total hot run time: 212101 ms

@doris-robot

ClickBench: Total hot run time: 30.8 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e5f90d93f3052acd311983f740e6b52dd5905f9a, data reload: false

query	cold_s	hot1_s	hot2_s
query1	0.03	0.02	0.02
query2	0.08	0.03	0.02
query3	0.24	0.05	0.05
query4	1.78	0.06	0.07
query5	0.54	0.53	0.52
query6	1.27	0.62	0.61
query7	0.01	0.01	0.01
query8	0.03	0.02	0.02
query9	0.51	0.48	0.47
query10	0.53	0.52	0.52
query11	0.12	0.08	0.09
query12	0.11	0.09	0.10
query13	0.62	0.61	0.60
query14	0.80	0.78	0.78
query15	0.77	0.78	0.77
query16	0.38	0.38	0.36
query17	1.01	1.03	1.02
query18	0.21	0.23	0.25
query19	1.92	1.78	1.79
query20	0.02	0.01	0.01
query21	15.46	0.58	0.54
query22	2.38	2.53	1.96
query23	17.44	1.05	0.86
query24	6.83	0.89	1.59
query25	0.40	0.11	0.05
query26	0.72	0.14	0.15
query27	0.05	0.04	0.03
query28	5.88	0.71	0.74
query29	12.73	2.31	2.38
query30	0.57	0.58	0.53
query31	2.82	0.38	0.37
query32	3.40	0.50	0.49
query33	3.10	3.06	3.06
query34	15.25	4.79	4.77
query35	4.84	4.83	4.84
query36	1.06	1.02	1.00
query37	0.06	0.05	0.04
query38	0.04	0.02	0.02
query39	0.02	0.01	0.01
query40	0.16	0.14	0.15
query41	0.07	0.02	0.01
query42	0.02	0.02	0.01
query43	0.02	0.01	0.02
Total cold run time: 104.3 s
Total hot run time: 30.8 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit e5f90d93f3052acd311983f740e6b52dd5905f9a with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       21.7 seconds inserted 10000000 Rows, about 460K ops/s

Jibing-Li merged commit 1976b85 into apache:branch-2.0 Oct 16, 2024
22 of 24 checks passed
Jibing-Li deleted the minrow2.0 branch October 18, 2024 08:19