branch-3.0: [fix](cloud) shorten cache lock held time and add metrics #47472 #47494

Open
wants to merge 1 commit into
base: branch-3.0


github-actions[bot]
Contributor

@github-actions github-actions bot commented Feb 2, 2025

Cherry-picked from #47472

When updating bvar metrics, we took the block lock inside the cache
lock's critical section, so the cache lock was held too long and other
cache logic was affected. We now use an unsafe method to update the bvar,
which improves performance.

Some key lock metrics and other meaningful metrics are also added to
better monitor cache time costs.
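
To illustrate the idea in the description, here is a minimal C++ sketch of reading block state without the block lock while the cache lock is held, and publishing metrics only after the cache lock is released. It is not the code from this PR: `Metric` is a hypothetical stand-in for a bvar counter built on `std::atomic`, and `Block`, `Cache`, `is_downloaded_unsafe`, `downloaded_blocks`, and `lock_held_ns` are names invented for illustration.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a bvar counter (the real patch uses bvar).
struct Metric {
    std::atomic<int64_t> value{0};
    void set(int64_t v) { value.store(v, std::memory_order_relaxed); }
    void add(int64_t v) { value.fetch_add(v, std::memory_order_relaxed); }
    int64_t get() const { return value.load(std::memory_order_relaxed); }
};

// Hypothetical cache block that owns its own lock.
class Block {
public:
    explicit Block(bool downloaded) : _downloaded(downloaded) {}

    // Safe accessor: takes the block lock, so it can stall behind a busy block.
    bool is_downloaded() {
        std::lock_guard<std::mutex> l(_mutex);
        return _downloaded.load();
    }

    // "Unsafe" accessor: skips the block lock. The value may be slightly
    // stale, which is assumed to be acceptable for monitoring purposes.
    bool is_downloaded_unsafe() const {
        return _downloaded.load(std::memory_order_relaxed);
    }

private:
    std::mutex _mutex;
    std::atomic<bool> _downloaded;
};

class Cache {
public:
    void add_block(bool downloaded) {
        std::lock_guard<std::mutex> l(_cache_mutex);
        _blocks.push_back(std::make_unique<Block>(downloaded));
    }

    void update_metrics() {
        int64_t downloaded = 0;
        int64_t held_ns = 0;
        {
            std::lock_guard<std::mutex> l(_cache_mutex);
            auto start = std::chrono::steady_clock::now();
            // No block lock is taken inside the cache lock's critical section.
            for (const auto& b : _blocks) {
                if (b->is_downloaded_unsafe()) ++downloaded;
            }
            held_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                              std::chrono::steady_clock::now() - start)
                              .count();
        }
        // Publish the metrics after the cache lock has been released.
        downloaded_blocks.set(downloaded);
        lock_held_ns.add(held_ns);
    }

    Metric downloaded_blocks; // number of downloaded blocks at last update
    Metric lock_held_ns;      // accumulated time spent holding the cache lock

private:
    std::mutex _cache_mutex;
    std::vector<std::unique_ptr<Block>> _blocks;
};
```

Calling `add_block` a few times and then `update_metrics()` leaves the count of downloaded blocks in `downloaded_blocks` and accumulates the cache lock hold time in `lock_held_ns`. The trade-off in this sketch is that the metric may briefly lag the true block state in exchange for a shorter critical section.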
@github-actions github-actions bot requested a review from dataroaring as a code owner February 2, 2025 16:41
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring closed this Feb 2, 2025
@dataroaring dataroaring reopened this Feb 2, 2025
@hello-stephen
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 41015 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 0c81a99453bfe9de5ac717bd72dd978df7620bf4, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17587	7669	7255	7255
q2	2064	181	175	175
q3	10628	1102	1266	1102
q4	10487	758	872	758
q5	7765	2878	2911	2878
q6	239	149	146	146
q7	1001	621	618	618
q8	9377	1940	2082	1940
q9	6672	6398	6376	6376
q10	6990	2337	2340	2337
q11	470	275	272	272
q12	410	215	217	215
q13	17788	2985	3030	2985
q14	241	207	213	207
q15	558	520	515	515
q16	694	587	597	587
q17	1000	563	636	563
q18	7324	6604	6804	6604
q19	1395	1032	1106	1032
q20	498	208	209	208
q21	4000	3294	3258	3258
q22	1083	984	991	984
Total cold run time: 108271 ms
Total hot run time: 41015 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	7674	7198	7223	7198
q2	328	239	231	231
q3	3019	2998	2959	2959
q4	2062	1853	1793	1793
q5	5713	5749	5714	5714
q6	228	145	146	145
q7	2205	1819	1839	1819
q8	3371	3591	3559	3559
q9	8848	8866	8921	8866
q10	3630	3621	3552	3552
q11	604	500	507	500
q12	818	600	618	600
q13	11079	3195	3176	3176
q14	315	285	275	275
q15	568	536	529	529
q16	703	646	655	646
q17	1873	1659	1590	1590
q18	8330	7881	7486	7486
q19	1687	1624	1714	1624
q20	2103	1853	1879	1853
q21	5587	5307	5356	5307
q22	1149	1049	1068	1049
Total cold run time: 71894 ms
Total hot run time: 60471 ms

@doris-robot

TPC-DS: Total hot run time: 197902 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0c81a99453bfe9de5ac717bd72dd978df7620bf4, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1317	962	962	962
query2	6246	2108	2092	2092
query3	10845	4496	4292	4292
query4	66573	29459	23505	23505
query5	5016	459	443	443
query6	403	191	185	185
query7	5660	307	314	307
query8	314	227	227	227
query9	9440	2661	2648	2648
query10	459	294	256	256
query11	17782	15237	15862	15237
query12	162	103	104	103
query13	1543	454	422	422
query14	10891	7097	7174	7097
query15	196	180	180	180
query16	7120	518	518	518
query17	1099	588	606	588
query18	2001	339	321	321
query19	225	164	164	164
query20	114	110	120	110
query21	203	106	107	106
query22	4929	4607	4674	4607
query23	35240	34007	34334	34007
query24	6126	2866	2881	2866
query25	548	419	436	419
query26	654	177	181	177
query27	1842	370	359	359
query28	4507	2456	2458	2456
query29	726	484	430	430
query30	246	166	159	159
query31	991	829	833	829
query32	67	52	55	52
query33	393	288	291	288
query34	902	509	528	509
query35	846	747	738	738
query36	1085	975	983	975
query37	123	73	77	73
query38	4112	3991	4065	3991
query39	1501	1459	1485	1459
query40	210	96	95	95
query41	51	45	49	45
query42	108	101	99	99
query43	544	503	486	486
query44	1158	813	815	813
query45	186	168	170	168
query46	1135	719	733	719
query47	2038	1958	1947	1947
query48	468	390	374	374
query49	740	396	379	379
query50	857	421	424	421
query51	7341	7726	7003	7003
query52	102	86	92	86
query53	261	177	178	177
query54	557	452	463	452
query55	74	75	73	73
query56	265	233	242	233
query57	1211	1145	1105	1105
query58	215	220	206	206
query59	3259	3104	3151	3104
query60	276	259	240	240
query61	114	107	110	107
query62	831	715	726	715
query63	218	203	189	189
query64	1355	661	650	650
query65	3295	3186	3183	3183
query66	629	301	304	301
query67	15952	15835	15720	15720
query68	4441	557	582	557
query69	416	265	274	265
query70	1184	1141	1150	1141
query71	333	256	248	248
query72	6427	3732	4012	3732
query73	749	342	341	341
query74	10074	9405	9087	9087
query75	3345	2675	2639	2639
query76	1863	1009	1056	1009
query77	486	271	265	265
query78	10531	9652	9598	9598
query79	1200	595	582	582
query80	825	426	417	417
query81	505	241	240	240
query82	1310	115	116	115
query83	172	140	142	140
query84	279	84	78	78
query85	886	317	298	298
query86	337	300	281	281
query87	4668	4386	4391	4386
query88	3525	2412	2375	2375
query89	415	292	283	283
query90	1868	191	187	187
query91	176	147	146	146
query92	68	49	48	48
query93	1331	550	543	543
query94	737	301	290	290
query95	358	257	263	257
query96	612	283	285	283
query97	3297	3232	3198	3198
query98	217	208	198	198
query99	1696	1428	1467	1428
Total cold run time: 320228 ms
Total hot run time: 197902 ms

@doris-robot

ClickBench: Total hot run time: 32.84 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0c81a99453bfe9de5ac717bd72dd978df7620bf4, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.03	0.03
query2	0.07	0.03	0.04
query3	0.24	0.07	0.06
query4	1.63	0.10	0.10
query5	0.55	0.52	0.51
query6	1.15	0.72	0.73
query7	0.02	0.02	0.02
query8	0.03	0.03	0.03
query9	0.55	0.50	0.49
query10	0.56	0.55	0.57
query11	0.14	0.09	0.10
query12	0.14	0.14	0.11
query13	0.61	0.59	0.59
query14	2.82	2.81	2.74
query15	0.90	0.81	0.82
query16	0.37	0.39	0.38
query17	1.06	1.04	1.05
query18	0.25	0.22	0.22
query19	1.88	1.79	2.02
query20	0.01	0.01	0.01
query21	15.36	0.58	0.57
query22	3.25	2.06	1.80
query23	16.91	0.90	0.97
query24	3.51	1.64	1.06
query25	0.16	0.14	0.14
query26	0.48	0.15	0.14
query27	0.05	0.04	0.05
query28	9.95	1.11	1.06
query29	12.61	3.20	3.18
query30	0.25	0.06	0.06
query31	2.86	0.38	0.40
query32	3.27	0.46	0.46
query33	3.00	2.99	3.02
query34	16.90	4.45	4.52
query35	4.56	4.50	4.50
query36	0.66	0.49	0.52
query37	0.09	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.03	0.02
query40	0.16	0.12	0.12
query41	0.07	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 107.26 s
Total hot run time: 32.84 s

4 participants