[Fix](job)Fix for Duplicate Scheduling of Tasks #46872

Merged
merged 1 commit into from
Jan 13, 2025

CalvinKirs
Member

@CalvinKirs CalvinKirs commented Jan 13, 2025

Problem Description

The current scheduling logic calculates the next scheduled time and adds it to the task queue when the condition triggerTime <= windowEndTimeMs is met. However, this can lead to a task being scheduled twice if its triggerTime is exactly equal to windowEndTimeMs:

  • The task is added to the current scheduling window.
  • At the same time, this timestamp becomes the startTime for the next scheduling window, causing the task to be scheduled again.
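The double scheduling described above can be reproduced with a small sketch. This is not the Doris scheduler code: the class `InclusiveWindowSketch`, the method `triggersInWindow`, and the fixed-interval job model are assumptions made for illustration. Only `triggerTime`, `startTime`, and `windowEndTimeMs` come from the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the old behavior; not the actual Doris scheduler. */
class InclusiveWindowSketch {

    /**
     * Collects the trigger times of a job that fires every intervalMs
     * (starting at firstTriggerMs) inside one scheduling window, using the
     * old inclusive comparison against windowEndTimeMs.
     */
    static List<Long> triggersInWindow(long firstTriggerMs, long intervalMs,
                                       long startTimeMs, long windowEndTimeMs) {
        List<Long> triggers = new ArrayList<>();
        long triggerTime = firstTriggerMs;
        if (triggerTime < startTimeMs) {
            // Skip ahead to the first trigger at or after the window start.
            long skipped = (startTimeMs - firstTriggerMs + intervalMs - 1) / intervalMs;
            triggerTime = firstTriggerMs + skipped * intervalMs;
        }
        while (triggerTime <= windowEndTimeMs) { // old condition
            triggers.add(triggerTime);
            triggerTime += intervalMs;
        }
        return triggers;
    }

    public static void main(String[] args) {
        // Two consecutive windows: the second starts where the first ends.
        System.out.println(triggersInWindow(0, 5_000, 0, 10_000));      // [0, 5000, 10000]
        System.out.println(triggersInWindow(0, 5_000, 10_000, 20_000)); // [10000, 15000, 20000]
    }
}
```

In this sketch the trigger at 10000 ms is returned for both windows, which corresponds to the task being queued twice.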

Changes Made

Updated the condition from triggerTime <= windowEndTimeMs to triggerTime < windowEndTimeMs. A trigger time that falls exactly on the window’s end time is no longer added to the current window; it is left for the next window, whose startTime is that same timestamp, so the task is scheduled only once.
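To check the effect of the comparison change across several windows, here is a minimal sketch. It assumes contiguous windows (each one starts where the previous one ends) and a job that fires at fixed multiples of `intervalMs`. The class `HalfOpenWindowSketch`, the method `maxTimesQueued`, and the `inclusiveEnd` flag are hypothetical and exist only to compare the two conditions side by side; the real scheduler may differ in how it derives trigger times.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical comparison of the old and new window conditions. */
class HalfOpenWindowSketch {

    /** Returns the largest number of times any single trigger time was queued. */
    static int maxTimesQueued(long intervalMs, long windowMs,
                              int windowCount, boolean inclusiveEnd) {
        Map<Long, Integer> timesQueued = new HashMap<>();
        for (int w = 0; w < windowCount; w++) {
            long startTimeMs = w * windowMs;
            long windowEndTimeMs = startTimeMs + windowMs;
            // First trigger at or after the window start
            // (the job fires at 0, intervalMs, 2 * intervalMs, ...).
            long triggerTime = ((startTimeMs + intervalMs - 1) / intervalMs) * intervalMs;
            while (inclusiveEnd ? triggerTime <= windowEndTimeMs   // old condition
                                : triggerTime < windowEndTimeMs) { // new condition
                timesQueued.merge(triggerTime, 1, Integer::sum);
                triggerTime += intervalMs;
            }
        }
        int max = 0;
        for (int count : timesQueued.values()) {
            max = Math.max(max, count);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maxTimesQueued(5_000, 10_000, 3, true));  // 2: boundary triggers queued twice
        System.out.println(maxTimesQueued(5_000, 10_000, 3, false)); // 1: every trigger queued once
    }
}
```

With the exclusive comparison each window behaves as the half-open range [startTime, windowEndTimeMs), so a boundary timestamp belongs to exactly one window.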

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Jan 13, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@CalvinKirs CalvinKirs force-pushed the master-job-time-windewos branch from 78c61cc to df717fd Compare January 13, 2025 03:56
@CalvinKirs
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32607 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit df717fd6ecd75484c630b0be04e92e7b31d0320c, data reload: false

------ Round 1 ----------------------------------
q1	17597	6548	6110	6110
q2	2050	312	170	170
q3	10405	1228	709	709
q4	10209	851	435	435
q5	7526	2161	1904	1904
q6	201	179	148	148
q7	891	747	605	605
q8	9244	1360	1207	1207
q9	5334	4871	4919	4871
q10	6753	2287	1852	1852
q11	491	265	255	255
q12	331	350	225	225
q13	17766	3653	3097	3097
q14	247	238	218	218
q15	559	505	490	490
q16	631	626	581	581
q17	538	840	319	319
q18	6892	6736	6429	6429
q19	1210	948	546	546
q20	307	325	199	199
q21	2811	2217	1927	1927
q22	361	330	310	310
Total cold run time: 102354 ms
Total hot run time: 32607 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6274	6237	6284	6237
q2	241	331	237	237
q3	2252	2625	2341	2341
q4	1394	1792	1346	1346
q5	4283	4716	4717	4716
q6	187	178	143	143
q7	2106	1976	1866	1866
q8	2634	2763	2672	2672
q9	7307	7273	7143	7143
q10	3029	3333	2895	2895
q11	593	529	498	498
q12	675	774	606	606
q13	3435	3788	3142	3142
q14	310	313	266	266
q15	565	512	501	501
q16	661	693	660	660
q17	1195	1728	1262	1262
q18	7640	7284	7096	7096
q19	754	1060	1021	1021
q20	1927	1966	1811	1811
q21	5432	5105	4914	4914
q22	593	599	570	570
Total cold run time: 53487 ms
Total hot run time: 51943 ms

@doris-robot

TPC-DS: Total hot run time: 188113 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit df717fd6ecd75484c630b0be04e92e7b31d0320c, data reload: false

query1	968	397	371	371
query2	6519	2430	2421	2421
query3	6717	218	211	211
query4	33765	23729	23463	23463
query5	4371	625	456	456
query6	296	201	196	196
query7	4623	487	309	309
query8	303	254	236	236
query9	9423	2651	2609	2609
query10	481	307	233	233
query11	17877	15307	15041	15041
query12	154	114	105	105
query13	1655	520	376	376
query14	9801	6958	6898	6898
query15	252	187	188	187
query16	8278	587	449	449
query17	1568	737	533	533
query18	2092	420	283	283
query19	219	175	151	151
query20	115	113	109	109
query21	207	124	104	104
query22	4256	4490	4183	4183
query23	34674	33061	32701	32701
query24	6448	2276	2217	2217
query25	480	441	373	373
query26	1202	267	151	151
query27	1994	456	335	335
query28	5323	2417	2408	2408
query29	735	565	447	447
query30	232	191	157	157
query31	966	888	811	811
query32	86	64	57	57
query33	505	345	308	308
query34	740	845	487	487
query35	803	783	735	735
query36	1005	1037	943	943
query37	130	97	71	71
query38	3993	3958	3965	3958
query39	1459	1421	1438	1421
query40	208	143	101	101
query41	51	51	51	51
query42	121	104	102	102
query43	512	531	484	484
query44	1276	795	808	795
query45	175	168	163	163
query46	853	1052	633	633
query47	1847	1879	1790	1790
query48	386	402	316	316
query49	771	480	386	386
query50	614	649	382	382
query51	7014	6894	6845	6845
query52	104	98	92	92
query53	222	255	180	180
query54	461	483	410	410
query55	79	78	81	78
query56	260	269	254	254
query57	1171	1162	1110	1110
query58	262	245	257	245
query59	3062	3041	3019	3019
query60	287	282	265	265
query61	145	144	145	144
query62	835	781	709	709
query63	225	195	194	194
query64	4513	988	634	634
query65	3245	3187	3173	3173
query66	905	412	305	305
query67	15879	15784	15412	15412
query68	8186	706	504	504
query69	459	289	262	262
query70	1224	1126	1142	1126
query71	452	292	266	266
query72	6261	3798	3879	3798
query73	664	757	349	349
query74	10403	9285	8994	8994
query75	4406	3160	2644	2644
query76	4004	1174	754	754
query77	811	372	276	276
query78	10906	9966	9359	9359
query79	2953	765	595	595
query80	686	555	438	438
query81	475	276	234	234
query82	609	151	123	123
query83	203	267	156	156
query84	284	97	75	75
query85	791	341	305	305
query86	359	315	308	308
query87	4415	4270	4283	4270
query88	3335	2178	2142	2142
query89	401	323	298	298
query90	1844	192	194	192
query91	133	133	108	108
query92	65	57	55	55
query93	982	742	522	522
query94	635	407	283	283
query95	328	262	254	254
query96	474	614	276	276
query97	2890	2962	2762	2762
query98	217	205	207	205
query99	1643	1512	1404	1404
Total cold run time: 291696 ms
Total hot run time: 188113 ms

@doris-robot

ClickBench: Total hot run time: 31.62 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit df717fd6ecd75484c630b0be04e92e7b31d0320c, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.04
query3	0.24	0.07	0.07
query4	1.60	0.10	0.10
query5	0.44	0.40	0.41
query6	1.16	0.65	0.64
query7	0.02	0.02	0.02
query8	0.03	0.03	0.03
query9	0.57	0.53	0.51
query10	0.56	0.56	0.55
query11	0.14	0.09	0.10
query12	0.14	0.12	0.12
query13	0.61	0.60	0.60
query14	2.80	2.87	2.74
query15	0.91	0.82	0.83
query16	0.38	0.38	0.38
query17	1.02	1.07	1.07
query18	0.23	0.21	0.22
query19	1.85	1.79	2.00
query20	0.02	0.00	0.01
query21	15.35	0.95	0.59
query22	0.74	0.91	0.63
query23	15.51	1.40	0.58
query24	3.19	1.17	1.71
query25	0.24	0.18	0.19
query26	0.25	0.14	0.14
query27	0.08	0.06	0.04
query28	14.26	1.50	1.04
query29	12.67	3.92	3.27
query30	0.25	0.09	0.06
query31	2.80	0.60	0.38
query32	3.22	0.56	0.46
query33	3.04	3.10	3.11
query34	16.89	5.12	4.53
query35	4.53	4.45	4.47
query36	0.65	0.50	0.48
query37	0.10	0.06	0.07
query38	0.05	0.04	0.03
query39	0.04	0.02	0.03
query40	0.17	0.14	0.14
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 107 s
Total hot run time: 31.62 s

@CalvinKirs
Member Author

run p0

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels Jan 13, 2025
Contributor

PR approved by anyone and no changes requested.

@CalvinKirs CalvinKirs merged commit 77aadf1 into apache:master Jan 13, 2025
28 of 29 checks passed
@CalvinKirs CalvinKirs deleted the master-job-time-windewos branch January 13, 2025 08:22
github-actions bot pushed a commit that referenced this pull request Jan 13, 2025
github-actions bot pushed a commit that referenced this pull request Jan 13, 2025
CalvinKirs added a commit that referenced this pull request Jan 13, 2025
morningman pushed a commit that referenced this pull request Jan 17, 2025
Labels
approved, dev/2.1.x, dev/3.0.4-merged, reviewed