Fix limiter token tracking (again) #432
Merged
What
Fixes a regression in limiter job counting.
Uses the k8s Informer more effectively to count jobs-in-flight accurately.
Related to #302 (but probably not a fix for it).
Why
In the pre-~v0.16 paradigm, the limiter had two roles: deduplicate jobs and enforce the max-in-flight limit. This made sense because deduplicating jobs required tracking the running jobs in a map, and the same map could be used to count them. Later, I changed the mechanism for waking goroutines waiting for jobs to finish from a sync.Cond to a channel (a "token bucket"), so that waiting for capacity could be cancelled via context. In v0.19 I split the old limiter into the new limiter and deduper to make it easier to understand. But I didn't entirely finish the job: the deduping map had the side effect of preventing tokens from being counted multiple times, and because the new limiter has no deduping map, it can easily double-count jobs as they start and finish.
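For reference, the token-bucket mechanism works roughly like the minimal sketch below. The Limiter type and method names here are illustrative, not the actual limiter API:

```go
// Minimal sketch of the token bucket: a buffered channel holds one token per
// available slot, so waiting for capacity is a channel receive that a context
// can cancel (which sync.Cond could not do).
package limiter

import "context"

type Limiter struct {
	tokens chan struct{} // capacity == MaxInFlight
}

func New(maxInFlight int) *Limiter {
	l := &Limiter{tokens: make(chan struct{}, maxInFlight)}
	for i := 0; i < maxInFlight; i++ {
		l.tokens <- struct{}{} // start with a full bucket
	}
	return l
}

// take blocks until a token is available or ctx is cancelled.
func (l *Limiter) take(ctx context.Context) error {
	select {
	case <-l.tokens:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// release returns a token without blocking; if the bucket is somehow already
// full, the extra token is dropped rather than miscounted.
func (l *Limiter) release() {
	select {
	case l.tokens <- struct{}{}:
	default:
	}
}
```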
To count jobs correctly we need to observe the following (a sketch follows the list):
- Taking tokens in OnAdd after the cache sync has finished is double-counting, because we are already taking tokens in Handle.
- Returning tokens in OnDelete is probably wrong, except in the case where the Informer has missed an update from not-finished to finished.
- Taking or returning tokens in OnUpdate should only happen when a job changes state between not-finished and finished, not every time the handler is called.
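Continuing the Limiter sketch above, the handler rules look roughly like this, shaped like client-go's cache.ResourceEventHandler (whose OnAdd gained an isInInitialList parameter in recent releases). The jobFinished helper and the wiring are illustrative assumptions, not the actual agent-stack-k8s code:

```go
// Sketch of the event handler rules, in the same illustrative limiter package.
package limiter

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
)

// jobFinished reports whether the Job has reached a terminal condition.
func jobFinished(job *batchv1.Job) bool {
	for _, c := range job.Status.Conditions {
		if (c.Type == batchv1.JobComplete || c.Type == batchv1.JobFailed) &&
			c.Status == corev1.ConditionTrue {
			return true
		}
	}
	return false
}

// OnAdd only takes a token for unfinished jobs seen during the initial cache
// sync; after the sync, Handle has already taken a token for each job it starts.
func (l *Limiter) OnAdd(obj any, isInInitialList bool) {
	job, ok := obj.(*batchv1.Job)
	if !ok || !isInInitialList || jobFinished(job) {
		return
	}
	select {
	case <-l.tokens: // take without blocking
	default: // more running jobs than MaxInFlight: nothing left to take
	}
}

// OnUpdate only returns a token when the job crosses the not-finished ->
// finished boundary, not on every resync or unrelated status change.
func (l *Limiter) OnUpdate(oldObj, newObj any) {
	oldJob, okOld := oldObj.(*batchv1.Job)
	newJob, okNew := newObj.(*batchv1.Job)
	if !okOld || !okNew {
		return
	}
	if !jobFinished(oldJob) && jobFinished(newJob) {
		l.release()
	}
}

// OnDelete only returns a token if the job disappeared while still
// not-finished, i.e. the Informer missed the update to a finished state.
// (Real code would also unwrap cache.DeletedFinalStateUnknown here.)
func (l *Limiter) OnDelete(obj any) {
	job, ok := obj.(*batchv1.Job)
	if !ok || jobFinished(job) {
		return
	}
	l.release()
}
```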
Testing
I manually tested locally on Orbstack with a pipeline that generates 100 trivial jobs, configured with checkout skipped and default MaxInFlight (25).
Time spent running jobs (not including waiting for agent):