retryDeleteJob causes restart job to be unscheduled #3968

Open
GhangZh opened this issue Jan 14, 2025 · 1 comment
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


GhangZh commented Jan 14, 2025

Description

image

Steps to reproduce the issue

  1. Delete the PodGroup first, then delete the Pod. This triggers retryDeleteJob, which is called several times and penalized with a delay of a few seconds.
  2. At this point, deleteJob is called directly out of the queue and removes the job from the job cache.
  3. The job is then retried, or a job is re-created with the same UID.
  4. A few seconds later, the old job is dequeued from the rate-limited queue. At that point JobTerminated evaluates to true, so the job is deleted from the job cache again.

As a result, the job continues to exist but is never scheduled.
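
The sequence above can be modeled with a small, self-contained Go sketch. This is not Volcano's code: `jobInfo`, `jobCache`, the `Generation` field, and both cleanup functions are hypothetical names invented for illustration, and the guard shown is only one possible way to tell a stale queued entry apart from a re-created job that reuses the same UID.

```go
package main

import "fmt"

// jobInfo is a hypothetical stand-in for a cached job entry (not Volcano's JobInfo).
type jobInfo struct {
	UID        string
	Generation int  // hypothetical: differs for each re-creation under the same UID
	Terminated bool // stand-in for the result of the JobTerminated check
}

// jobCache is a hypothetical, simplified job cache keyed by UID.
type jobCache struct {
	jobs map[string]*jobInfo
}

// cleanupUnguarded models the reported behavior: a stale, terminated snapshot
// removes whatever entry currently lives under the same UID.
func (c *jobCache) cleanupUnguarded(stale *jobInfo) {
	if stale.Terminated {
		delete(c.jobs, stale.UID)
	}
}

// cleanupGuarded deletes only when the cached entry is the same generation
// as the queued snapshot, so a re-created job is left alone.
func (c *jobCache) cleanupGuarded(stale *jobInfo) {
	cur, ok := c.jobs[stale.UID]
	if !ok || cur.Generation != stale.Generation {
		return // already deleted, or the UID now belongs to a newer job
	}
	if stale.Terminated {
		delete(c.jobs, stale.UID)
	}
}

// replay walks through the four reproduction steps and reports whether the
// re-created job is still in the cache at the end.
func replay(cleanup func(c *jobCache, stale *jobInfo)) bool {
	c := &jobCache{jobs: map[string]*jobInfo{}}
	old := &jobInfo{UID: "job-1", Generation: 1, Terminated: true}
	c.jobs[old.UID] = old

	// Step 1: the old job is waiting in the rate-limited queue (held in `old`).
	// Step 2: a direct delete removes it from the cache.
	delete(c.jobs, old.UID)
	// Step 3: the job is re-created with the same UID.
	c.jobs["job-1"] = &jobInfo{UID: "job-1", Generation: 2}
	// Step 4: the delayed retry fires with the stale snapshot.
	cleanup(c, old)

	_, survives := c.jobs["job-1"]
	return survives
}

func main() {
	fmt.Println("unguarded: new job survives =", replay((*jobCache).cleanupUnguarded))
	fmt.Println("guarded:   new job survives =", replay((*jobCache).cleanupGuarded))
}
```

In this model the unguarded cleanup reports `false` (the re-created job is removed from the cache) and the guarded one reports `true`. Whether a generation-style check fits the real cache is an open question for the maintainers.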

Describe the results you received and expected

Expected: the job should not be deleted from the cache a second time in this scenario.

What version of Volcano are you using?

Volcano 1.9

Any other relevant information

No response

GhangZh added the kind/bug label on Jan 14, 2025
@aryasoni98

@GhangZh I would love to work on this issue. Could you please assign it to me?
