
Fix "RuntimeError: cannot schedule new futures after interpreter shutdown" #1025

Open · wants to merge 5 commits into base: 3.x

Conversation

i-am-darshil

Changes

Fixes #985

Summary

This PR addresses a race condition where a job submission coincides with the executor shutting down, leading to unnecessary RuntimeError logs. While the error is harmless, it clutters logs and can be misleading.

Root Cause

  1. A job is scheduled and is about to be submitted to the executor’s thread/process pool.
  2. Before submission completes, the executor begins shutting down.
  3. The shutdown process acquires _shutdown_lock and marks the executor as shut down.
  4. A job submission is attempted post-shutdown, causing a RuntimeError:
Traceback (most recent call last):
  File "C:\Users\gusta\teste\Em andamento\PyControlAPI\backend\nvenv\lib\site-packages\apscheduler\schedulers\base.py", line 988, in _process_jobs
    executor.submit_job(job, run_times)
  File "C:\Users\gusta\teste\Em andamento\PyControlAPI\backend\nvenv\lib\site-packages\apscheduler\executors\base.py", line 71, in submit_job
    self._do_submit_job(job, run_times)
  File "C:\Users\gusta\teste\Em andamento\PyControlAPI\backend\nvenv\lib\site-packages\apscheduler\executors\pool.py", line 28, in _do_submit_job
    f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)
  File "C:\Users\gusta\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\thread.py", line 163, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown
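The per-executor variant of this error is easy to reproduce: once a ThreadPoolExecutor has been shut down, any further submit raises. (The "interpreter shutdown" variant in the traceback above comes from the same check, tripped globally by _python_exit at interpreter exit.) A minimal demonstration:

```python
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=1)
pool.shutdown(wait=True)

# Any submit after shutdown() raises RuntimeError.
try:
    pool.submit(print, "too late")
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)  # cannot schedule new futures after shutdown
```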

Steps to reproduce

#985 (comment)

Checklist

If this is a user-facing code change, like a bugfix or a new feature, please ensure that
you've fulfilled the following conditions (where applicable):

  • [x] You've added tests (in tests/) which would fail without your patch
  • [ ] You've updated the documentation (in docs/, in case of behavior changes or new
    features)
  • [x] You've added a new changelog entry (in docs/versionhistory.rst).

If this is a trivial change, like a typo fix or a code reformatting, then you can ignore
these instructions.

Updating the changelog

If there are no entries after the last release, use **UNRELEASED** as the version.
If, say, your patch fixes issue #999, the entry should look like this:

* Fix big bad boo-boo in the async scheduler (`#999 <https://github.com/agronholm/apscheduler/issues/999>`_; PR by @yourgithubaccount)

If there's no issue linked, just link to your pull request instead by updating the
changelog after you've created the PR.

@i-am-darshil
Author

Hey @agronholm, whenever you get a chance, can you 👀 at this?

@agronholm
Owner

To me, it seems like this doesn't address the root cause, but simply sweeps the error message under the carpet. A better approach would be to reorganize the shutdown process in such a way that the scheduler won't even try to schedule jobs after shutdown, yes? I don't know how feasible this is in v3.x (it's already handled on v4.x) but I'd like to see an attempt made first.
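The reorganization suggested here amounts to making the running-state check and the pool submit atomic, so a stopping scheduler never reaches the pool at all. A minimal sketch of that pattern, with hypothetical names (not apscheduler's actual classes):

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class GuardedExecutor:
    """Hypothetical sketch: refuse new jobs once shutdown has begun,
    instead of letting the pool raise RuntimeError."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._lock = threading.Lock()
        self._running = True

    def submit_job(self, fn, *args):
        # Check-and-submit under one lock: no job can slip in
        # between the running check and the pool.submit call.
        with self._lock:
            if not self._running:
                return None  # scheduler is stopping; drop the job quietly
            return self._pool.submit(fn, *args)

    def shutdown(self):
        with self._lock:
            self._running = False
        self._pool.shutdown(wait=True)


ex = GuardedExecutor()
future = ex.submit_job(lambda: 42)
ex.shutdown()
assert ex.submit_job(lambda: 1) is None  # rejected, no RuntimeError
assert future.result() == 42             # pre-shutdown job still ran
```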

@i-am-darshil
Author

Hey @agronholm,
I didn't want to make big changes initially since you're already rolling out v4.x, but I agree with your suggestion. Let me try it out.

Out of curiosity, one question: I see the v3.10.4 tag is quite different from the 3.x branch, yet the only active branch is 3.x. How are changes from 3.x propagated to versions like v3.10.4?

@agronholm
Owner

Out of curiosity, one question: I see the v3.10.4 tag is quite different from the 3.x branch, yet the only active branch is 3.x. How are changes from 3.x propagated to versions like v3.10.4?

I just create tags on specific commits on the 3.x branch, there's nothing special to it. Or did I misunderstand your question?

@i-am-darshil
Author

Okay, so the shutdown process in apscheduler seems fine. It's a weird race condition happening at the concurrent.futures thread.py level.

Let me try to explain what I understand:

  1. A job gets submitted here via a pool
  2. Before the thread acquires self._shutdown_lock and _global_shutdown_lock here, let's say the program exits
  3. _python_exit acquires _global_shutdown_lock here and sets the module-level _shutdown flag to True
  4. Next, this sets the executor's _shutdown to True.
  5. Now the thread (from step 2) acquires the required locks and hits this condition here, which leads to RuntimeError('cannot schedule new futures after shutdown')

I am not sure if there is anything we can do at apscheduler level apart from sweeping the error message like done in this PR.
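For reference, that "sweep" approach boils down to catching the specific RuntimeError at submit time. A self-contained sketch of the idea (safe_submit is a made-up helper for illustration, not the PR's actual code):

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger("apscheduler.demo")


def safe_submit(pool, fn, *args):
    # Hypothetical helper: swallow the post-shutdown RuntimeError
    # instead of letting it propagate into the scheduler's logs.
    try:
        return pool.submit(fn, *args)
    except RuntimeError as exc:
        if "cannot schedule new futures" in str(exc):
            logger.debug("pool already shut down; dropping job (%s)", exc)
            return None
        raise  # unrelated RuntimeErrors still propagate


pool = ThreadPoolExecutor(max_workers=1)
future = safe_submit(pool, lambda: 42)
pool.shutdown(wait=True)
assert future.result() == 42
assert safe_submit(pool, lambda: 1) is None  # dropped quietly, no traceback
```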

@agronholm, your thoughts?

@i-am-darshil
Author

Hey @agronholm, what do you think?

@agronholm
Owner

During scheduler shutdown, ThreadPoolExecutor.shutdown(wait=True) should have been called. After that, there shouldn't be any lingering threads around. If this is not happening, we need to check why that is.
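That drain behaviour is easy to confirm: shutdown(wait=True) blocks until work the pool has already accepted finishes, so jobs submitted before shutdown are not lost. A quick check:

```python
import time
from concurrent.futures import ThreadPoolExecutor

results = []
pool = ThreadPoolExecutor(max_workers=1)
pool.submit(lambda: (time.sleep(0.1), results.append("done")))
pool.shutdown(wait=True)  # blocks until the queued job has run
print(results)  # ['done']
```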

@i-am-darshil
Author

@agronholm
ThreadPoolExecutor.shutdown(wait=True) does get called.
I ran the code locally; shutdown works as expected, and no new jobs are added post-shutdown. As mentioned above, it's a peculiar race condition: the job has already been submitted to the concurrent.futures.ThreadPoolExecutor but hasn't had a CPU cycle to move forward yet. In the meantime, because the module-level _shutdown flag gets set on program exit, the concurrent.futures.ThreadPoolExecutor is marked as shut down. Then, when the already-submitted job finally gets a CPU cycle, it sees the ThreadPoolExecutor as shut down and raises this RuntimeError.

  1. A job gets submitted here via a pool
  2. Before the thread acquires self._shutdown_lock and _global_shutdown_lock here, let's say the program exits
  3. _python_exit acquires _global_shutdown_lock here and sets the module-level _shutdown flag to True
  4. Next, this sets the executor's _shutdown to True.
  5. Now the thread (from step 2) acquires the required locks and hits this condition here, which leads to RuntimeError('cannot schedule new futures after shutdown')
