Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPIKE: Better multithreading in dispatcher. #313

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

SPIKE: Better multithreading in dispatcher. #313

wants to merge 1 commit into from

Conversation

jdantonio
Copy link

SPIKE

This is a spike based on a conversation in #310 (specifically this). It is an attempt to replace Ruby's timeout method with something safer and more reliable.

There are two failing tests but I believe these tests are buggy. See #312.

The timeout method from Ruby's standard library spawns a new thread every time it is called. It also attempts to kill the task thread on timeout which can lead to inconsistent system state. Timeout has been called Ruby's Most Dangerous API and is considered by many to be fundamentally broken. At one time the concurrent-ruby gem had its own timeout method, but it was deprecated for many of the same reasons. Using timeout within Dispatcher, then, has two potential problems. First, it creates excessive overhead by spawning timer threads. Second, it may lead to killed threads.

The critical piece in this spike is Concurrent::TimerSet A TimerSet contains a priority queue which sorts all waiting tasks by their expected trigger time. It also contains a thread which repeatedly loops over the TimerSet, triggering jobs at the appropriate time. (The thread loop actually runs on a Concurrent::SingleThreadExecutor which is highly resilient to thread errors and on JRuby maps directly to the corresponding object in java.util.concurrent. So it's also fast.) All Concurrent::ScheduledTask and Concurrent::TimerTask objects in concurrent-ruby run on a global TimerSet, but both support configuration via dependency injection.

In this spike I use a single TimerSet, local to Dispatcher, to monitor timeouts for all tasks run on the worker pool. This should significantly reduce the overhead of thread creation. Each task itself is run on a Concurrent::Future which posts to the same worker pool as before (Future supports dependency injection as well). Responses to the client are handled by an observer object assigned to each Future.

Each task is allowed to run to completion, at which tine the Future updates with the result of the task (success or failure). The Responder (observer) is then notified and it can take action. Meanwhile, the "monitor" (aforementioned TimerSet) tracks the time remaining on each task. When the timer expires the monitor attempts to trigger a timeout failure on the Future. If the Future has already finished, the fail attempt has no effect. If the Future has not finished, the fail call immediately triggers the Responder.

At no point does the system attempt to kill the thread on which the Future is running. A timeout is handled by the monitor. Should the monitor trigger a timeout, the Future will still run to completion (leaving the system in a more predictable state) but the Responder will not be triggered again. The thread will finish silently.

Before this can be merged a need to do a little cleanup, but the core of the functionality is here. Please let me know if this is a direction you would like to go and I will update the PR.

@ryanstout
Copy link
Member

@jdantonio thanks for this. I've been under the weather, should be able to get to this later this week. Thanks so much.

@jdantonio
Copy link
Author

No hurry. Just let me know what I can do to help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants