[Bug]: ResponderDispatchService.DisposeAsync() can deadlock #305
Comments
Also the
Quite the peculiar set of issues, that's for sure. I've been hoping to rewrite large swaths of Remora's codebase[1][2], so I suppose I'll definitely tack this onto the list; rewriting dispatch would certainly be a worthwhile investment based on my investigations[3]
This hacky reflection workaround was able to solve our issue: tgstation/tgstation-server#1509
The dispatch service used its internal cancellation token source incorrectly, leading to deadlocking behaviour and lost events on shutdown. This change removes the usage from all but the intended one (to signal responders when they should cancel) and moves the stop responsibility to the data channels instead. Fixes #305.
Interesting - thank you for the in-depth analysis. I've pushed a suggested fix to the
I've integrated this in my software and haven't heard the same bug report, so it looks to be working
Excellent!
Description
The _finalizer Task relies on the _dispatcher Task closing the channel to be able to complete, but it's possible for this to never happen. The channel completion call is only reached if none of several possible throws from the CancellationToken happen first.
Remora.Discord/Backend/Remora.Discord.Gateway/Services/ResponderDispatchService.cs
Line 143 in 1f248a4
Remora.Discord/Backend/Remora.Discord.Gateway/Services/ResponderDispatchService.cs
Line 151 in 1f248a4
Remora.Discord/Backend/Remora.Discord.Gateway/Services/ResponderDispatchService.cs
Line 165 in 1f248a4
The deadlock is usually avoided because the _finalizer Task also usually throws before it gets to that point.
Remora.Discord/Backend/Remora.Discord.Gateway/Services/ResponderDispatchService.cs
Lines 188 to 191 in 1f248a4
Remora.Discord/Backend/Remora.Discord.Gateway/Services/ResponderDispatchService.cs
Line 202 in 1f248a4
All the lines listed can throw TaskCanceledExceptions. The deadlock occurs if any line in the first group throws and none in the second group does. At that point, the app halts forever when it awaits the Reader's completion.
Steps to Reproduce
It's very difficult to reproduce. The only example I have is a dump of https://github.com/tgstation/tgstation-server deadlocked at this point. For security reasons (bot tokens and SQL credentials) I cannot share it publicly.
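The shape of the hazard can be sketched in isolation. This is a minimal illustration under assumptions, not the code in ResponderDispatchService: DispatcherAsync, FinalizerAsync and FinalizerHangsAsync are hypothetical names, and the real tasks do considerably more work. It only shows that a writer completion placed after a cancellable await is skipped when that await throws, which leaves the reader's Completion task pending forever.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class DeadlockSketch
{
    // Hypothetical stand-in for the dispatcher: the channel is only completed
    // if the cancellable await before it does not throw.
    private static async Task DispatcherAsync(ChannelWriter<int> writer, CancellationToken ct)
    {
        await Task.Delay(Timeout.Infinite, ct); // throws TaskCanceledException once ct is cancelled
        writer.Complete();                      // skipped when the line above throws
    }

    // Hypothetical stand-in for the finalizer on the path where none of its
    // own cancellable awaits throw: it waits for the reader to complete.
    private static async Task FinalizerAsync(ChannelReader<int> reader)
    {
        await reader.Completion;
    }

    // Returns true if the finalizer is still pending after shutdown was requested.
    public static async Task<bool> FinalizerHangsAsync()
    {
        var channel = Channel.CreateUnbounded<int>();
        using var cts = new CancellationTokenSource();

        var dispatcher = DispatcherAsync(channel.Writer, cts.Token);
        var finalizer = FinalizerAsync(channel.Reader);

        cts.Cancel();
        try
        {
            await dispatcher;
        }
        catch (OperationCanceledException)
        {
            // The dispatcher exited through the exception, so the writer was never completed.
        }

        // Bounded wait so the sketch terminates; a plain await here would hang.
        var first = await Task.WhenAny(finalizer, Task.Delay(TimeSpan.FromSeconds(1)));
        return first != finalizer;
    }
}
```

In the real service the window is much narrower, since the finalizer usually throws as well, which would be consistent with the problem only showing up in a production dump.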
Expected Behavior
ResponderDispatchService.DisposeAsync() should complete eventually.
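One way to make that completion always reachable, in the spirit of the fix described in the comments (stop responsibility moved to the data channels), is to signal shutdown by completing the channel writer instead of cancelling a token. The sketch below is an assumption about the general pattern, not the actual change: DrainAsync and StopByCompletingAsync are hypothetical names, and the int payload stands in for gateway events.

```csharp
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelShutdownSketch
{
    // Hypothetical consumer: reads until the writer is completed and the
    // channel is empty. No cancellation token is involved, so there is no
    // exception path that can skip the remaining items.
    private static async Task<int> DrainAsync(ChannelReader<int> reader)
    {
        var handled = 0;
        await foreach (var item in reader.ReadAllAsync())
        {
            handled++;
        }

        return handled;
    }

    // Returns the number of items handled before shutdown finished.
    public static async Task<int> StopByCompletingAsync()
    {
        var channel = Channel.CreateUnbounded<int>();
        var drain = DrainAsync(channel.Reader);

        for (var i = 0; i < 3; i++)
        {
            channel.Writer.TryWrite(i);
        }

        // Shutdown signal: complete the writer rather than cancel a token.
        channel.Writer.Complete();

        var handled = await drain;

        // Completes because the writer is done and every item has been read.
        await channel.Reader.Completion;
        return handled;
    }
}
```

Because the consumer exits only after the channel is drained, items written before shutdown are still handled, and awaiting the reader's Completion cannot wait on an event that never comes.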
Current Behavior
ResponderDispatchService.DisposeAsync() can deadlock waiting for a channel completion event that never comes.
Library / Runtime Information
dotnet: 6.0
Remora.Discord: 2022.48.0
tgstation-server: 5.12.4
OS: Windows Server 2022