Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: ResponderDispatchService.DisposeAsync() can deadlock #305

Closed
Cyberboss opened this issue Jun 3, 2023 · 6 comments
Closed

[Bug]: ResponderDispatchService.DisposeAsync() can deadlock #305

Cyberboss opened this issue Jun 3, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@Cyberboss
Copy link
Contributor

Cyberboss commented Jun 3, 2023

Description

The _finalizer Task relies on the _dispatcher Task closing the channel to be able to complete. But it's possible for this to never happen.

The channel completion call relies on several throws possible from the CancellationToken not happening.

The deadlock is usually avoided because the _finalizer Task also usually throws out before it gets to that point

All the lines listed can throw TaskCanceledExceptions. The deadlock occurs if any of the first group throw and none of the second group throw. At that point, the app will halt forever when it awaits the Reader's completion.

Steps to Reproduce

It's very difficult to reproduce. The only example I have is a dump of https://github.com/tgstation/tgstation-server deadlocked at this point. For security reasons (bot tokens and SQL credentials) I cannot share it publicly.

Expected Behavior

ResponderDispatchService.DisposeAsync() should complete eventually.

Current Behavior

ResponderDispatchService.DisposeAsync() can deadlock waiting for a channel completion event that never comes.

Library / Runtime Information

dotnet: 6.0
Remora.Discord: 2022.48.0
tgstation-server: 5.12.4
OS: Windows Server 2022

@Cyberboss Cyberboss added the bug Something isn't working label Jun 3, 2023
Cyberboss added a commit to tgstation/tgstation-server that referenced this issue Jun 3, 2023
@Cyberboss
Copy link
Contributor Author

Also the .Completion event won't fire if there are items still in the channel https://stackoverflow.com/a/66521303

@VelvetToroyashi
Copy link
Contributor

VelvetToroyashi commented Jun 3, 2023

Quite the peculiar set of issues, that's for sure. I've been hoping to rewrite mass swaths of Remora's codebase[1][2], so I suppose I'll definitely tack this onto the list; rewriting dispatch would certainly be a worthwhile investment based on my investigations[3]

1 #254
2 Remora.Commands PR
3 Discussion in the Discord

@Cyberboss
Copy link
Contributor Author

This hacky reflection workaround was able to solve our issue: tgstation/tgstation-server#1509

Nihlus added a commit that referenced this issue Jun 6, 2023
The dispatch service used its internal cancellation token source
incorrectly, leading to deadlocking behaviour and lost events on
shutdown. This change removes the usage from all but the intended one
(to signal responders when they should cancel) and moves the stop
responsibility to the data channels instead.

Fixes #305.
@Nihlus
Copy link
Member

Nihlus commented Jun 6, 2023

Interesting - thank you for the in-depth analysis. I've pushed a suggested fix to the deadlock-fix branch, which should alleviate the problem and fix another issue I uncovered. Please take a look when you have time.

@Nihlus Nihlus closed this as completed in 8a2ece1 Jun 17, 2023
VelvetToroyashi pushed a commit to VelvetToroyashi/Remora.Discord that referenced this issue Jun 17, 2023
The dispatch service used its internal cancellation token source
incorrectly, leading to deadlocking behaviour and lost events on
shutdown. This change removes the usage from all but the intended one
(to signal responders when they should cancel) and moves the stop
responsibility to the data channels instead.

Fixes Remora#305.
@Cyberboss
Copy link
Contributor Author

I've integrated this in my software and haven't heard the same bug report, so it looks to be working

@Nihlus
Copy link
Member

Nihlus commented Aug 1, 2023

Excellent!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants