Fix halt on 'PROXYSQL RESUME' command #4757

Merged (3 commits) on Nov 26, 2024


JavierJF
Collaborator

Issue Description

During a RESUME operation, if a 'MySQL_Thread' is bootstrapping listeners (in MySQL_Thread::run_BootstrapListener) and detects that MySQL_Threads_Handler::bootstrapping_listeners is 'false', the thread prematurely clears its own bootstrapping flag in 'mypolls' (ProxySQL_Poll::bootstrapping_listeners). Since this thread will then never bootstrap its corresponding listening sockets, the other worker threads stall waiting on it, eventually triggering the watchdog and crashing the instance.

Solution

Simplified the logic to use a single flag for bootstrapping (MySQL_Threads_Handler::bootstrapping_listeners). Also introduced a sensible delay to reduce the overhead of worker threads busy-waiting for their turn to start their listening sockets, as well as the corresponding overhead on the Admin thread while it performs the PAUSE operation.
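
For readers unfamiliar with the pattern, here is a minimal standalone sketch of sequential bootstrapping driven by one shared flag, with a short sleep between checks instead of a tight spin. It is not the ProxySQL code: `BootstrapState`, `turn`, `kWaitDelay`, `worker_bootstrap` and `run_resume_sketch` are hypothetical names, and the 200 microsecond delay is an arbitrary assumption, not the value used in this PR.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical stand-in for the state shared through the threads handler.
struct BootstrapState {
    std::atomic<bool> bootstrapping_listeners{false}; // the single shared flag
    std::atomic<int> turn{0};              // assumption: index of the thread allowed to start
    std::atomic<int> listeners_started{0}; // counts simulated listener starts
};

// Assumed delay between checks; the real value in ProxySQL may differ.
constexpr std::chrono::microseconds kWaitDelay{200};

// Each worker waits for its turn, "starts" its listener, then hands over.
// Workers only read the shared flag; they keep no per-thread copy that
// could be cleared too early.
void worker_bootstrap(BootstrapState& st, int id, int num_threads) {
    while (st.turn.load() != id) {
        std::this_thread::sleep_for(kWaitDelay); // sleep instead of spinning
    }
    st.listeners_started.fetch_add(1); // placeholder for opening the sockets
    if (id == num_threads - 1) {
        st.bootstrapping_listeners.store(false); // last thread ends bootstrapping
    } else {
        st.turn.store(id + 1); // pass the turn to the next worker
    }
}

// Plays the Admin side: raises the flag, then waits (with the same delay)
// until the last worker clears it. Returns how many listeners were started.
int run_resume_sketch(int num_threads) {
    assert(num_threads > 0);
    BootstrapState st;
    st.bootstrapping_listeners.store(true);

    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&st, i, num_threads] {
            worker_bootstrap(st, i, num_threads);
        });
    }
    while (st.bootstrapping_listeners.load()) {
        std::this_thread::sleep_for(kWaitDelay);
    }
    for (auto& w : workers) {
        w.join();
    }
    return st.listeners_started.load();
}
```

Calling `run_resume_sketch(4)` runs four workers that start their simulated listeners one after another. Because only the last worker clears the flag, no thread can drop out of the sequence before its turn.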

During a RESUME operation, if a 'MySQL_Thread' is bootstrapping
listeners (in `MySQL_Thread::run_BootstrapListener`) and detects that
`MySQL_Threads_Handler::bootstrapping_listeners` is 'false', the thread
prematurely clears its own bootstrapping flag in 'mypolls'
(`ProxySQL_Poll::bootstrapping_listeners`). Since this thread will then
never bootstrap its corresponding listening sockets, the other worker
threads stall waiting on it, eventually triggering the watchdog and
crashing the instance.

Since the bootstrapping operation is sequential, all threads except the
one currently starting its listening sockets are expected to be in an
active wait. A sensible delay has been introduced to reduce the overhead
of that wait.
Since stopping each worker thread's listeners is performed during
'maintenance_loops', the active wait taking place in 'listener_del' is
likely to take some time. A sensible delay has been added to reduce
unnecessary load.
This is a follow-up of commit #19c8f8698
@renecannao renecannao merged commit 7c64542 into v2.7 Nov 26, 2024
86 of 116 checks passed
renecannao added a commit that referenced this pull request Nov 26, 2024
 Fix halt on 'PROXYSQL RESUME' command - Port of #4757