Fix sporadic ansible-runner macOS bug and remove extra listener thread #23229
I added the exception spec because of the comment in the old thread code about "raises an exception immediately". I believe you were hitting problems where `listener.start` hadn't yet been called from the thread, so the later `listener.stop` could get called first. However, if we move `listener.start` to the main thread, that case can't happen, because `start` will definitely complete and internally transition to "started" (whether the internal threads have actually started notwithstanding).
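To illustrate the ordering argument above, here is a minimal sketch. `OrderedListener` is a hypothetical stand-in, not the listen gem's API, and the assumption that `stop` raises unless `start` has completed is mine, made only to make the ordering visible.

```ruby
# Hypothetical listener with an explicit state; not the listen gem's API.
class OrderedListener
  attr_reader :state

  def initialize
    @state = :initialized
  end

  def start
    @state = :started
    self
  end

  # Assumption for illustration: stopping before starting is an error.
  def stop
    raise "stop called before start" unless @state == :started
    @state = :stopped
    self
  end
end

# Start deferred to a thread: stop can run before start has been called.
deferred = OrderedListener.new
starter  = Thread.new { sleep 0.05; deferred.start } # simulated scheduling delay
early_stop_error =
  begin
    deferred.stop
    nil
  rescue RuntimeError => e
    e.message
  end
starter.join

# Start on the calling thread: start has returned before stop runs.
direct = OrderedListener.new
direct.start
direct.stop
```

With the start on the calling thread there is no interleaving left to reason about: `stop` can only run after `start` has returned.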
1834baa to f09580b
It was found that on macOS a sporadic failure would occur where the internal listener, while started, wouldn't actually start listening right away, causing the listener to miss the file we were waiting for.

The listen gem [creates an internal thread, `@run_thread`][1], which on most target systems is where the actual listening is done. However, [on macOS, `@run_thread` creates a second thread, `@worker_thread`][2], which does the actual listening. It's possible that although the listener is started, the `@worker_thread` hasn't actually started yet. This leaves a window where the `target_path` we are waiting on can be created before the `@worker_thread` is started, and we "miss" the creation of the `target_path`. This commit ensures that we won't move on until that thread is ready, so we can't miss the creation of the `target_path`.

`Ansible::Runner#wait_for` was also creating an extra thread when starting the listener, but this thread is unnecessary as the listen gem creates its own internal thread under the covers.

[1]: https://github.com/guard/listen/blob/f186b2fa159a2458f3ff7e8680c3a4fcbdc636d1/lib/listen/adapter/base.rb#L75
[2]: https://github.com/guard/listen/blob/f186b2fa159a2458f3ff7e8680c3a4fcbdc636d1/lib/listen/adapter/darwin.rb#L49
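The window described in the commit message can be sketched with the standard library alone. This is not the actual change in this PR and does not use the listen gem: `SketchWatcher`, `wait_until_ready`, `notify`, and the `/tmp/target_path` string are hypothetical names, and the 50 ms sleep is an assumed stand-in for the worker thread's startup latency.

```ruby
# Hypothetical watcher whose real work happens on an inner worker thread.
class SketchWatcher
  attr_reader :seen

  def initialize
    @ready    = Queue.new # worker pushes a token once it is watching
    @watching = false
    @seen     = []
  end

  # Returns as soon as the worker thread is spawned, which can be
  # before the worker has begun watching.
  def start
    @worker = Thread.new do
      sleep 0.05          # assumed startup latency of the worker
      @watching = true
      @ready << :ready
    end
    self
  end

  # Blocks the caller until the worker has signalled that it is watching.
  def wait_until_ready
    @ready.pop
    self
  end

  # Simulates a filesystem event; it is dropped if nothing is watching yet.
  def notify(path)
    @seen << path if @watching
  end
end

# Without the handshake, the event can land inside the startup window.
racy = SketchWatcher.new.start
racy.notify("/tmp/target_path")

# With the handshake, the event can only arrive once the worker is watching.
safe = SketchWatcher.new.start.wait_until_ready
safe.notify("/tmp/target_path")
```

The point of the handshake is that "started" and "watching" are separate states, and the caller only proceeds once the second one is reached.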
f09580b to 5ae61a1
Checked commit Fryguy@5ae61a1 with ruby 3.1.5, rubocop 1.56.3, haml-lint 0.51.0, and yamllint
Oh yeah if
Okay, I've run this 10 times with the full suite in parallel (48 threads) on Linux with no failures, so I'm pretty confident this will work.
So before just removing the thread, I wanted to understand why it was there originally.
In the first implementation of this we used `INotify::Notifier`, and `notifier.run` was a blocking call, so when we moved to Listener we could have dropped the thread at that point.
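A rough sketch of why the wrapper thread made sense for a blocking `run` and stops being needed once the listener spawns its own thread. Both classes are hypothetical stand-ins, not the real `INotify::Notifier` or listen gem APIs.

```ruby
# Hypothetical notifier whose run call blocks the caller until stopped.
class BlockingNotifier
  def initialize
    @stop = Queue.new
  end

  def run
    @stop.pop # blocks here until stop is called
    :finished
  end

  def stop
    @stop << :stop
  end
end

# Hypothetical listener that spawns its own thread and returns immediately.
class SelfThreadedListener
  def initialize
    @stop = Queue.new
  end

  def start
    @thread = Thread.new { @stop.pop }
    self
  end

  def stop
    @stop << :stop
    @thread.join
    :stopped
  end
end

# A blocking run needs a wrapper thread so the caller can keep going.
notifier        = BlockingNotifier.new
wrapper         = Thread.new { notifier.run }
notifier.stop
blocking_result = wrapper.value

# A self-threaded listener can be started directly from the caller.
listener        = SelfThreadedListener.new.start
threaded_result = listener.stop
```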
LGTM
@agrare Please review. I'd also like it if you could verify this on Linux, since the removal of that extra thread could affect that path. I'm concerned that while macOS has this special case, there might be a different, but similar, special case on Linux we need to account for after calling `listener.start`.

I've also considered opening a bug in the upstream listen gem but would like your thoughts. IMO, `listener.start` shouldn't return until the underlying watching has actually started.