[core] Add a thread checker #47966
Signed-off-by: dentiny <[email protected]>
@@ -41,3 +41,10 @@ cc_library(
        "@nlohmann_json",
    ],
)

cc_library(
I would like to propose we separate libraries and binaries into separate targets, as I mentioned here: #47928 (comment)
I understand that `thread_checker` is duplicated in two targets, `utils` and `thread_checker`, since `utils` simply declares all headers/impls as its dependency; I will make another PR to clean this up.
src/ray/util/thread_checker.cc
Outdated
namespace ray {

bool ThreadChecker::ok() const {
The overall runtime cost is negligibly small; I'm pretty confident I could recover these CPU cycles elsewhere.
If developers are really concerned about it (which means they care about nanosecond-level latency), they could use `RAY_DCHECK` so that the check is skipped in release builds.
hmm. I'm ok with this since However let's come up with a good name. Next, I don't think we need the Next. do we want it copyable/movable? needs some thoughts.
Thoughts?
An atomic variable is not copyable or movable, so same as
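The point about atomics can be verified at compile time. The sketch below assumes a checker that stores its owning thread in a `std::atomic<std::thread::id>`; `CheckerMemberSketch` is a hypothetical stand-in for illustration, not the class from this PR.

```cpp
#include <atomic>
#include <thread>
#include <type_traits>

// Hypothetical stand-in: a type holding an atomic thread id as a data member.
struct CheckerMemberSketch {
  std::atomic<std::thread::id> thread_id{std::thread::id{}};
};

// std::atomic deletes its copy constructor and declares no move constructor,
// so neither copy nor move construction is available.
static_assert(!std::is_copy_constructible<std::atomic<std::thread::id>>::value,
              "atomic is not copy-constructible");
static_assert(!std::is_move_constructible<std::atomic<std::thread::id>>::value,
              "atomic is not move-constructible");

// A class with an atomic data member inherits the same restriction.
static_assert(!std::is_copy_constructible<CheckerMemberSketch>::value,
              "wrapper is not copy-constructible");
static_assert(!std::is_move_constructible<CheckerMemberSketch>::value,
              "wrapper is not move-constructible");
```

Because such a type is neither copyable nor movable, a lambda cannot capture it by value; it has to be reached through a reference or a pointer such as `this`.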
Signed-off-by: dentiny <[email protected]>
Sounds good! Updated to
Then we'd have a bad time. Most of our usage is about "this function and its callbacks run on the same thread". If we can't copy/move the atomic into the lambda callback, we can't really use it.
No worries at all; generally the thread checker should be a data member. Check out the current code impl: the callback is already capturing `this`.
@rynewang Let me clarify further. Intended use case is
No copy or move is needed. Let me know if you have other thoughts, thanks.
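To make the "data member plus callback capturing `this`" pattern concrete, here is a minimal sketch. It assumes a checker that binds to the first calling thread via a compare-exchange on a `std::atomic<std::thread::id>`; `ThreadCheckerSketch`, `IsOnSameThread`, and `ManagerSketch` are hypothetical names, not the PR's actual code.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

// Hypothetical stand-in for the checker discussed in this PR.
class ThreadCheckerSketch {
 public:
  // Returns true on the first call (binding the checker to the calling
  // thread) and on every later call made from that same thread.
  bool IsOnSameThread() const {
    const std::thread::id cur = std::this_thread::get_id();
    std::thread::id expected{};  // default id means "not bound yet"
    return thread_id_.compare_exchange_strong(expected, cur) || expected == cur;
  }

 private:
  mutable std::atomic<std::thread::id> thread_id_{std::thread::id{}};
};

// Hypothetical manager: the checker is a data member, and callbacks reach it
// through `this`, so the checker itself is never copied or moved.
class ManagerSketch {
 public:
  std::function<void()> MakeCallback() {
    // Debug-only check; compiled out when NDEBUG is defined.
    assert(checker_.IsOnSameThread());
    return [this]() {
      assert(checker_.IsOnSameThread());
      ++handled_;
    };
  }

  int handled() const { return handled_; }

 private:
  ThreadCheckerSketch checker_;
  int handled_ = 0;
};
```

The lambda stores only a pointer to the manager, so the usual lifetime caveat applies: the manager must outlive every callback it hands out.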
Makes sense. I actually found that we already have something similar to yours, implemented with a mutex. Once this PR is merged we can replace it with yours. Line 354 in 6d8953b
Sounds good, I can follow up with another PR after this one is merged.
@rynewang Could you please help me merge the PR? Thanks!
`ThreadChecker` is a cheap runtime checker that takes only two atomic operations, and which
- guards against violations of threading invariants;
- improves code readability, so that code readers and developers can easily understand the thread execution status.

Signed-off-by: dentiny <[email protected]>
Signed-off-by: ujjawal-khare <[email protected]>
As @rynewang mentioned in comment #47793 (comment), Ray core uses an ASIO io context for event scheduling and callback execution.
From merely reading through the code, it's not always easy to confirm that threading usage is correct and free of data races.
TSAN is a great tool to verify against data race and invalid memory access at runtime, but is generally not enough, since
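`ThreadChecker` is a cheap runtime checker that takes only two atomic operations, and which
- guards against violations of threading invariants;
- improves code readability, so that code readers and developers can easily understand the thread execution status.

Intended usage for `GcsJobManager`: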
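As a rough illustration of this kind of usage (not the PR's actual `GcsJobManager` code), the sketch below assumes a checker that binds to the first thread that calls it and reports any call from a different thread. `ThreadCheckerSketch`, `IsOnSameThread`, `JobManagerSketch`, and `HandleAddJob` are hypothetical names, and the handler returns `false` rather than aborting so the behavior can be observed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the checker: binds to the first calling thread.
class ThreadCheckerSketch {
 public:
  bool IsOnSameThread() const {
    const std::thread::id cur = std::this_thread::get_id();
    std::thread::id expected{};  // default id means "not bound yet"
    // On failure, compare_exchange_strong stores the bound id in `expected`.
    return thread_id_.compare_exchange_strong(expected, cur) || expected == cur;
  }

 private:
  mutable std::atomic<std::thread::id> thread_id_{std::thread::id{}};
};

// Hypothetical manager whose handlers are all expected to run on a single
// event-loop thread.
class JobManagerSketch {
 public:
  // Real code would likely wrap the check in an assertion macro; returning
  // false here keeps the sketch testable.
  bool HandleAddJob() {
    if (!thread_checker_.IsOnSameThread()) {
      return false;  // threading invariant violated
    }
    ++num_jobs_;
    return true;
  }

  int num_jobs() const { return num_jobs_; }

 private:
  ThreadCheckerSketch thread_checker_;
  int num_jobs_ = 0;
};

// Exercises the sketch: same-thread calls pass, a call from another thread
// is rejected and leaves the state untouched.
inline void DemoJobManagerSketch() {
  JobManagerSketch mgr;
  const bool first = mgr.HandleAddJob();
  const bool second = mgr.HandleAddJob();
  bool from_other_thread = true;
  std::thread other([&] { from_other_thread = mgr.HandleAddJob(); });
  other.join();
  assert(first && second);
  assert(!from_other_thread);
  assert(mgr.num_jobs() == 2);
  (void)first;
  (void)second;
  (void)from_other_thread;
}
```

Unlike TSAN, a check of this shape fires deterministically on the first cross-thread call, whether or not an actual data race happens during that run.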