[not urgent] MPI issue with mpirun #274
Comments
Using a request of 8 processes/gpus (
I wonder if there is some sort of node misconfiguration or hang.
From here this looks like the executable may not be found on the filesystem on all nodes. I wonder if this suggests a GPFS synchronization hiccup.
What level of importance is this? It's Sunday. It's Father's Day. Are you asking me to look at this right now @jchodera?
Sorry, no! Absolutely not!
This was just "I am debugging, putting info online just in case useful to others". Will mark as non-urgent.
Thank you. While I do not see anything obvious at the moment, I will investigate with you Monday.
When is a good time to try to debug this?
Any chance we could look into this Tue at 12:00 PM EST? This isn't urgent, but I've noticed some other issues that seem to be due to minor environment differences among nodes. I'm still trying to do more debugging on that (and am slammed with meetings today).
Sounds good.
Was this handled? I believe you said we would review it when discussing #275, which is indeed confirmed as a bug. I suspect it will be some time before Adaptive issues a fix (unless we declare it critical).
I'm having trouble with an MPI job (3671529). The error on mpirun launch is:
Here is the request block:
Just documenting this for now while I debug in case someone else has a similar issue.
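For anyone hitting something similar: one way to test the suggestion above that the executable may not be visible on every node is a small per-node check. This is only a sketch, not something that was run in this thread. The function name `check_executable` and the script name in the usage note are hypothetical, and it assumes Python is available on all nodes.

```python
import shutil
import socket
import sys


def check_executable(name_or_path):
    """Report whether an executable is visible from the current node."""
    # shutil.which resolves bare names via PATH and also accepts explicit
    # paths; it returns None if nothing executable is found.
    resolved = shutil.which(name_or_path)
    return {
        "host": socket.gethostname(),
        "requested": name_or_path,
        "resolved": resolved,
        "found": resolved is not None,
    }


if __name__ == "__main__":
    # Default to the Python interpreter itself if no target is given.
    target = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    report = check_executable(target)
    status = "OK" if report["found"] else "MISSING"
    print(f"{report['host']}: {status} {report['requested']} -> {report['resolved']}")
```

Saved as, say, `check_exe.py` (a made-up name) and launched with the same mpirun line as the failing job, each rank prints its hostname and whether the target resolved there. A node that reports MISSING would point toward a filesystem or environment difference rather than an MPI problem.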