# Calling from Python multiprocessing causes hanging #206
```python
from multiprocessing import Pool

def printpid(i):
    # from julia import Julia
    # Julia(debug=True)
    from julia.Base import println, getpid
    println("pid = ", getpid(), "i = ", i)

pool = Pool()
pool.map(printpid, range(4))
```

works for me. Tested with master.

---
Note that …. See, e.g., https://docs.julialang.org/en/v1/stdlib/Distributed/index.html#Distributed.pmap

---
Thanks for your reply. I ended up remedying this situation by not calling …. It's a bit of a specific scenario I'm involved in, as I'm trying to use …. I'm currently experiencing segfaults when the parallel pool is closed (the results themselves seem fine).

---
Yeah, … instead of segfault. By "the results themselves seem fine", do you mean the returned results are correct inside the …?

---
Yes, that's correct: the results computed by my function seem to be correct (in line with running them serially outside of …). If that's helpful, ….

---
These lines … could mean it's GC related. But this simple script

```python
import os
import threading
import time

def my_ident(_):
    time.sleep(0.001)
    return (os.getpid(), threading.get_ident(), threading.main_thread().ident)

if __name__ == "__main__":
    from multiprocessing import Pool
    import pprint

    num_pool = 4
    pool = Pool(num_pool)
    results = pool.map(my_ident, range(1000))
    pprint.pprint(sorted(set(results)))
    assert all(r[1] == r[2] for r in results)
```

shows that the function is executed in the main thread. So I'm not sure why …. Does …?

---
Using Distributed to fake multiprocessing is not super hard. Here is a quick hack:

```python
# distributed.py
import os

def worker(i):
    return (i, os.getpid())

def main():
    from julia import Main
    Main.eval("""
    using Distributed
    @everywhere using PyCall
    @everywhere pushfirst!(PyVector(pyimport("sys")["path"]), @__DIR__)
    @everywhere function runpython(fullname, arg)
        m, f = rsplit(fullname, ".", limit=2)
        return pyimport(m)[f](arg)
    end
    function mappy(fullname, arglist)
        return Distributed.pmap(runpython, Iterators.cycle([fullname]), arglist)
    end
    """)
    results = Main.mappy("distributed.worker", range(3))
    print(results)

if __name__ == "__main__":
    from julia.Distributed import addprocs
    addprocs(3)
    main()
```

---
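The `runpython` helper in the hack above resolves a dotted `"module.function"` string and calls it with one argument. The same lookup can be sketched in plain Python; `run_by_name` is a hypothetical name for illustration, not part of the hack itself:

```python
import importlib

def run_by_name(fullname, arg):
    # Split "pkg.mod.func" into a module path and a function name,
    # import the module, then call the function with one argument.
    mod_name, func_name = fullname.rsplit(".", 1)
    func = getattr(importlib.import_module(mod_name), func_name)
    return func(arg)

print(run_by_name("math.sqrt", 9.0))  # -> 3.0
```

This is the usual trick for shipping a picklable "function reference" (a string) to workers instead of the function object itself.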
Unfortunately in ….

---
In my example, each Julia worker calls the Python function ….

Side note: to use PyJulia inside …:

```python
def worker(i):
    from julia import Julia
    Julia(init_julia=False)
    from julia import ...
    # do Python & Julia stuff
```

---
JuliaPy/PyCall.jl#603 may solve this issue.

---

I will give it a try, thank you!

---

Also try pyjulia master now that #216 is merged.

---
Hmm, unfortunately both of those changes still result in similar stacktrace dumps.

---
From the path component ….

---
I have tried the combination of #216 and JuliaPy/PyCall.jl#603, but unfortunately I'm still getting these segfaults; it's a head-scratcher. I won't be able to follow up on this further, unfortunately, as I am leaving my current job, but I really appreciate your time looking into this thus far.

---
Thanks for the confirmation! I guess I'd better run an MWE myself to dig into it further.

---
Confirming I have this problem too. A simple include with multiprocessing gives me errors much like the one above.

---
By the way, has this issue been solved in the latest pyjulia (Python package) with PyCall (Julia package)? If not, what should we do to work around it? Is this issue related to symptoms like Julia processes remaining unfinished while the Python script moves on?

---
Also asking if anyone has a solution for this? Running @tkf's MWE from #206 (comment) on Linux / Julia 1.5 / Python 3.7.3 / pyjulia 0.5.3 / PyCall v1.91.4 via … still fails. Maybe ….

---
I think what I meant was that …. Now that Julia has a proper, stable threading interface, I think it's better to properly handle the GIL in PyCall. It's probably possible to throw a Python error when called from a non-primary thread, before unsafely invoking the Julia runtime.

---
Sorry, I think I'm not understanding something. On the worker side, I think the following confirms the call into Julia is only done on the main thread:

```python
# tweak your MWE to this:
def printpid(i):
    print("tid=", threading.current_thread(), " pid=", os.getpid())
    println("pid = ", getpid(), "i = ", i)
```

Now this is printed before the segfault:

```
tid= <_MainThread(MainThread, started 139989811532352)> pid= 293764
tid= <_MainThread(MainThread, started 139989811532352)> pid= 293765
tid= <_MainThread(MainThread, started 139989811532352)> pid= 293766
tid= <_MainThread(MainThread, started 139989811532352)> pid= 293767
```

I think the segfault originates in the Julia call, not in any of the communication code on either the worker or master side, so why are threads/GIL relevant here?

---
I missed this.

```python
import multiprocessing
from multiprocessing import Pool

def printpid(i):
    from julia.Base import println, getpid
    println("pid = ", getpid(), "i = ", i)

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn")  # added this
    pool = Pool()
    pool.map(printpid, range(4))
    print("DONE!")
```

produces a slightly "better" result, in the sense that I can now see ….

Of course, this doesn't explain why the fork-based method doesn't work. But IIUC it's very tricky to use fork with multithreaded software. I think ….
It doesn't happen in my MWE, but the scenario I described could happen if you return something from the function you passed to ….

---
Thanks, yeah, same behavior for me with that script. However, …. Actually, it appears you can use …:

```python
import multiprocessing
from multiprocessing import Pool

# uncomment to trigger segfault:
# from julia.Base import println, getpid

def printpid(i):
    from julia.Base import println, getpid
    println("pid = ", getpid(), "i = ", i)

pool = Pool()
pool.map(printpid, range(4))
print("DONE!")
```

That may be enough of a workaround in my case, testing...
Do you mean with …?

---

With dynamically linked Python or a custom sysimage; not ….

---
Hello! I'm currently trying to invoke a Julia function via pyjulia. It works great in the serial Python case, but for some reason it hangs when invoked through a Python multiprocessing pool (via a map operation, say). Has anyone else run into this problem and, if so, found a solution?