Program hangs when instantiating a GP using multiprocessing #521
Thank you for the response! Yes, they're all on the same node. I'm just trying this out on a single laptop with 4 cores. (I'm also considering distributed, though it wouldn't be with the multiprocessing package.) I'm not running out of memory either, good question; the memory requirements are very small.

It does seem to be a GPy implementation issue. I've tried toy examples with other objects and they work just fine. I know multiprocessing has issues with unpickleable data (i.e. the need to wrap the `optimize()` call). It could also be an issue with multiple instances of multiprocessing? I looked into what's happening under the hood when `Laplace()` is called, but it goes pretty deep into some other packages. I also tried replacing `Laplace()` with `expectation_propagation.EP()`, which resulted in a NotImplemented error about object 'super' not having `__getstate__`.

I look forward to help from @opkoisti and/or @alansaul. Thanks again. |
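(For context, a minimal illustration of the wrapping mentioned above, using a hypothetical `Model` class in place of a GPy model; on Python 2, `pickle` rejects bound methods, so `pool.map` needs a top-level function:)

```python
import multiprocessing

class Model(object):
    def optimize(self):
        return 1.0

def opt_wrapper(model):
    # Top-level functions pickle by reference; bound methods such as
    # model.optimize cannot be pickled on Python 2, hence the wrapper.
    return model.optimize()

if __name__ == '__main__':
    pool = multiprocessing.Pool(2)
    # pool.map(Model().optimize, ...) would raise PicklingError on Python 2
    print(pool.map(opt_wrapper, [Model(), Model()]))
```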
I have just run your code and it seemed to work fine for me. Which versions are you using? Which Python are you running on?
|
Python 2.7.9
GPy 1.7.7
I will try a more recent version of Python2 and report back.
EDIT: Also does not work on Python 2.7.13. Which versions were you using? If you increase `size` dramatically, does it still run for you?
|
I am running on Python 2.7.9 too... |
@mzwiessele Can you try increasing `size` and see if it still works? Could you check your versions of GPy and numpy (I'm on 1.13.1)? |
Hi, this is probably a problem with the Jupyter Notebook. You need to create an external python file (e.g. `opt_wrapper.py`) containing the function you pass to the pool. Then import the `opt_wrapper` and call your code normally.
If you are calling your script from the command line, put the pool setup behind an `if __name__ == '__main__':` guard.
Also, consider wrapping your `optimize()` call (see the sketch below).
|
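(A sketch of the layout suggested above; the file name `opt_wrapper.py` and the GP construction are assumptions for illustration, not code from this thread:)

```python
# opt_wrapper.py -- worker code lives in its own importable module
import numpy as np
import GPy

def opt_wrapper(seed):
    # Build and optimize a small GP entirely inside the worker process.
    rng = np.random.RandomState(seed)
    X = rng.rand(50, 1)
    Y = np.sin(6 * X) + 0.1 * rng.randn(50, 1)
    m = GPy.models.GPRegression(X, Y)
    m.optimize()
    return float(m.log_likelihood())
```

```python
# main.py -- import the wrapper and guard the pool for command-line use
import multiprocessing
from opt_wrapper import opt_wrapper

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)
    print(pool.map(opt_wrapper, range(8)))
    pool.close()
    pool.join()
```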
Hi @ahartikainen, I was not using Jupyter Notebook; Python was executed from the command line, so I don't think those changes would fix the problem. I've moved on from this project, but the issue was actually a limitation of Python's multiprocessing, which uses OS pipes under the hood and is therefore limited by buffer sizes. This explains why the program works when `size` is small. One (of many) explanations here: https://sopython.com/canon/82/programs-using-multiprocessing-hang-deadlock-and-never-complete/ |
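(The failure mode described on that page can be reproduced without GPy at all; a self-contained demonstration of the pipe-buffer deadlock:)

```python
import multiprocessing

def worker(q):
    # ~1 MB exceeds the OS pipe buffer; the queue's feeder thread blocks
    # until the parent reads, so the child process never exits on its own.
    q.put("x" * (1 << 20))

if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    p.join()        # deadlock: joining before draining the queue
    print(q.get())  # never reached; calling q.get() before join() fixes it
```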
Thanks for the follow-up. Interesting problem. What was your OS, if I may ask? |
@ahartikainen no problem! It's a bummer, since GPs can get quite data-intensive. I imagine it could be fixed by chunking up the GP's data, but that was more work than I could afford at the time. I'm on Mac OS X El Capitan, which I believe can handle up to 64 kB buffers, but I have no idea how it breaks that up. |
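(That fix was never written up in this thread; as one hypothetical way to keep large payloads off the 64 kB pipe, assuming the model pickles cleanly, each worker can persist its optimized GP to disk and return only a short path string:)

```python
import os
import pickle
import tempfile

import numpy as np
import GPy

OUT_DIR = tempfile.gettempdir()

def optimize_to_disk(seed):
    # Build and optimize the GP inside the worker, pickle it to disk, and
    # send only the small path string back through the pipe.
    rng = np.random.RandomState(seed)
    X = rng.rand(200, 1)
    Y = np.sin(6 * X) + 0.1 * rng.randn(200, 1)
    m = GPy.models.GPRegression(X, Y)
    m.optimize()
    path = os.path.join(OUT_DIR, 'gp_%03d.pkl' % seed)
    with open(path, 'wb') as f:
        pickle.dump(m, f)
    return path
```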
I had a similar problem while benchmarking on AMD and Intel CPUs: AMD performed poorly with GPy multiprocessing, but Intel did well. Does anyone have a similar experience? |
I have the same issue on my Mac when I try to run GPy in parallel on macOS. |
I'm trying to do what seems like a simple task: use multiprocessing to parallelize the `optimize()` call over many unique GPs. Here's a minimal example of what I'm trying to do.
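(The original snippet was lost from this page; the following is a reconstruction from the details given in the thread, a `size` variable, GPs built with `Laplace()`, and a wrapped `optimize()` mapped over a pool. The kernel, likelihood, and data are guesses:)

```python
import multiprocessing

import numpy as np
import GPy
from GPy.inference.latent_function_inference import Laplace

size = 100  # works when size is below ~60; hangs for larger values

def make_gp(seed):
    # Each GP gets its own random data set of `size` points.
    rng = np.random.RandomState(seed)
    X = rng.rand(size, 1)
    Y = np.sin(6 * X) + 0.1 * rng.randn(size, 1)
    return GPy.core.GP(X, Y, kernel=GPy.kern.RBF(1),
                       likelihood=GPy.likelihoods.Gaussian(),
                       inference_method=Laplace())

def opt_wrapper(gp):
    # Top-level wrapper: bound methods are not picklable in Python 2.
    return gp.optimize()

if __name__ == '__main__':
    gps = [make_gp(i) for i in range(4)]
    print("Starting pool...")
    pool = multiprocessing.Pool(1)
    results = pool.map(opt_wrapper, gps)  # hangs here when size is large
```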
The program simply hangs after printing "Starting pool..." Annoyingly, it also results in a zombie process for each worker in the pool (just 1 in this example).
The program works just fine when any one of the following conditions is true:

- `size` is less than about 60. For larger values, it simply hangs after printing "Starting pool..."
- `Laplace()` is replaced with `None`. This is because the Gaussian likelihood then defaults to `ExactGaussianInference()`; however, my actual project's likelihood is custom and requires `Laplace()`.
- `pool.map` is replaced with the built-in `map`, so it seems to be an issue of creating a GP with Laplace over a certain size within a new process.

Lastly, it still breaks when you replace `return gp.optimize()` with `return 1`. Similarly, the following program hangs (same `imports`):
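(This second snippet was also lost; a hypothetical version consistent with the surrounding description, reusing `make_gp` and the imports from the sketch above, with no optimization and only an integer returned:)

```python
def return_one(gp):
    # No optimize() call and a tiny return value; the hang persists
    # because each GP must still be pickled through a pipe to the worker.
    return 1

if __name__ == '__main__':
    gps = [make_gp(i) for i in range(4)]
    print("Starting pool...")
    pool = multiprocessing.Pool(1)
    results = pool.map(return_one, gps)  # still hangs for large size
```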
It seems to be an issue of instantiating/copying a GP--both with Laplace and above a certain size--within a new process. Seems highly odd and highly specific. Any help greatly appreciated.