
Program hangs when instantiating a GP using multiprocessing #521

brendenpetersen opened this issue Jul 6, 2017 · 12 comments
@brendenpetersen

I'm trying to do what seems like a simple task: use multiprocessing to parallelize the optimize() call over many unique GPs. Here's a minimal example of what I'm trying to do.

from GPy.core.gp import GP
from GPy.kern import White
from GPy.likelihoods.gaussian import Gaussian
from GPy.inference.latent_function_inference.laplace import Laplace
from multiprocessing import Pool
import numpy as np

# Wrapper needed so the function is pickleable, which is required for multiprocessing.Pool
def opt_wrapper(gp):
    return gp.optimize() # Can replace with 'return 1' and program still hangs

size = 100 # Program works when this is low enough
inference_method = Laplace() # Program works when this is None
models = [GP(X=np.arange(size).reshape(size,1), Y=np.arange(size).reshape(size,1), kernel=White(1), likelihood=Gaussian(), inference_method=inference_method) for _ in range(1)]

print "Starting pool..."
pool = Pool(1)
print pool.map(opt_wrapper, models)
pool.close()
pool.join()

The program simply hangs after printing "Starting pool..." Annoyingly, it also results in a zombie process for each worker in the pool (just 1 in this example).

The program works just fine when any one of the following conditions is true:

  1. size is less than about 60. For larger values, it hangs.
  2. Laplace() is replaced with None. The Gaussian likelihood then defaults to ExactGaussianInference(); however, my actual project's likelihood is custom and requires Laplace().
  3. pool.map is replaced with the built-in map, so the problem is specific to running in a worker process.

Lastly, it still breaks when you replace return gp.optimize() with return 1. Similarly, the following program hangs (same imports):

def make_gp(dummy):
    inference_method = Laplace() # Again, program works when Laplace() becomes None
    gp = GP(X=np.arange(size).reshape(size,1), Y=np.arange(size).reshape(size,1), kernel=White(1), likelihood=Gaussian(), inference_method=inference_method)
    return 1

size = 100 # Again, program works when this is small
pool = Pool(1)
print pool.map(make_gp, ['dummy']) # Again, works with `map`
pool.close()
pool.join()

It seems to be an issue of instantiating/copying a GP--both with Laplace and above a certain size--within a new process. Seems highly odd and highly specific. Any help greatly appreciated.

@avehtari
Contributor

avehtari commented Jul 7, 2017

Are the GPs instantiated on the same node? Are you running out of memory? When my student @opkoisti tested a distributed GP approach with GPy, there were some problems with the GPy implementation. Unfortunately I don't remember whether they were properly fixed. Maybe @alansaul remembers?

@brendenpetersen
Author

brendenpetersen commented Jul 7, 2017

Thank you for the response! Yes, they're all on the same node. I'm just trying this out on a single laptop with 4 cores. (I'm also considering distributed, though it wouldn't be with the multiprocessing package.) I'm not running out of memory either--good question. The memory requirements are very small.

It does seem to be a GPy implementation issue. I've tried toy examples with other objects and they work just fine. I know multiprocessing has issues with unpickleable data (hence the need to wrap the optimize() call). Could it also be an issue with multiple instances of multiprocessing? I looked into what's happening under the hood when Laplace() is called, but it goes pretty deep into some other packages. I also tried replacing Laplace() with expectation_propagation.EP(), which resulted in a NotImplementedError about object 'super' not having __getstate__.
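For reference, here's the quick check I'm using to see which inference objects survive pickling at all, independently of the model and the pool. This is just a sketch; the EP import path is my guess based on the module name above.

import pickle
from GPy.inference.latent_function_inference.laplace import Laplace
from GPy.inference.latent_function_inference.expectation_propagation import EP

# Round-trip each inference object through pickle to see which ones
# can be serialized at all (a requirement for sending them to workers).
for cls in (Laplace, EP):
    try:
        method = cls()
        pickle.loads(pickle.dumps(method))
        print("%s pickles fine" % cls.__name__)
    except Exception as e:
        print("%s failed: %r" % (cls.__name__, e))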

I look forward to help from @opkoisti and/or @alansaul. Thanks again.

@mzwiessele
Member

mzwiessele commented Jul 9, 2017 via email

@brendenpetersen
Author

brendenpetersen commented Jul 10, 2017 via email

@mzwiessele
Member

I am running on Python 2.7.9 too...

@brendenpetersen
Author

@mzwiessele Can you try increasing size and see if it still works?

Could you check your versions of GPy and numpy (I'm on 1.13.1)?
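Something as simple as this should print both (assuming both packages expose a __version__ attribute, which I believe they do):

# Print the installed GPy and numpy versions for comparison.
import GPy
import numpy
print(GPy.__version__)
print(numpy.__version__)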

@ahartikainen

ahartikainen commented Mar 19, 2018

Hi, this is probably a problem with the Jupyter Notebook.

You need to create an external Python file (e.g. myfunc.py) and put opt_wrapper in there.

Then import the opt_wrapper and call your code normally.

from GPy.core.gp import GP
from GPy.kern import White
from GPy.likelihoods.gaussian import Gaussian
from GPy.inference.latent_function_inference.laplace import Laplace
from multiprocessing import Pool
import numpy as np
from myfunc import opt_wrapper

size = 100 # Program works when this is low enough
inference_method = Laplace() # Program works when this is None
models = [GP(X=np.arange(size).reshape(size,1), Y=np.arange(size).reshape(size,1), kernel=White(1), likelihood=Gaussian(), inference_method=inference_method) for _ in range(1)]

print "Starting pool..."
pool = Pool(1)
print pool.map(opt_wrapper, models)
pool.close()
pool.join()

If you are calling your script from the command line, use an if __name__ == '__main__' block to get multiprocessing working. That way you don't need the external file for your function.

from GPy.core.gp import GP
from GPy.kern import White
from GPy.likelihoods.gaussian import Gaussian
from GPy.inference.latent_function_inference.laplace import Laplace
from multiprocessing import Pool
import numpy as np

def opt_wrapper(gp):
    return gp.optimize()

if __name__ == '__main__':
    size = 100 # Program works when this is low enough
    inference_method = Laplace() # Program works when this is None
    models = [GP(X=np.arange(size).reshape(size,1), Y=np.arange(size).reshape(size,1), kernel=White(1), likelihood=Gaussian(), inference_method=inference_method) for _ in range(1)]

    print "Starting pool..."
    pool = Pool(1)
    print pool.map(opt_wrapper, models)
    pool.close()
    pool.join()

Also, consider wrapping your pool.close and pool.join in a try/finally block. In Python 3 you can use with Pool(1) as p: instead.

pool = Pool(1) # create the pool outside the try block so it is defined in finally
try:
    print "Starting pool..."
    print pool.map(opt_wrapper, models)
finally:
    pool.close()
    pool.join()
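For example, in Python 3 the whole thing could look like this. This is just a sketch of the same example using the context manager; note that Pool's with block terminates the pool on exit, which is fine here because map() is synchronous.

from multiprocessing import Pool

import numpy as np
from GPy.core.gp import GP
from GPy.kern import White
from GPy.likelihoods.gaussian import Gaussian
from GPy.inference.latent_function_inference.laplace import Laplace

def opt_wrapper(gp):
    return gp.optimize()

if __name__ == '__main__':
    size = 100
    models = [GP(X=np.arange(size).reshape(size, 1), Y=np.arange(size).reshape(size, 1),
                 kernel=White(1), likelihood=Gaussian(), inference_method=Laplace())
              for _ in range(1)]

    print("Starting pool...")
    # The with block terminates the pool on exit; map() has already
    # collected all results by then.
    with Pool(1) as pool:
        print(pool.map(opt_wrapper, models))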

@brendenpetersen
Author

Hi @ahartikainen, I was not using a Jupyter Notebook; Python was executed from the command line, so I don't think those changes would fix the problem.

I've moved on from this project, but the issue was actually a limitation of Python's multiprocessing, which uses OS pipes under the hood and is therefore limited by their buffer sizes. This explains why the program works when size is small enough (the pickled data then fits within the buffer), and why it worked for @mzwiessele, whose OS likely had a different buffer size.

One (of many) explanations here: https://sopython.com/canon/82/programs-using-multiprocessing-hang-deadlock-and-never-complete/
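If anyone wants to check whether they're in that regime, here is a rough diagnostic sketch (assuming the bottleneck really is the pickled payload being pushed through the pipe, and that the model pickles at all): measure how large the pickled model gets as size grows and compare it against a typical pipe buffer.

import pickle

import numpy as np
from GPy.core.gp import GP
from GPy.kern import White
from GPy.likelihoods.gaussian import Gaussian
from GPy.inference.latent_function_inference.laplace import Laplace

# Measure how big the pickled model is, i.e. roughly what Pool has to push
# through an OS pipe to each worker, and compare against the pipe buffer
# size (commonly on the order of 64 kB).
for size in (10, 30, 60, 100):
    gp = GP(X=np.arange(size).reshape(size, 1), Y=np.arange(size).reshape(size, 1),
            kernel=White(1), likelihood=Gaussian(), inference_method=Laplace())
    n_bytes = len(pickle.dumps(gp, protocol=pickle.HIGHEST_PROTOCOL))
    print("size=%d -> %d bytes pickled" % (size, n_bytes))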

@ahartikainen

Thanks for the follow-up. Interesting problem. What was your OS, if I may ask?

@brendenpetersen
Author

@ahartikainen no problem! It's a bummer, since GPs can get quite data-intensive. I imagine it can be fixed by chunking up the size of the GP, but that was more work than I could afford at the time.

I'm on Mac OS X El Capitan, which I believe can handle up to 64 kB buffers, but I have no idea how they break that up.

@patel-zeel

I had a similar problem while benchmarking on AMD and Intel CPUs. AMD performed very poorly with GPy multiprocessing, while Intel did well. Does anyone have a similar experience?

@thihaa2019

I have the same issue on my Mac when I try to run GPy in parallel on macOS.
