Parallelization issues with pocoMC #591
Comments
Hi @lisaotten, I had a quick look at the plot. I am not sure the plots are really different. You should run an actual test of whether the distributions differ, e.g. by comparing the modes of your multi-dimensional distributions or using something like: Hope this helps. |
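One way to run such a test on a single parameter is a two-sample Kolmogorov-Smirnov statistic with a permutation p-value. The sketch below is only an illustration, not the snippet originally referred to: it uses the Python standard library, the sample lists are synthetic stand-ins for posterior samples, and it assumes the samples are roughly independent and equally weighted, which may not hold for raw sampler output.

```python
import bisect
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for v in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, v) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def permutation_p_value(a, b, n_perm=200, seed=0):
    """Share of random relabelings with a KS statistic >= the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic stand-ins for one parameter's samples from two runs (assumption).
rng = random.Random(1)
serial = [rng.gauss(0.0, 1.0) for _ in range(300)]
parallel_same = [rng.gauss(0.0, 1.0) for _ in range(300)]
parallel_shifted = [rng.gauss(1.0, 1.0) for _ in range(300)]

p_same = permutation_p_value(serial, parallel_same)
p_shifted = permutation_p_value(serial, parallel_shifted)
print(f"same distribution: p = {p_same:.3f}")
print(f"shifted distribution: p = {p_shifted:.3f}")
```

A small p-value for a given parameter suggests the two runs really sample different marginal distributions. Repeating the test per parameter gives a rough picture of where the runs diverge.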
Hi @matthiaskoenig, Thanks for your reply! |
The first thing I can think of is that if the seeds are being set from the system clock (which they are by default), you might be getting the same seed on multiple runs? I can imagine reasons for both the parallel and non-parallel runs to end up this way, so you might want to try explicitly setting the seed for each run manually, to ensure that they're all unique. You could also examine the individual results to see if this is actually happening or not. |
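A minimal sketch of that suggestion, using only the Python standard library: derive one distinct seed per run from a single base seed, then compare the first few draws of each run to see whether any two runs share a random stream. The function names are hypothetical and not part of pocoMC or tellurium.

```python
import random


def make_run_seeds(base_seed, n_runs):
    """Derive distinct, reproducible 32-bit seeds from one base seed."""
    master = random.Random(base_seed)
    seeds = []
    while len(seeds) < n_runs:
        candidate = master.getrandbits(32)
        if candidate not in seeds:  # guard against an accidental repeat
            seeds.append(candidate)
    return seeds


def first_draws(seed, n=3):
    """First few uniform draws of a generator, used as a stream fingerprint."""
    rng = random.Random(seed)
    return tuple(rng.random() for _ in range(n))


def runs_sharing_a_stream(fingerprints):
    """Return index pairs of runs whose fingerprints are identical."""
    pairs = []
    for i in range(len(fingerprints)):
        for j in range(i + 1, len(fingerprints)):
            if fingerprints[i] == fingerprints[j]:
                pairs.append((i, j))
    return pairs


seeds = make_run_seeds(base_seed=1234, n_runs=4)
fingerprints = [first_draws(s) for s in seeds]
print("seeds:", seeds)
print("runs sharing a stream:", runs_sharing_a_stream(fingerprints))
```

The same fingerprint check can be applied to values logged by each worker process, which would show whether several workers start from the same generator state.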
I played around with the seed quite a bit in finding a possible source for the errors, but even setting the seed manually resulted in the same distributions. |
I have attached a combined corner plot that maybe illustrates the issues a little better: The black line corresponds to a non-parallel run, while the other three lines correspond to parallel runs using the multiprocess, multiprocessing and pathos packages. They were all created using the same fixed seed at the start of the runs. The last parameter in the corner plot corresponds to the deviation of the parameter fits from the data points that we are trying to analyze and is much larger for the parallel run. This clearly indicates that the parallel runs fit the data much worse. |
I wonder if it is possible to create a small problem to look at? I can see
there appears to be a difference. What happens if you run it again, do you
get a similar corner plot?
--
Herbert Sauro, Professor
Director: NIH Center for model reproducibility
University of Washington, Bioengineering
|
I’ve been thinking more about this. Would it be possible to create a simple
one-variable model, e.g. -> x ->, with simple mass-action kinetics on each step?
That would produce a one-element corner plot, which would be easier to debug.
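A sketch of what such a test problem could look like. With a constant inflow and a first-order outflow, the model obeys dx/dt = k_in - k_out * x, which has a closed-form solution, so a reference log-likelihood can be written without any simulator. The parameter names, values, and noise level are assumptions for illustration; the Antimony text is included only for reference and is not executed here.

```python
import math

# Antimony text for "-> x ->" (illustrative names; not executed in this sketch).
ANTIMONY_MODEL = """
J_in:  -> x; k_in;
J_out: x -> ; k_out * x;
k_in = 2.0; k_out = 0.5; x = 0.0;
"""


def x_of_t(t, k_in, k_out, x0=0.0):
    """Closed-form solution of dx/dt = k_in - k_out * x with x(0) = x0."""
    x_ss = k_in / k_out  # steady state
    return x_ss + (x0 - x_ss) * math.exp(-k_out * t)


def log_likelihood(k_out, times, data, k_in=2.0, sigma=0.1):
    """Gaussian log-likelihood with k_out as the only free parameter."""
    total = 0.0
    for t, y in zip(times, data):
        residual = y - x_of_t(t, k_in, k_out)
        total += -0.5 * (residual / sigma) ** 2
        total += -math.log(sigma * math.sqrt(2.0 * math.pi))
    return total


# Noise-free synthetic data generated at the "true" value k_out = 0.5.
times = [0.5 * i for i in range(1, 21)]
data = [x_of_t(t, 2.0, 0.5) for t in times]

print("log-likelihood at k_out = 0.5:", log_likelihood(0.5, times, data))
print("log-likelihood at k_out = 0.8:", log_likelihood(0.8, times, data))
```

If the tellurium simulation of the same model is swapped in for `x_of_t`, the serial and parallel runs can both be checked against this analytic reference.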
|
When using the pocoMC package for Bayesian inference with tellurium models, we have encountered issues with parallelization. Using multiprocess(ing), we see a large discrepancy between the results of parallelized and non-parallelized runs (see the attached corner plots). Both runs complete without error messages and show no other notable differences. I have reproduced the results of each run several times, both on an HPC system and on my personal notebook. Apart from the parallelization setting, all other parameters are kept exactly the same. Changing the number of parallel kernels does not seem to change these results. From experience, the results of the non-parallelized run are the ones we would expect to be correct.
I have attached self-contained code, including the environment I am running it in. The config file has an option under bayesian_inference to turn parallelization on or off and to specify the number of kernels.
parallel.pdf
not_parallel.pdf
Bayesian_Transporter.zip
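One check that may help narrow this down is to evaluate the log-likelihood at the same parameter points once serially and once through the pool's map function, then compare the values directly, outside the sampler. The sketch below is not taken from the attached code: `toy_log_likelihood` is a hypothetical stand-in for the tellurium-based likelihood, and a thread pool is used only so that the example runs anywhere.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_log_likelihood(theta):
    """Hypothetical stand-in for the real tellurium-based log-likelihood."""
    return -0.5 * sum((value - 1.0) ** 2 for value in theta)


def max_abs_difference(log_likelihood, points, parallel_map):
    """Largest disagreement between serial and mapped evaluations."""
    serial = [log_likelihood(p) for p in points]
    mapped = list(parallel_map(log_likelihood, points))
    return max(abs(s - m) for s, m in zip(serial, mapped))


# Fixed test points, so that both evaluation paths see identical inputs.
points = [(0.1 * i, 1.0 - 0.05 * i) for i in range(20)]

with ThreadPoolExecutor(max_workers=4) as executor:
    worst = max_abs_difference(toy_log_likelihood, points, executor.map)

print("largest serial/parallel difference:", worst)
```

In the real setup, `parallel_map` would be the `map` method of the same multiprocess, multiprocessing, or pathos pool that is handed to the sampler. A nonzero difference would point at the likelihood evaluation in the workers rather than at the sampler itself.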