Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiprocessing parallelization and deep copy gives unexpected results #563

Open
augeorge opened this issue Oct 6, 2022 · 2 comments
Open

Comments

@augeorge
Copy link

augeorge commented Oct 6, 2022

Hello,

I am trying to run tellurium / roadrunner simulations in parallel to do parameter sweeps/estimation but get unexpected results. I can sort of get around this by creating a wrapper function that loads a new roadrunner model before simulation, but this is slow if done for a large number of iterations. I noticed similar problems when doing copy.deepcopy(rr_model). Is there a best practice or other work-around for this issue?

Below is a simple example - the parallel pool gives incorrect results.

import numpy as np
import tellurium as te
import roadrunner
import copy
from sklearn.metrics import mean_squared_error as mse
import multiprocessing as mp
import sys
import os


print(f'tellurium version: {te.__version__}')
print(f'roadrunner version: {roadrunner.__version__}')
print(f'python version: {sys.version}')
print(f'OS version: {os.uname()}')


def f(k, y_true, rr_model):
    rr_model.resetToOrigin()
    rr_model.k1 = k[0]
    rr_model.km = k[1]
    y_pred = rr_model.simulate(0, 30, 200, ['time', 'Xo', 'S'])['S']
    return mse(y_true,y_pred)


def f_wrapper(k, y_true, rr_string):
    rr_model = te.loada(rr_string)
    return f(k, y_true, rr_model)


rr_string = '''
  $Xo -> S;   Xo/(km + S^h);
  S -> $w;  k1*S;

    # initialize
    h = 1;   # Hill coefficient
    k1 = 1;  km = 0.1;
    S = 1.5; Xo = 2

    at (time > 10): Xo = 5;
    at (time > 20): Xo = 2;
'''

rr_1 = te.loada(rr_string)
y_true = rr_1.simulate(0, 30, 200, ['time', 'Xo', 'S'])['S']
ks = [[0.1,0.01], [1, 0.1], [10,1]]

mse_serial = [f_wrapper(k, y_true, rr_string) for k in ks]
mse_serial2 = [f(k, y_true, rr_1) for k in ks]

p = mp.Pool(processes=2) 
mse_parallel = np.array(p.starmap(f_wrapper, [(k, y_true, rr_string) for k in ks]))
p.close()

p = mp.Pool(processes=2) 
mse_parallel2 = np.array(p.starmap(f, [(k, y_true, rr_1) for k in ks]))
p.close()

print(f'MSE - serial w/ wrapper: {mse_serial}')
print(f'MSE - serial w/o wrapper: {mse_serial2}')
print(f'MSE - parallel w/ wrapper: {mse_parallel}')
print(f'MSE - parallel w/o wrapper: {mse_parallel2}')

output:

roadrunner version: 2.2.1
python version: 3.7.14 (default, Sep  8 2022, 00:06:44) 
[GCC 7.5.0]
OS version: posix.uname_result(sysname='Linux', nodename='db9181e5dc3c', release='5.10.133+', version='#1 SMP Fri Aug 26 08:44:51 UTC 2022', machine='x86_64')
MSE - serial w/ wrapper: [12.267022835506769, 0.0, 2.0494559584927714]
MSE - serial w/o wrapper: [12.267022835506769, 0.0, 2.0494559584927714]
MSE - parallel w/ wrapper: [12.26702284  0.          2.04945596]
MSE - parallel w/o wrapper: [0. 0. 0.]
@augeorge augeorge changed the title multiprocessing parallelization gives unexpected results multiprocessing parallelization and deep copy gives unexpected results Oct 6, 2022
@augeorge
Copy link
Author

augeorge commented Oct 6, 2022

A deep copy of the roadrunner object gives incorrect results even with serial calculations.

An example using deep copy w/ serial calculations.

import numpy as np
import tellurium as te
import roadrunner
import copy
from sklearn.metrics import mean_squared_error as mse
import multiprocessing as mp
import sys
import os


print(f'OS version: {os.uname()}')
print(f'tellurium version: {te.__version__}')
print(f'roadrunner version: {roadrunner.__version__}')


def f(k, y_true, rr_model):
    rr_model.resetToOrigin()
    rr_model.k1 = k[0]
    rr_model.km = k[1]
    y_pred = rr_model.simulate(0, 30, 200, ['time', 'Xo', 'S'])['S']
    return mse(y_true,y_pred)


def f_wrapper(k, y_true, rr_string):
    rr_model = te.loada(rr_string)
    return f(k, y_true, rr_model)


rr_string = '''
  $Xo -> S;   Xo/(km + S^h);
  S -> $w;  k1*S;

    # initialize
    h = 1;   # Hill coefficient
    k1 = 1;  km = 0.1;
    S = 1.5; Xo = 2

    at (time > 10): Xo = 5;
    at (time > 20): Xo = 2;
'''
rr_1 = te.loada(rr_string)

y_true = rr_1.simulate(0, 30, 200, ['time', 'Xo', 'S'])['S']
ks = [[0.1,0.01], [1, 0.1], [10,1]]

mse_serial = [f_wrapper(k, y_true, rr_string) for k in ks]
mse_serial2 = [f(k, y_true, rr_1) for k in ks]
mse_serial3 = [f(k, y_true, copy.deepcopy(rr_1)) for k in ks]

print(f'MSE - serial w/ wrapper: {mse_serial}')
print(f'MSE - serial w/o wrapper: {mse_serial2}')
print(f'MSE - serial w/o wrapper deep copy: {mse_serial3}')

output:

OS version: posix.uname_result(sysname='Linux', nodename='db9181e5dc3c', release='5.10.133+', version='#1 SMP Fri Aug 26 08:44:51 UTC 2022', machine='x86_64')
tellurium version: 2.2.3
roadrunner version: 2.2.1
MSE - serial w/ wrapper: [12.267022835506769, 0.0, 2.0494559584927714]
MSE - serial w/o wrapper: [12.267022835506769, 0.0, 2.0494559584927714]
MSE - serial w/o wrapper deep copy: [0.0, 0.0, 0.0]

@luciansmith
Copy link
Contributor

Sorry for not getting to this sooner!

I'm not sure how deepcopy deals with Python objects that are wrappers for C objects from a shared library, but it looks like it's not working correctly here, at least. At any rate, I would recommend creating rr instances separately, and either just loading in the SBML file directly, or, to do this slightly faster, use the 'saveState' and 'loadState' (or saveStateS and loadStateS) functions to load in the state from the old roadrunner object to the new one.

We actually just published an article about libroadrunner, that includes bit about parallelization, including a new 'RoadRunnerMap' object that might be useful to you:

https://academic.oup.com/bioinformatics/article/39/1/btac770/6883908

You can see the Python scripts we used to create the figures (which use a couple different parallelization tricks) at

https://github.com/saurolab-papers/rr2.0_paper

So that will hopefully get you something that works for now.

However, I'm keeping this issue open because we should figure out how to make 'deepcopy' work, and I think your 'roadrunner without wrapper' should probably work, too. We'll investigate; thanks for posting! And apologies again for not following up sooner.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants