
Not possible to merge runs with different number of batches #481

Closed
rodleiva opened this issue Aug 28, 2024 · 4 comments
Labels
bug this shouldn't be happening

Comments

@rodleiva

rodleiva commented Aug 28, 2024

Dynesty version
2.1.4 installed with pip

Describe the bug
It is not possible to merge two runs with different numbers of batches (e.g. run 1 with 4 batches and run 2 with 2 batches).
As a consequence, it is also not possible to merge three or more runs, each with the same number of batches, if the bounds are all different.

Setup
Here is a simple snippet that reads three pickle files and attempts to merge them. Each pickle file has two batches: a baseline batch plus an additional batch run with mode='auto'. The latter means that the bounds for the baseline runs are (-inf, inf), while the bounds of the additional batch will be different in each run.

import pickle

from dynesty import utils as dyfunc

ls_fnpickle = ['data/checkpoint_it860a.pickle', 'data/checkpoint_it860b.pickle', 'data/checkpoint_it860c.pickle']
ls_results = []
for i, fn_pickle in enumerate(ls_fnpickle):
    with open(fn_pickle, 'rb') as f:
        data = pickle.load(f)
    sampler = data['sampler']
    results = sampler.results
    ls_results.append(results)

    # Merge all runs accumulated so far; this fails on the third iteration
    results_merged = dyfunc.merge_runs(ls_results)

Dynesty output

Bug
In the previous example, merging runs 1 and 2 works fine, but merging the (1+2) result with run 3 generates this error at utils.py line 1980.

-> 1980 if np.all(base_info['bounds'] == new_info['bounds']):
   1981     bounds = base_info['bounds']
   1982     boffset = 0

ValueError: operands could not be broadcast together with shapes (4,2) (2,2) 

The same error appears when attempting to merge two runs imported from a checkpoint pickle file:
line 1980, in _merge_two
if np.all(base_info['bounds'] == new_info['bounds']):
ValueError: operands could not be broadcast together with shapes (4,2) (2,2)

Additional context
My quick-and-dirty workaround was to replace utils.py line 1980 with this:

if len(base_info['bounds']) == len(new_info['bounds']) and np.all(base_info['bounds'] == new_info['bounds']):

but I think that _merge_two() may need more careful reworking. For example, it could merge all the baselines into a single batch when the bounds are the same, and then add the additional batches; I'm not sure about this, though.

Note that if I merge N runs that each have only a baseline batch (all with bounds (-inf, inf)), the merging works fine.
Other combinations of runs with M batches each should also work as long as the batch bounds match: each execution of _merge_two() will then generate M batches instead of 2×M as in the general case. I didn't test this.
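Beyond the length guard above, a slightly more idiomatic check is `np.array_equal`, which compares shapes first and returns False on a mismatch instead of raising. This is a sketch of the guard, not dynesty's actual code:

```python
import numpy as np

a = np.zeros((4, 2))
b = np.zeros((2, 2))

# array_equal checks shapes before comparing contents,
# so no broadcasting error is raised for mismatched shapes.
print(np.array_equal(a, b))         # False: shapes differ
print(np.array_equal(a, a.copy()))  # True: same shape and contents
```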

@rodleiva rodleiva added the bug this shouldn't be happening label Aug 28, 2024
@segasai
Collaborator

segasai commented Aug 29, 2024

Thanks for the report! I will take a look.
Do you know if this ever worked before?
Also, if you could provide a simple self-contained test example with a toy likelihood function that reproduces this failure, that would be useful (it'll save some time).

@rodleiva
Author

Hello Sergey,

I don't know if this worked before, as I only started using this feature recently, out of need. I checked that the feature was already present in dynesty version 1.
Here is an example that reproduces the problem:

import dynesty

def loglike(theta):
    x, y = theta
    return -0.5 * (x**2 + y**2)

def prior_transform(utheta):
    return 10.0 * (2.0 * utheta - 1.0)

def run_dynesty_sampler():
    sampler1 = dynesty.DynamicNestedSampler(loglike, prior_transform, ndim=2)
    sampler2 = dynesty.DynamicNestedSampler(loglike, prior_transform, ndim=2)

    # Baseline runs
    sampler1.run_nested(maxbatch=0)
    sampler2.run_nested(maxbatch=0)

    # Merge the two runs. This works fine
    results12 = dynesty.utils.merge_runs([sampler1.results, sampler2.results])

    print(f"Bounds:\n {sampler1.results.batch_bounds}")
    print(sampler2.results.batch_bounds)
    print(results12.batch_bounds)
    
    print(f"Number of samples:\n {sampler1.results.samples.shape}")
    print(sampler2.results.samples.shape)
    print(results12.samples.shape)

    # Add one batch to run 1
    sampler1.add_batch(mode='auto')
    # Add two batches to run 2
    sampler2.add_batch(mode='auto')
    sampler2.add_batch(mode='auto')

    # Attempt to merge again. This fails with the broadcasting ValueError
    print(f"Bounds:\n {sampler1.results.batch_bounds}")
    print(sampler2.results.batch_bounds)
    results123 = dynesty.utils.merge_runs([sampler1.results, sampler2.results])


run_dynesty_sampler()

I hope this helps.

Cheers

Rodrigo Leiva

segasai added a commit that referenced this issue Aug 29, 2024
Now I create a unique batch list, i.e. if one run used
[-inf, inf], [-5, 5], [-4, 4]
and another
[-inf, inf], [-5, 5], [-3, 2]
the merged batches will be [-inf, inf], [-5, 5], [-4, 4], [-3, 2].

I also add a test that checks for the issue uncovered in #481
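The unique-batch-list idea from the commit message can be sketched as follows; `unique_batch_bounds` is a hypothetical helper for illustration, not dynesty's actual implementation:

```python
import numpy as np

def unique_batch_bounds(bounds_a, bounds_b):
    """Concatenate two lists of (lo, hi) batch bounds, keeping only
    the first occurrence of each distinct bound pair."""
    merged = []
    for b in list(bounds_a) + list(bounds_b):
        if not any(np.array_equal(b, m) for m in merged):
            merged.append(b)
    return merged

a = [(-np.inf, np.inf), (-5, 5), (-4, 4)]
b = [(-np.inf, np.inf), (-5, 5), (-3, 2)]
print(unique_batch_bounds(a, b))
# [(-inf, inf), (-5, 5), (-4, 4), (-3, 2)]
```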
@segasai
Collaborator

segasai commented Aug 29, 2024

I have tentatively developed a fix (#482) for this. Since it also involves some refactoring of the way the merge is done, I'll think a bit more about it before committing (i.e. whether it needs more tests).
Please try it out.

@segasai
Collaborator

segasai commented Aug 30, 2024

I have merged #482, so the issue should now be fixed in the master branch.
Thanks for the report!

@segasai segasai closed this as completed Aug 30, 2024