Document somewhere that order of parameters matters for reproducibility #61

Saethox · 2023-09-18T20:16:54Z

I lost several hours trying to find out why irace wasn't being deterministic despite setting a seed, and it turns out that specifying the parameter space in a different order results in different values. Not a good combination with dictionaries that have a non-stable hash function... :/

Correct me if I'm wrong, but I don't see this mentioned anywhere in the documentation.

This is most certainly a niche problem, but given that I find this behavior not obvious, documenting it somewhere seems like a good idea.

MLopez-Ibanez · 2023-09-19T11:00:23Z

Yes, the order of the parameters affects both the order in which they are sent to the target-runner and the order in which they are sampled. This matters for applications where the target-runner expects the parameters in a particular order, which the user can specify in the parameters table. Also, it seems unavoidable that the order of parameters at the same level of the dependency hierarchy has an effect on the results. The question is which order should be used, and the order provided by the user is as good as (perhaps better than) any other order that I can think of.
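
To illustrate why sampling order matters even with a fixed seed, here is a minimal sketch. It is not irace's sampling code: `SplitMix64`, `sample` and `value_of` are hypothetical names, and the uniform draw in `0..100` is an assumption made only for the example. Each parameter consumes the next value from a single seeded stream, so listing the same parameters in a different order reassigns the draws.

```rust
// Tiny seeded generator (SplitMix64) so the sketch needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Hypothetical sampler: one draw in 0..100 per parameter, in the order given.
fn sample(seed: u64, names: &[&str]) -> Vec<(String, u64)> {
    let mut rng = SplitMix64(seed);
    names
        .iter()
        .map(|n| (n.to_string(), rng.next_u64() % 100))
        .collect()
}

// Look up the value assigned to a parameter by name.
fn value_of(config: &[(String, u64)], name: &str) -> u64 {
    config
        .iter()
        .find(|(n, _)| n.as_str() == name)
        .map(|(_, v)| *v)
        .expect("unknown parameter")
}

fn main() {
    // Same seed, same parameters, different declaration order.
    let a = sample(42, &["alpha", "beta"]);
    let b = sample(42, &["beta", "alpha"]);
    println!("{:?}", a);
    println!("{:?}", b);
    // The first draw goes to whichever parameter is listed first,
    // so "alpha" in `a` receives the value that "beta" receives in `b`.
    println!("{}", value_of(&a, "alpha") == value_of(&b, "beta"));
}
```

Running `sample` twice with the same seed and the same order gives identical configurations; only the order of the names changes which parameter receives which draw.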

In R, named lists have a stable order. I believe iracepy uses an OrderedDict for parameters, so the order should also be stable, no?

Nevertheless, more than happy to document this behaviour. Where do you think this should be documented? We have the user-guide (vignette) and the documentation within R. I am happy to merge a pull request.

MLopez-Ibanez · 2023-09-19T11:10:28Z

Just to note that I do not believe this is surprising behaviour in an optimization procedure. The order (either positional or according to name) of decision variables will affect a single run of almost any optimization procedure, even some deterministic ones.

On the other hand, one may hope that it doesn't have an effect on expectation over many runs. Otherwise, it may be worth figuring out why this is the case and fixing it (or randomizing the order for each run, which will not fix the problem with single runs as the random order will depend on the initial order, but it will fix it over many runs). This is the case for mathematical programming solvers: https://pubsonline.informs.org/doi/10.1287/opre.2013.1231

Saethox · 2023-09-19T14:55:29Z

> In R, named lists have a stable order. I believe iracepy uses an OrderedDict for parameters, so the order should also be stable, no?

That's correct. Unfortunately, the default RandomState of Rust's HashMap is initialized with random keys, which results in different hash values in each execution, and a different iteration order. Easy to fix, but not so easy to identify as the issue.
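
For readers who run into the same problem, the sketch below shows the difference in plain Rust, using only the standard library. The parameter names and the helper functions `hashed_order` and `sorted_order` are made up for illustration. A `HashMap` built with the default hasher may iterate in a different order on every execution, while a `BTreeMap` (sorted by key) or a plain `Vec` or slice (declaration order) is stable across runs.

```rust
use std::collections::{BTreeMap, HashMap};

// Iteration order of a HashMap with the default RandomState:
// not guaranteed to be the same from one execution to the next.
fn hashed_order(names: &[&str]) -> Vec<String> {
    let map: HashMap<&str, ()> = names.iter().map(|n| (*n, ())).collect();
    map.keys().map(|k| k.to_string()).collect()
}

// Iteration order of a BTreeMap: always sorted by key, so stable across runs.
fn sorted_order(names: &[&str]) -> Vec<String> {
    let map: BTreeMap<&str, ()> = names.iter().map(|n| (*n, ())).collect();
    map.keys().map(|k| k.to_string()).collect()
}

fn main() {
    // Hypothetical parameter names.
    let names = ["mutation_rate", "population", "crossover", "elitism"];
    println!("HashMap order:  {:?}", hashed_order(&names));
    println!("BTreeMap order: {:?}", sorted_order(&names));
    // A slice or Vec keeps the declaration order as written.
    println!("Slice order:    {:?}", names);
}
```

If the declaration order itself should be preserved, keeping the parameters in a `Vec` of name/value pairs, or in an insertion-ordered map such as the one provided by the `indexmap` crate, is one possible fix.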

> Where do you think this should be documented?

Maybe a sentence on reproducibility in the documentation of the irace function, and in the user guide FAQ. Something along the lines of this:

> Are experiments with irace reproducible?
>
> An irace run with the same seed, scenario and parameters will yield identical results. Note that the order of parameters and instances matters.

MLopez-Ibanez added the enhancement and help wanted labels Sep 19, 2023

MLopez-Ibanez closed this as completed in debb5f5 Sep 19, 2023

MLopez-Ibanez added a commit that referenced this issue Sep 20, 2023

Fix #61: More documentation about reproducibility.

acc5295
