
Commit

Merge branch 'main' into feature/allow_local_threading
AngelFP committed Nov 9, 2023
2 parents 52b4614 + 310e6cf commit 0ec1fd5
Showing 5 changed files with 143 additions and 50 deletions.
30 changes: 22 additions & 8 deletions README.md
@@ -28,27 +28,41 @@
</p>
</div>

Optimas is a Python library for scalable optimization on massively-parallel supercomputers. See the [documentation](https://optimas.readthedocs.io/) for installation instructions, tutorials, and more information.
Optimas is a Python library designed for highly scalable optimization, from laptops to massively-parallel supercomputers.


## Key Features

- **Scalability**: Leveraging the power of [libEnsemble](https://github.com/Libensemble/libensemble), Optimas is designed to scale seamlessly from your laptop to high-performance computing clusters.
- **User-Friendly**: Optimas simplifies the process of running large parallel parameter scans and optimizations. Specify the number of parallel evaluations and the computing resources to allocate to each of them, and Optimas will handle the rest.
- **Advanced Optimization**: Optimas integrates algorithms from the [Ax](https://github.com/facebook/Ax) library, offering both single- and multi-objective Bayesian optimization. This includes advanced techniques such as multi-fidelity and multi-task algorithms.


## Installation
From PyPI
You can install Optimas from PyPI:
```sh
pip install optimas
```
From GitHub
Or directly from GitHub:
```sh
pip install git+https://github.com/optimas-org/optimas.git
```
Make sure `mpi4py` is available in your environment prior to installing optimas (see [here](https://optimas.readthedocs.io/en/latest/user_guide/installation_local.html) for more details).

Optimas is regularly used and tested in large distributed HPC systems.
We have prepared installation instructions for
Make sure `mpi4py` is available in your environment before installing Optimas. For more details, check out the full [installation guide](https://optimas.readthedocs.io/en/latest/user_guide/installation_local.html). We have also prepared dedicated installation instructions for some HPC systems such as
[JUWELS (JSC)](https://optimas.readthedocs.io/en/latest/user_guide/installation_juwels.html),
[Maxwell (DESY)](https://optimas.readthedocs.io/en/latest/user_guide/installation_maxwell.html) and
[Perlmutter (NERSC)](https://optimas.readthedocs.io/en/latest/user_guide/installation_perlmutter.html).


## Documentation
For more information on how to use Optimas, check out the [documentation](https://optimas.readthedocs.io/). You'll find installation instructions, a user guide, [examples](https://optimas.readthedocs.io/en/latest/examples/index.html) and the API reference.


## Support
Need more help? Join our [Slack channel](https://optimas-group.slack.com/) or open a [new issue](https://github.com/optimas-org/optimas/issues/new/choose).


## Citing optimas
If your usage of `optimas` leads to a scientific publication, please consider citing the original [paper](https://link.aps.org/doi/10.1103/PhysRevAccelBeams.26.084601):
If your usage of Optimas leads to a scientific publication, please consider citing the original [paper](https://link.aps.org/doi/10.1103/PhysRevAccelBeams.26.084601):
```bibtex
@article{PhysRevAccelBeams.26.084601,
title = {Bayesian optimization of laser-plasma accelerators assisted by reduced physical models},
90 changes: 90 additions & 0 deletions doc/source/user_guide/basic_usage/running_with_simulations.rst
@@ -163,3 +163,93 @@ path to the ``executable`` that will run your simulation.
executable="/path/to/my_executable",
analysis_func=analyze_simulation,
)
Using a custom environment
~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``env_script`` and ``env_mpi`` parameters allow you to customize the
environment in which your simulation runs.

``env_script`` takes the path to a shell script that sets up the
environment by loading the necessary dependencies, setting environment
variables, or performing other setup tasks required by your simulation.

This script will look different depending on your system and use
case, but it will typically be something like:

.. code-block:: bash

   #!/bin/bash

   # Set environment variables
   export VAR1=value1
   export VAR2=value2

   # Load a module
   module load module_name

If the script loads a different MPI version than the one in the ``optimas``
environment, make sure to specify the loaded version with the ``env_mpi``
argument. For example:

.. code-block:: python
   :emphasize-lines: 5,6

   ev = TemplateEvaluator(
       sim_template="template_simulation_script.txt",
       executable="/path/to/my_executable",
       analysis_func=analyze_simulation,
       env_script="/path/to/my_env_script.sh",
       env_mpi="openmpi",
   )

See :class:`~optimas.evaluators.TemplateEvaluator` for more details.


Running a chain of simulations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The :class:`~optimas.evaluators.ChainEvaluator` is designed for use cases
where each evaluation involves several steps, each step being a simulation
with a different simulation code.

The steps are defined by a list of ``TemplateEvaluators``, ordered in the
sequence in which they should be executed. Each step can request a different
amount of resources, and the ``ChainEvaluator`` is allocated the maximum
number of processes (``n_procs``) and GPUs (``n_gpus``) requested by any
step.
For instance, if one step requires ``n_procs=20`` and ``n_gpus=0``, and a
second step requires ``n_procs=4`` and ``n_gpus=4``, each evaluation will
be assigned ``n_procs=20`` and ``n_gpus=4``. Each step then uses only the
subset of resources it needs.
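This allocation rule amounts to taking the element-wise maximum across the steps. As a minimal sketch (using the illustrative resource numbers from the example above, not any optimas API):

```python
# Each step requests (n_procs, n_gpus); the chain as a whole is allocated
# the element-wise maximum across all steps (illustrative values).
steps = [
    {"n_procs": 20, "n_gpus": 0},  # CPU-heavy step
    {"n_procs": 4, "n_gpus": 4},  # GPU step
]
allocated = {
    "n_procs": max(s["n_procs"] for s in steps),
    "n_gpus": max(s["n_gpus"] for s in steps),
}
print(allocated)  # {'n_procs': 20, 'n_gpus': 4}
```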

Here is a basic example of how to use ``ChainEvaluator``:

.. code-block:: python

   from optimas.evaluators import TemplateEvaluator, ChainEvaluator

   # Define the TemplateEvaluator for each step.
   ev1 = TemplateEvaluator(
       sim_template="template_simulation_script_1.py",
       analysis_func=analyze_simulation_1,
   )
   ev2 = TemplateEvaluator(
       sim_template="template_simulation_script_2.py",
       analysis_func=analyze_simulation_2,
   )

   # Chain them in order of execution.
   chain_ev = ChainEvaluator([ev1, ev2])

In this example, ``template_simulation_script_1.py`` and
``template_simulation_script_2.py`` are your simulation scripts for the
first and second steps, respectively. ``analyze_simulation_1`` and
``analyze_simulation_2`` are functions that analyze the output of each
simulation. There is no need to provide an analysis function for every step,
but at least one should be defined.
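As a sketch, an analysis function for one of the steps might look like the following. Optimas calls it with the simulation directory and a dictionary of output parameters to fill; the file name ``result.txt`` and the output name ``f`` are hypothetical placeholders for this example:

```python
import os


def analyze_simulation_2(simulation_directory, output_params):
    """Read the simulation output and fill in the objective value.

    ``result.txt`` and the output name ``f`` are hypothetical
    placeholders; adapt them to your simulation's actual output.
    """
    result_file = os.path.join(simulation_directory, "result.txt")
    with open(result_file) as f:
        output_params["f"] = float(f.read())
    return output_params
```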
58 changes: 17 additions & 41 deletions optimas/explorations/base.py
@@ -113,7 +113,7 @@ def __init__(
self.libe_comms = libe_comms
self._n_evals = 0
self._resume = resume
self._history_file_name = "exploration_history_after_evaluation_{}"
self._history_file_prefix = "exploration_history"
self._create_alloc_specs()
self._create_executor()
self._initialize_evaluator()
@@ -206,18 +206,6 @@ def run(self, n_evals: Optional[int] = None) -> None:
n_trials_final = self.generator.n_completed_trials
self._n_evals += n_trials_final - n_evals_initial

# Determine if current rank is master.
if self.libE_specs["comms"] == "local":
is_master = True
else:
from mpi4py import MPI

is_master = MPI.COMM_WORLD.Get_rank() == 0

# Save history.
if is_master:
self._save_history()

def attach_trials(
self,
trial_data: Union[Dict, List[Dict], np.ndarray, pd.DataFrame],
@@ -463,19 +451,6 @@ def _load_history(
if resume:
self._n_evals = history.size

def _save_history(self):
"""Save history array to file."""
filename = self._history_file_name.format(self._n_evals)
exploration_dir_path = os.path.abspath(self.exploration_dir_path)
file_path = os.path.join(exploration_dir_path, filename)
if not os.path.isfile(filename):
old_files = os.path.join(
exploration_dir_path, self._history_file_name.format("*")
)
for old_file in glob.glob(old_files):
os.remove(old_file)
np.save(file_path, self._libe_history.H)

def _create_libe_history(self) -> History:
"""Initialize an empty libEnsemble history."""
run_params = self.evaluator.get_run_params()
@@ -498,23 +473,17 @@ def _create_libe_history(self) -> History:

def _get_most_recent_history_file_path(self):
"""Get path of most recently saved history file."""
old_exploration_history_files = glob.glob(
os.path.join(
os.path.abspath(self.exploration_dir_path),
self._history_file_name.format("*"),
)
# Sort files by date and get the most recent one.
# In principle there should be only one file, but just in case.
exp_path = os.path.abspath(self.exploration_dir_path)
history_files = glob.glob(
os.path.join(exp_path, self._history_file_prefix + "*")
)
old_libe_history_files = glob.glob(
os.path.join(
os.path.abspath(self.exploration_dir_path),
"libE_history_{}".format("*"),
)
history_files.sort(
key=lambda f: os.path.getmtime(os.path.join(exp_path, f))
)
old_files = old_exploration_history_files + old_libe_history_files
if old_files:
file_evals = [int(file.split("_")[-1][:-4]) for file in old_files]
i_max_evals = np.argmax(np.array(file_evals))
return old_files[i_max_evals]
if history_files:
return history_files[-1]

def _set_default_libe_specs(self) -> None:
"""Set default exploration libe_specs."""
@@ -556,6 +525,13 @@ def _set_default_libe_specs(self) -> None:
# Ensure evaluations of last batch are sent back to the generator.
libE_specs["final_gen_send"] = True

# Save history file on completion and without date information in the
# name, so that it can be overwritten in subsequent calls to `run` or
# when resuming an exploration.
libE_specs["save_H_on_completion"] = True
libE_specs["save_H_with_date"] = False
libE_specs["H_file_prefix"] = self._history_file_prefix

# get specs from generator and evaluator
gen_libE_specs = self.generator.get_libe_specs()
ev_libE_specs = self.evaluator.get_libe_specs()
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -23,7 +23,7 @@ classifiers = [
'Programming Language :: Python :: 3.11',
]
dependencies = [
'libensemble @ git+https://github.com/libensemble/libensemble@develop',
'libensemble >= 1.1.0',
'jinja2',
'ax-platform >= 0.2.9',
'mpi4py',
13 changes: 13 additions & 0 deletions tests/test_exploration_resume.py
@@ -57,6 +57,7 @@ def test_exploration_in_steps():
assert exploration._n_evals == gen.n_completed_trials
assert exploration._n_evals == exploration.max_evals
assert exploration.history["gen_informed"].to_numpy()[-1]
assert count_history_files(exploration.exploration_dir_path) == 1


def test_exploration_in_steps_without_limit():
@@ -144,6 +145,18 @@ def test_exploration_resume():
assert exploration._n_evals == gen.n_completed_trials
assert exploration._n_evals == exploration.max_evals
assert exploration.history["gen_informed"].to_numpy()[-1]
assert count_history_files(exploration.exploration_dir_path) == 1


def count_history_files(exploration_dir):
""" "Count the number of history files in a directory."""
files = os.listdir(exploration_dir)
count = 0
for file in files:
file = str(file)
if file.endswith(".npy") and "_history_" in file:
count += 1
return count


if __name__ == "__main__":
