
The mildly ominous parameter exploration toolkit 🛵 - Run huge simulations on distributed clusters, because why not? 🤯


caglorithm/mopet


Build 10.5281/zenodo.3941539 Release codecov Code style: black

mopet 🛵

The mildly ominous parameter exploration toolkit

Isn't it strange that, although parameter explorations are a crucial part of computational modeling, there are almost no Python tools available for making your life easier? mopet is here to help! You can run extensive grid searches in parallel (powered by ray) and store huge amounts of data in an HDF file (powered by pytables) for later analysis - or whatever your excuse is for buying yet another hard disk.

Installation 💻

The easiest way to get going is to install the PyPI package using pip:

pip install mopet

Alternatively, you can also clone this repository and install all dependencies with

git clone https://github.com/caglorithm/mopet.git
cd mopet/
pip install -r requirements.txt
pip install .

Example usage 🐝

Feel free to have a look at the Documentation page. Setting up an exploration is as easy as can be!

# first we define a toy evaluation function
def distance_from_circle(params):
    # let's simply calculate the distance of
    # the x-y parameters to the unit circle
    distance = abs((params["x"] ** 2 + params["y"] ** 2) - 1)

    # we package the result into a dictionary
    result = {"result": distance}
    return result
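
Before launching a full exploration, it can help to call the evaluation function by hand on a few parameter dictionaries. The sketch below repeats the function so that it runs on its own, and assumes the expression subtracts 1, as the unit-circle comment suggests. The sample points are hypothetical and chosen only for illustration. Note that the returned value is |x² + y² - 1|, which is zero on the circle but is not the Euclidean distance to it.

```python
def distance_from_circle(params):
    # assumption: subtract 1, the squared radius of the unit circle
    distance = abs((params["x"] ** 2 + params["y"] ** 2) - 1)
    return {"result": distance}


# hypothetical sample points, chosen only for illustration
origin = distance_from_circle({"x": 0.0, "y": 0.0})
on_circle = distance_from_circle({"x": 1.0, "y": 0.0})
outside = distance_from_circle({"x": 2.0, "y": 0.0})

print(origin, on_circle, outside)
```

Each call receives one dictionary with a single value per parameter name and returns a dictionary of results, which is the same shape the exploration works with.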

Let's set up the exploration by defining the parameters to explore and passing the evaluation function from above:

import numpy as np
import mopet

explore_params = {"x": np.linspace(-2, 2, 21), "y": np.linspace(-2, 2, 21)}
ex = mopet.Exploration(distance_from_circle, explore_params)

The exploration runs in parallel and is handled by ray. You can also use a private cluster or cloud infrastructure; see here for more info.

ex.run()
>> 100%|██████████| 441/441 [426.57it/s]
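
The 441 runs in the progress bar match a full grid over 21 values of x and 21 values of y. Here is a minimal sketch of how such a grid could be enumerated, assuming one run per element of the Cartesian product of the parameter lists. That assumption is inferred from the run count and is not a statement about mopet's internals. The `linspace` helper is a hypothetical plain-Python stand-in for `np.linspace`, so the sketch has no dependencies.

```python
import itertools


def linspace(start, stop, num):
    # plain-Python stand-in for np.linspace (hypothetical helper)
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


explore_params = {"x": linspace(-2, 2, 21), "y": linspace(-2, 2, 21)}

# assumption: one run per element of the Cartesian product of all value lists
names = list(explore_params)
runs = [
    dict(zip(names, combo))
    for combo in itertools.product(*explore_params.values())
]

print(len(runs))  # number of parameter combinations
print(runs[0])    # first combination as a params-style dictionary
```

Under this assumption the number of runs grows multiplicatively with every added parameter, so it is worth checking the grid size before starting a long exploration.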

After your exploration has finished, you will find a file exploration.h5 in your current directory with all the runs, their parameters and their outputs, neatly organized. If you open this file (with HDFView for example), you'll see something like this:

[Screenshot: exploration.h5 opened in HDFView]

Loading exploration results

You can load the exploration results using

ex.load_results(arrays=True)

Note that using arrays=True will load all results into memory (as opposed to just the parameters of each run). Please make sure that you have enough free memory for this since your simulation results could be huge. If you do not want this, you can load individual results using their run_id (which is an integer counting up one per run):

ex.get_run(run_id=0)

After using ex.load_results(), an overview of all runs and their parameters is given as a pandas DataFrame, available as ex.df. Using ex.load_results() with the default parameters will automatically aggregate all scalar results into this table, like distance in our example above, which is a float.

Using some fancy pivoting, we can create a 2D matrix with the results as entries:

pivoted = ex.df.pivot_table(values='result', index = 'y', columns='x', aggfunc='first')
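
To make explicit what the pivot produces, here is a dependency-free sketch that builds the same kind of matrix by hand: rows indexed by y, columns indexed by x, and each cell holding the result for that (x, y) pair. The `records` list is a hypothetical stand-in for the rows of `ex.df` on a tiny 3x3 grid, and the result formula assumes the unit-circle expression from the evaluation function.

```python
# hypothetical records standing in for rows of ex.df (illustrative 3x3 grid)
records = [
    {"x": x, "y": y, "result": abs(x ** 2 + y ** 2 - 1)}
    for x in (-1.0, 0.0, 1.0)
    for y in (-1.0, 0.0, 1.0)
]

xs = sorted({r["x"] for r in records})  # column labels
ys = sorted({r["y"] for r in records})  # row labels

# map each (y, x) pair to its result; pairs are unique on a grid
lookup = {(r["y"], r["x"]): r["result"] for r in records}

# rows follow y, columns follow x, mirroring index='y', columns='x'
matrix = [[lookup[(y, x)] for x in xs] for y in ys]

for row in matrix:
    print(row)
```

Because every (x, y) pair occurs exactly once in a grid search, the choice of aggregation does not change the values; `aggfunc='first'` simply picks the single entry per cell.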

Let's plot the results!

import matplotlib.pyplot as plt

# the extent maps the matrix onto the explored x and y ranges
plt.imshow(pivoted,
           extent=[min(ex.df.x), max(ex.df.x),
                   min(ex.df.y), max(ex.df.y)],
           origin='lower')
plt.colorbar(label='Distance from unit circle')
plt.xlabel("x")
plt.ylabel("y")

More information 📓

Inspired by 🤔

mopet is inspired by pypet, a wonderful Python parameter exploration toolkit. I have been using pypet for a very long time and I'm grateful for its existence! Unfortunately, the project is not maintained anymore and has run into several compatibility issues, which was the primary reason why I built mopet.

Built With 💞

mopet is built on other amazing open source projects:

  • ray - A fast and simple framework for building and running distributed applications.
  • pytables - A Python package to manage extremely large amounts of data.
  • tqdm - A Fast, Extensible Progress Bar for Python and CLI
  • pandas - Flexible and powerful data analysis / manipulation library for Python
  • numpy - The fundamental package for scientific computing with Python
