Skip to content

Distributed DataStep pipeline to process 3D microscopy images, perform large-scale statistical analysis, and generate summary visualizations for the data.

License

Notifications You must be signed in to change notification settings

AllenCellModeling/polar_express

Repository files navigation

polar_express

Build Status Documentation Code Coverage

Polar Express: 3D Image Analysis Pipeline


A DataStep pipeline to process 3D single-cell microscopy images, perform large-scale statistical analysis on GFP intensity distribution for target organelles, and generate summary visualizations for the data.

Features

All steps and functionality in this package can be run as single steps or all together by using the command line.

In general, all commands for this package will follow the format: polar_express {step} {command}

  • step is the name of the step, such as "selectdata" or "computecellmetrics"
  • command is what you want that step to do, such as "run" or "push"

Example datasets can be accessed here.

Pipeline

To run the entire pipeline from start to finish you can simply run:

polar_express all run --dataset {path to dataset}

Step specific parameters can additionally be passed by simply appending them. For example: the step computecellmetrics has a parameter for num_angular_compartments and this can be set on both the individual step run level and also for the entire pipeline with:

polar_express all run --dataset {path to dataset} --num_angular_compartments {integer}

See the steps module in our documentation for a full list of parameters for each step

Pipeline Config

A configuration file can be provided to the underlying datastep library that manages the data storage and upload of the steps in this workflow.

The config file should simply be called workflow_config.json and be available from whichever directory you run polar_express from. If this config is not found in the current working directory, defaults are selected by the datastep package.

Here is an example of our production config:

{
    "quilt_storage_bucket": "s3://allencell",
    "project_local_staging_dir": "/allen/aics/modeling/william.chen/results/polar_express"
}

You can even additionally attach step-specific configuration in this file by using the name of the step like so:

{
    "quilt_storage_bucket": "s3://example_config_7",
    "project_local_staging_dir": "example/config/7",
    "example": {
        "step_local_staging_dir": "example/step/local/staging/"
    }
}

Individual Steps

  • polar_express selectdata run --dataset {path to dataset}, Load 3D cell images filtered for the target structure or organelle, or alternatively, generate artificial cells.
  • polar_express computecellmetrics run, Compute and save key metrics relating to the structure of the cell and GFP distribution.
  • polar_express gathertestvisualize run, Generate visualizations for the compartmentalized GFP distributions, including polar heatmaps, and run statistical tests over the entire dataset.

Installation

Install Requires: The python package, numpy, must be installed prior to the installation of this package: pip install numpy

Stable Release: pip install polar_express
Development Head: pip install git+https://github.com/AllenCellModeling/polar_express.git

Documentation

For full package documentation please visit allencellmodeling.github.io/polar_express.

Development

See CONTRIBUTING.md for information related to developing the code.

For more details on how this pipeline is constructed please see cookiecutter-stepworkflow and datastep.

To add new steps to this pipeline, run make_new_step and follow the instructions in CONTRIBUTING.md

Developer Installation

The following two commands will install the package with dev dependencies in editable mode and download all resources required for testing.

pip install -e .[dev]
python scripts/download_test_data.py

Free software: Allen Institute Software License

About

Distributed DataStep pipeline to process 3D microscopy images, perform large-scale statistical analysis, and generate summary visualizations for the data.

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published