
Commit

Merge pull request #7 from UrbsLab/main
Merging readme updates from main
raptor419 authored Aug 13, 2024
2 parents 3eaa2e1 + b69c101 commit f89c9d6
Showing 4 changed files with 57 additions and 13 deletions.
36 changes: 32 additions & 4 deletions README.md
@@ -15,6 +15,11 @@ The schematic below summarizes the automated STREAMLINE analysis pipeline with i

* A simple demonstration of STREAMLINE on example biomedical data in our ready-to-run Google Colab Notebook [here](https://colab.research.google.com/drive/14AEfQ5hUPihm9JB2g730Fu3LiQ15Hhj2?usp=sharing).

* A video tutorial playlist covering all aspects of STREAMLINE is available [here](https://www.youtube.com/playlist?list=PLafPhSv1OSDcvu8dcbxb-LHyasQ1ZvxfJ)

### YouTube Overview of STREAMLINE
[![IMAGE ALT TEXT HERE](https://img.youtube.com/vi/xVc4JEbnIs8/0.jpg)](https://www.youtube.com/watch?v=xVc4JEbnIs8)

### Pipeline Design
The goal of STREAMLINE is to provide an easy and transparent framework
to reliably learn predictive associations from tabular data with a particular focus on the needs of biomedical data applications.
@@ -45,10 +50,11 @@ and regression outcome data.
most recent release of STREAMLINE before use. We are actively updating this software as feedback is received.

### Publications and Citations
The first publication detailing STREAMLINE (release Beta 0.2.4) and applying it to
simulated benchmark data can be found [here](https://link.springer.com/chapter/10.1007/978-981-19-8460-0_9).
The most recent publication on STREAMLINE (release Beta 0.3.4), with benchmarking on simulated data and an application investigating obstructive sleep apnea risk prediction as a clinical outcome, is available as a preprint on arXiv [here](https://doi.org/10.48550/arXiv.2312.05461).

This paper is also available as a preprint on arxiv, [here](https://arxiv.org/abs/2206.12002?fbclid=IwAR1toW5AtDJQcna0_9Sj73T9kJvuB-x-swnQETBGQ8lSwBB0z2N1TByEwlw).
The first publication detailing the initial implementation of STREAMLINE (release Beta 0.2.4) and applying it to
simulated benchmark data can be found [here](https://link.springer.com/chapter/10.1007/978-981-19-8460-0_9), or as a preprint on arXiv [here](https://arxiv.org/abs/2206.12002?fbclid=IwAR1toW5AtDJQcna0_9Sj73T9kJvuB-x-swnQETBGQ8lSwBB0z2N1TByEwlw).

See [citations](https://urbslab.github.io/STREAMLINE/citation.html) for more information on citing STREAMLINE, as well as publications applying STREAMLINE and publications on algorithms developed in our research group and incorporated into STREAMLINE.

@@ -111,6 +117,28 @@ We welcome ideas, suggestions on improving the pipeline, [code-contributions](ht

* To ask questions about the codebase or about installing/running STREAMLINE, to report bugs, or to discuss other troubleshooting issues, contact Harsh Bandhey at [email protected].

# Other STREAMLINE Tutorial Videos on YouTube
### A Brief Introduction to Automated Machine Learning
[![IMAGE ALT TEXT HERE](https://img.youtube.com/vi/IjX0phz3LLE/0.jpg)](https://www.youtube.com/watch?v=IjX0phz3LLE)

### A Detailed Walkthrough
[![IMAGE ALT TEXT HERE](https://img.youtube.com/vi/sAB8d1KnMDw/0.jpg)](https://www.youtube.com/watch?v=sAB8d1KnMDw)

### Input Data
[![IMAGE ALT TEXT HERE](https://img.youtube.com/vi/5HnangrEF5E/0.jpg)](https://www.youtube.com/watch?v=5HnangrEF5E)

### Run Parameters
[![IMAGE ALT TEXT HERE](https://img.youtube.com/vi/qMi9vhVag-4/0.jpg)](https://www.youtube.com/watch?v=qMi9vhVag-4)

### Running in Google Colab Notebook
[![IMAGE ALT TEXT HERE](https://img.youtube.com/vi/nknyJWhm7pg/0.jpg)](https://www.youtube.com/watch?v=nknyJWhm7pg)

### Running in Jupyter Notebook
[![IMAGE ALT TEXT HERE](https://img.youtube.com/vi/blat3gAfUaI/0.jpg)](https://www.youtube.com/watch?v=blat3gAfUaI)

### Running From Command Line
[![IMAGE ALT TEXT HERE](https://img.youtube.com/vi/-5yjGxnJ7eI/0.jpg)](https://www.youtube.com/watch?v=-5yjGxnJ7eI)

***
# Acknowledgements
The development of STREAMLINE benefited from feedback across multiple biomedical research collaborators at the University of Pennsylvania, Fox Chase Cancer Center, Cedars Sinai Medical Center, and the University of Kansas Medical Center.
@@ -123,4 +151,4 @@ We also thank the following collaborators for their feedback on application
of the pipeline during development: Shannon Lynch, Rachael Stolzenberg-Solomon,
Ulysses Magalang, Allan Pack, Brendan Keenan, Danielle Mowery, Jason Moore, and Diego Mazzotti.

Funding supporting this work comes from NIH grants: R01 AI173095, U01 AG066833, and P01 HL160471.
28 changes: 22 additions & 6 deletions docs/source/citation.md
@@ -1,6 +1,22 @@
# Citing STREAMLINE

If you use STREAMLINE in a scientific publication, please consider citing the following paper as well as noting the *release* applied within the manuscript (i.e. the Beta 0.2.4 release was applied in the publication below):
If you use STREAMLINE in a scientific publication, please consider citing the following paper as well as noting the *release* applied within the manuscript.

The most recent release (Beta 0.3.4) was applied in the latest preprint, cited below:

[Urbanowicz, Ryan, et al. "STREAMLINE: An Automated Machine Learning Pipeline for Biomedicine Applied to Examine the Utility of Photography-Based Phenotypes for OSA Prediction Across International Sleep Centers." arXiv preprint arXiv:2312.05461.](https://doi.org/10.48550/arXiv.2312.05461)

BibTeX Citation:
```
@article{urbanowicz2023streamlineosa,
title={STREAMLINE: An Automated Machine Learning Pipeline for Biomedicine Applied to Examine the Utility of Photography-Based Phenotypes for OSA Prediction Across International Sleep Centers},
author={Urbanowicz, Ryan J and Bandhey, Harsh and Keenan, Brendan T and Maislin, Greg and Hwang, Sy and Mowery, Danielle L and Lynch, Shannon M and Mazzotti, Diego R and Han, Fang and Li, Quing Yun and Penzel, Thomas and Tufik, Sergio and Bittencourt, Lia and Gislason, Thorarinn and de Chazal, Philip and Singh, Bhajan and McArdle, Nigel and Chen, Ning-Hung and Pack, Allan and Schwab, Richard J and Cistulli, Peter A and Magalang, Ulysses J},
journal={arXiv preprint arXiv:2312.05461},
year={2023}
}
```

The first STREAMLINE publication, which applied the Beta 0.2.4 release, is cited below:

[Urbanowicz, Ryan, et al. "STREAMLINE: A Simple, Transparent, End-To-End Automated Machine Learning Pipeline Facilitating Data Analysis and Algorithm Comparison." Genetic Programming Theory and Practice XIX. Singapore: Springer Nature Singapore, 2023. 201-231.](https://link.springer.com/chapter/10.1007/978-981-19-8460-0_9)

@@ -16,15 +32,15 @@ BibTeX Citation:
}
```

If you wish to cite the STREAMLINE codebase instead, please use the following (indicating the release used in the link, for example, v0.2.5-beta):
If you wish to cite the STREAMLINE codebase instead, please use the following (indicating the release used in the link, for example, v0.3.4-beta):
```
@misc{streamline2022,
@misc{streamline2023,
author = {Urbanowicz, Ryan and Zhang, Robert},
title = {STREAMLINE: A Simple, Transparent, End-To-End Automated Machine Learning Pipeline},
year = {2022},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/UrbsLab/STREAMLINE/releases/tag/v0.2.5-beta} }
howpublished = {\url{https://github.com/UrbsLab/STREAMLINE/releases/tag/v0.3.4-beta} }
}
```
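The instruction above asks for the release to be indicated in the link. As a minimal sketch of that substitution, the Python helper below fills the release tag and year into the same `@misc` template. The function name `codebase_bibtex` is hypothetical and not part of STREAMLINE; the field values are copied from the entry above.

```python
def codebase_bibtex(release_tag, year):
    """Return the codebase @misc entry for a given release tag.

    Hypothetical helper for illustration; the field values mirror the
    entry shown in this document, only the tag and year are substituted.
    """
    url = "https://github.com/UrbsLab/STREAMLINE/releases/tag/" + release_tag
    lines = [
        "@misc{streamline" + str(year) + ",",
        "author = {Urbanowicz, Ryan and Zhang, Robert},",
        "title = {STREAMLINE: A Simple, Transparent, End-To-End Automated Machine Learning Pipeline},",
        "year = {" + str(year) + "},",
        "publisher = {GitHub},",
        "journal = {GitHub repository},",
        "howpublished = {\\url{" + url + "} }",
        "}",
    ]
    return "\n".join(lines)


# Example: the release tag named in the text above.
entry = codebase_bibtex("v0.3.4-beta", 2023)
print(entry)
```

Passing a different tag (for example the one matching the release actually used in an analysis) changes only the citation key, the year, and the URL.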
## STREAMLINE Applications
@@ -101,7 +117,7 @@ A [preprint](https://arxiv.org/abs/2008.12829) describing an early version of wh
}
```

The STREAMLINE [preprint](https://arxiv.org/abs/2206.12002).
The STREAMLINE (v0.2.4) [preprint](https://arxiv.org/abs/2206.12002).
```
@article{urbanowicz2022streamline,
title={STREAMLINE: A Simple, Transparent, End-To-End Automated Machine Learning Pipeline Facilitating Data Analysis and Algorithm Comparison},
4 changes: 2 additions & 2 deletions docs/source/running.md
@@ -196,7 +196,7 @@ The following commands can be run one after the other (in sequence), waiting for

###### Phase 1 - Data Exploration & Processing:
```
python run.py --do-eda --data-path ./data/DemoData --out-path DemoOutput --exp-name demo_experiment --class-label Class --inst-label InstanceID --cf ./data/DemoFeatureTypes/hcc_cat_feat.csv --qf ./data/DemoFeatureTypes/hcc_quant_feat.csv --cv 3 --algorithms NB,LR,DT --run-cluster False --run-parallel True
python run.py --do-eda --data-path ./data/DemoData --out-path DemoOutput --exp-name demo_experiment --class-label Class --inst-label InstanceID --cf ./data/DemoFeatureTypes/hcc_cat_feat.csv --qf ./data/DemoFeatureTypes/hcc_quant_feat.csv --cv 3 --run-cluster False --run-parallel True
```
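The two Phase 1 command lines in this hunk are long enough that the difference between them is easy to miss. The sketch below compares their flags in Python with `shlex`; it assumes the first line is the pre-change command and the second the post-change command, which the "2 additions & 2 deletions" summary for this file suggests. The helper name `parse_flags` is illustrative only.

```python
import shlex

# Portion shared by both Phase 1 command lines, copied from the hunk above.
COMMON = (
    "python run.py --do-eda --data-path ./data/DemoData --out-path DemoOutput "
    "--exp-name demo_experiment --class-label Class --inst-label InstanceID "
    "--cf ./data/DemoFeatureTypes/hcc_cat_feat.csv "
    "--qf ./data/DemoFeatureTypes/hcc_quant_feat.csv --cv 3"
)
FIRST_CMD = COMMON + " --algorithms NB,LR,DT --run-cluster False --run-parallel True"
SECOND_CMD = COMMON + " --run-cluster False --run-parallel True"


def parse_flags(command):
    """Map each --flag to its value (None for bare switches such as --do-eda)."""
    tokens = shlex.split(command)
    flags = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("--"):
            has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
            flags[token] = tokens[i + 1] if has_value else None
            i += 2 if has_value else 1
        else:
            i += 1  # skip "python" and "run.py"
    return flags


first_flags = parse_flags(FIRST_CMD)
second_flags = parse_flags(SECOND_CMD)

# Flags present in only one of the two command lines.
removed = {k: v for k, v in first_flags.items() if k not in second_flags}
added = {k: v for k, v in second_flags.items() if k not in first_flags}
print("removed:", removed)
print("added:", added)
```

Under that assumption, the only change to the Phase 1 command is that `--algorithms NB,LR,DT` no longer appears.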
###### Phase 2 - Imputation and Scaling:
```
@@ -346,7 +346,7 @@ The following commands can be run one after the other (in sequence), waiting for

###### Phase 1 - Data Exploration & Processing:
```
python run.py --do-eda --data-path ./data/DemoData --out-path DemoOutput --exp-name demo_experiment --class-label Class --inst-label InstanceID --cf ./data/DemoFeatureTypes/hcc_cat_feat.csv --qf ./data/DemoFeatureTypes/hcc_quant_feat.csv --cv 3 --algorithms NB,LR,DT --run-cluster SLURM --res-mem 4 --queue defq
python run.py --do-eda --data-path ./data/DemoData --out-path DemoOutput --exp-name demo_experiment --class-label Class --inst-label InstanceID --cf ./data/DemoFeatureTypes/hcc_cat_feat.csv --qf ./data/DemoFeatureTypes/hcc_quant_feat.csv --cv 3 --run-cluster SLURM --res-mem 4 --queue defq
```
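The local and SLURM variants of the Phase 1 command share every argument except the trailing run-mode flags. A hedged sketch of assembling both from one base list in Python follows; the names `BASE_ARGS`, `LOCAL_ARGS`, `SLURM_ARGS`, and `phase1_command` are illustrative, and all flag values are copied from the commands shown in this diff rather than from STREAMLINE's argument parser.

```python
import shlex

# Arguments common to the local and SLURM Phase 1 commands in this diff.
BASE_ARGS = [
    "python", "run.py", "--do-eda",
    "--data-path", "./data/DemoData",
    "--out-path", "DemoOutput",
    "--exp-name", "demo_experiment",
    "--class-label", "Class",
    "--inst-label", "InstanceID",
    "--cf", "./data/DemoFeatureTypes/hcc_cat_feat.csv",
    "--qf", "./data/DemoFeatureTypes/hcc_quant_feat.csv",
    "--cv", "3",
]

# Run-mode flags as they appear in the two documented variants.
LOCAL_ARGS = ["--run-cluster", "False", "--run-parallel", "True"]
SLURM_ARGS = ["--run-cluster", "SLURM", "--res-mem", "4", "--queue", "defq"]


def phase1_command(on_cluster):
    """Return the Phase 1 argument list for a local or SLURM run."""
    return BASE_ARGS + (SLURM_ARGS if on_cluster else LOCAL_ARGS)


local_cmd = shlex.join(phase1_command(on_cluster=False))
slurm_cmd = shlex.join(phase1_command(on_cluster=True))
print(local_cmd)
print(slurm_cmd)
```

The argument list could be handed to `subprocess.run(phase1_command(False), check=True)` from the repository root; that step is left out here because it needs a STREAMLINE checkout and the demo data.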
###### Phase 2 - Imputation and Scaling:
```
2 changes: 1 addition & 1 deletion streamline/__init__.py
@@ -3,6 +3,6 @@
Pipeline for Supervised Learning in Tabular Binary Classification Data
"""

__version__ = "0.3.3"
__version__ = "0.3.4"
__author__ = 'Harsh Bandhey and Ryan Urbanowicz'
__credits__ = 'UrbsLabs'
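
Since the citation guidance asks users to note the release they applied, the version string bumped here is worth checking at run time. The sketch below compares dotted version strings in Python; `version_tuple` is a hypothetical helper, and it assumes purely numeric components such as the `0.3.3` and `0.3.4` values in this hunk.

```python
def version_tuple(version):
    """Convert a dotted numeric version such as '0.3.4' to a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# The two values that appear in this hunk of streamline/__init__.py.
previous_version = "0.3.3"
current_version = "0.3.4"

is_newer = version_tuple(current_version) > version_tuple(previous_version)
print(current_version, "newer than", previous_version, ":", is_newer)
```

With STREAMLINE installed, `streamline.__version__` could be passed to `version_tuple` in place of the literal strings.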
