
Commit

Merge branch 'develop' into 'master'
Release v0.4.0

See merge request tron/addannot!116
Pablo Riesgo Ferreiro committed Dec 14, 2020
2 parents af3831e + c3ead68 commit fa010e9
Showing 125 changed files with 32,747 additions and 3,598 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -9,3 +9,5 @@ neofox.egg-info/
neofox/tests/resources/output*
**/dask-worker-space
docs/build
.ipynb_checkpoints
neofox-dask-report*
5 changes: 4 additions & 1 deletion MANIFEST.in
@@ -4,4 +4,7 @@ include neofox/published_features/Tcell_predictor/amino-acids-features.pickle
include neofox/published_features/Tcell_predictor/genes-expression.pickle
include neofox/published_features/Tcell_predictor/SIRdata.mat
include neofox/published_features/Tcell_predictor/Classifier.pickle
include neofox/references/install_r_dependencies.R
include neofox/references/install_r_dependencies.R
include neofox/expression_imputation/tcga_cohort_code.tab
include neofox/expression_imputation/tcga_exp_summary_modified.tab.gz.tbi
include neofox/expression_imputation/tcga_exp_summary_modified.tab.gz
180 changes: 107 additions & 73 deletions README.md

Large diffs are not rendered by default.

Binary file added docs/figures/figure1_v3.png
2 changes: 1 addition & 1 deletion docs/figures/neofox_model.drawio
@@ -1 +1 @@
<mxfile host="app.diagrams.net" modified="2020-09-23T08:52:24.028Z" agent="5.0 (X11)" etag="2QvKU90HMuFHNitT39tW" version="13.6.9" type="device"><diagram id="t43y8l8tPbXxPeGLrGI9" name="Page-1">7Zpbc5s4FMc/jR/b4WIufnRubXfSTHY7O22fGBkEaCMQK+TE7qdfAQIEwrfEOM06TR+sowtI58dfOgcm5mWy+kRBFn8lAcQTQwtWE/NqYhj61DAmxX8tWAuLNXMqS0RRIGyt4Rv6BYVRE9YlCmDeacgIwQxlXaNP0hT6rGMDlJKnbrOQ4O5VMxBBxfDNB1i1fkcBi4VV17S24jNEUSwu7VqiIgF1Y2HIYxCQJ8lkXk/MS0oIq34lq0uIi9Wr16Xqd7OhtrkxClO2Twf4JVri73+Y7nyehH8uLrzwb/xBn4qbY+t6xjDgCyCKhLKYRCQF+Lq1XlCyTANYDKvxUtvmlpCMG3Vu/AcythbeBEtGuClmCRa1cIXYD+n3z2Koj5YoXa3EyGVhXRdSRtc/5ILUqyi23cpS3S8kKbsBCcKF4ZIsKYKUT/gOPpVNg3kBCa9KSQoryw3CWHQOQB6XEy3uNGeUPDQUGMXYvOUlwYSWK2eGVvHXtJRq7PIfr4koCBC/QaluUf7xusoTxfJv9LAw5XwaPtzmVvGkABpBtq2d0YDIH2FIEsiXjnekEAOGHrs3AsSjFDXtWtr4DwHcIfBV4z4CvBRXuoMEpAxFMFWwzJ9QgkHpIz9GOLgFa7Is5pYz4D/UpYuYUPSLuxzgxmuAMkGiqXVafCt6Ck+3TOsCGtGHPyGiLO5FG0SBwpy3v699pTWmW5Cz+lZqBeigVRQARlHKf/u8N8dzB7UDEDkQ2FBTkQwAdEN/EEnfhYuwwe4RUgZX28FTOWl0XmhcrfK2KD+1kmnUihlLaqmb2khsGQpbfB/haIXlQtp8AVZzPvmMY87N3mCljwtVQ2ztwZwhLuiIpFI1TYEHVxl3c65WPAKKOMoewBhi6IUU/ruEqb+WmgX7NVOehBoWDEOmeFbImIyBMOUZ8FEa3Za9rqY9yguGCwYQ3/rm4gKskPMKY7Bo2KeEASaVOSY14HuhtFkKVL4ET+a+OGlj4WQqOH2CfFXfRertiJTZE6mmLFHVCJdM1WwsqKYKVCDPYbLA8uPP90Lo5etkQbBkZRSkuU9R1pWuM1AK81ClGPSpPZZPLcWnX5dM7BzvYvFmxcLaFyzdGIssWyErIznqnUmeOEceW2fQ40udEv4cB1JtUqAIA2+VdM45hTx4IUfxgT/1HoVR7zhTzG5L/VnIjnUU2RltK5kpTjhhGN+G7j+lmsPCeONFcbyqXlJkf/3XUGi/LRI/PK6fHC92F5HLzth9OkzkiUJ37VWAGwWe52D8DODKbbvfpDT+n6i0X5VKNaOkYqqk/DpJPom2gXzf4RhpWzFqgW7zmA1823h+64lHdxgTaf+0BrbP2rY3TeIK9wTxiTSHO8voHu6m/TNbNU3RS06p9wdyeiGl0xuoWgdloBLuZtrP591RcL+v8lnvscZBscbMCTTHGYg1LOgG06HnyjUWpiSxL8uemlaHog/WQLpLP2ms4SpgDSZIUe4VuU7wCBAuD9xSqBH73heR1sz79qGKs4ginEOjiEG/jxZF6GrWvH0jM09TUmUy8nd5OUhewjCYlWLR37bD0PAHUxmBvbCtI8mLbXc3Kd22VMzck8qLrqbT04az4XcxoKKPDNo8vkC9dzAMJZDTlWRyIgNWG3vuxdy/56Y9m971bhGfISrGEx81wdVKzrvivCHFcXuKY5qq4pw2PaarJ+UUJPJ5pao5N0nYEK3/Li9TdFdxw++cZeqkgBKQrgdyQAelFI4b5u+VgJocMRcgdvndXyFtOBafKGk0U7SBUzDXmxvgB94YZSqJasLvBcmjt5Iq3B
zFdiTfVaVjxISO43QjaMt6ZkKn+Xpz00AbEjrqQNrgctTDkDDM4SgpofrMvmcKtNGps0T3lejs0/BsOvsDjZxurFHqKeXdTqXkrrkFC4i7nCmn3P5JKEFBUG3pvcNQVsywnLN1MbGudhG68YAkPhoXI0+aT7V3HpykD7iHtFD7aGq2cxR4dM1+uXTwYvudedW8/VzfvP4P</diagram></mxfile>
<mxfile host="app.diagrams.net" modified="2020-12-10T21:16:17.838Z" agent="5.0 (X11)" etag="aGAbwFJlYRFj5Cy99uKH" version="14.0.0" type="device"><diagram id="t43y8l8tPbXxPeGLrGI9" name="Page-1">7V1bd9q4Gv01PMLyBV94hJB0pitt02Z62nliCSyDJ8ZibJNAf/2RjOSLJBs72CRlaLqCJcuy0Le1tfXpkp5+s959CMFm9Qk50O9pirPr6dOepqlDTeuR/4qzpzGmphxilqHn0Lgs4tH7BWkkS7b1HBgVEsYI+bG3KUYuUBDARVyIA2GIXorJXOQX37oBSyhEPC6AL8b+8Jx4RWNVRclu/AG95Yq+2jbojTVgiWlEtAIOeslF6bc9/SZEKD5crXc30Ce1x+rl8Nxdyd20YCEM4joPTFd/ffxf/8d2M3rZKRMYfN/9Pe6zan4G/pZ+Y1raeM+qYBmi7UZ8Gy3AMwxjuJPZAsxZDtnXxUCBaA3jcI/T0adYjVGI9IcjGvGSVbhm0rhVrq5NZgRAjbxM887qAV/QqmhSLcdrBVdK4ECSidLTJy8rL4aPG7Agd19wW8Bxq3iNXzpV8aXr+f4N8lGYPKs7ANruAsdHcYieYO6OubDh3CV3GFrI48sQOB6u+FxCN/mH74mGqTY0by7RLGepdvjncuv/+Kjb4/Ha/TqfzNzvvqTaP0MEgthbwkAwQPTirX0Q4NBksfJ85x7s0ZYUMYrB4omFJisUer9QEANmC3w7jCnR6EohxSN5klq0aF8X36fPqEMWpmVRUkMyhtDI8zDC6R+YXZQ06h5EMStK3sgOiFbp+4DvLQN8vcBPw5C+8A6sPZ+0khu0DT0crSmf4YscHxYEJlReBz1asfdekH69JxgvVjSw9EEU0ev20Ufv6lqBFGSUoEiwqepdYVMTsIk7JgxNNzGEiStwN8YVsgExscRMehPDGOaCYQBmcLfBsIg8FHA3nkHoYejPgO9DH87cEP67hcFin0vm1EsmtBwGLh+6sYCEACVNKg8bGhVhevOC5X3y1HTItQqCeWJeD/edY/qCGG0Y7GlncGhZMYhzYYwe1iBEOFXyxHE0UfjodeHTGbUZAnw+bXE1JHa/EtuZiC3PYp0xl6rqBerSbEsAn6pJwGd1hT1TwN4LhtUs3m/gbLcuMNSagBI6NPrSeMOobcty3pCabtiZIlIEK0AHD1BoEIXxCi1RAPzbLJZr4Vmae0TqNamwf2Ac76kJwDZGRbmKazHc/yTPDwwW/JtmlwSmu0JonyOT0rYMd158yNNSDBpOMh2MRjSYZUsCLFeRgmDgjMnYDgdvvx1AhqPuPFK39BGMDT5JEplLJCGXefIjkotrkB8puST/aJEfYOhhuxNeew3HRLjGFrACDVSD4O+xhDVwTpBSifIQ+rgLei4OdWUYTh7F9Qn2uQQb5AVxlMv5gUTkiNDiiNDkBqjN0uOLQwmy1pR+ldeP9IaSkZ7pxxTMxGMAqEnMf7dkqD4p4jqNxldL8nnvReQx5OJfQTp6iViuuJCHjA+phdaNiSkuNsa6NMuxM8+pa89xEnaQDVUlmoAWSeXo2uy06yx6AtSaqk3ryg0gdpy3O9y6MZWStwUBuiq4S1RwZhGGQ02iAuyOBJz0K1gCDgOwzo8kD3cuZMhX2RRPkm4yo7Uh3aQlHr2JcktFlpbXWOyWXGDl5J6qaU0EXxLiVU8t8VWQcL9giP5Cn0CwLyg5sffD+LYV8iPltRP7qaPyi80/5ORXFVbbk18n4VB0XDEt4njP5IW05WZSJiEJQdiwp+YhH4OLlWRVImlO8pQb0HaGUhxocz0R3jy1vd5xXt+txJyUnGCRTl2YqoR1WvGh77Z/fPzQ/xlHT8r3x9vpyJ//E/Q1+4i9M8swC5N204+ShjPGCXR9s8ubX4IQ2veXYoRq5+avkqjttbfANx
4B0dDKp0fxpZ9vv9x9+VkqsGth9Agqi6DK+b8LgoQfNALTVkoVE5N5hqig+G8sQf+NpRsKU38ydHeMfG7OTpVIdduS4N7uqrcV/asn8hDXl3Asw5vadbWFVLo65tw0zDPzEm8dCStJrZOOuNr3Y9lvooZSYVMQNVQbVeqanDpZ87qkXH2UDoOa+5OqvFP1FBaVgkz75b96qRQsBd9xcaSL6kie0HoreSQvzkhgDlxjYzUtAAqilbcR4Sv6IAWnTW14vLn3stTuR61UXPqxL9J8jnAMCd+wuGZ+SMFxaI60gTYqFGBocEx2wC99sMIJaZnW0bwOED+el61IK4Vlg1w3goUcXuHUlPuJa6zqkVHcfxLAb4dRHhCnYJTPqwSjrSFMHFUS1vx8lDWxNe7BHPpFtAkKudR7zXuW6CQEztyY9IzpMZyKMKluQCcvblMGumJarUBIVcwOCEQ+iBQ934IpsaDdkEtvnSyhzJtT7hRksYn9H1DkJb5zfTpHcYzWxO9AbkzA4mmZKD+ZOE5eNo42h6WeyeCKBVxvR7TihJZnuopjskZ0TCpHu1s4gTrwsPFcD2vKcLDAb8RDQhAD/EHi8VjrDleot4AY3KRGR9hAdx9gAEPgz74kd2aG1Vc1e7AJloI8PT5ElLg3rNuxeVupu+qLfpuN/NMxmS10wuZQJDgW176yEjkiW843TmdPIhFZ1+mTiukT13VGSZf3ijFo9fRJY8jxE3ZpZ5mfsZM53btbZ8WGIfm5khR08lV5dCIPSeNmuDa41Xmxt4YYauvNpUy4VLfekxbZdTVNJi+y2GuNr5O0vz3LmHpR62qGIQBNlS1Ub8PZWOnC+I9MyFY3tpNmZGVm44fbrZlNtNrDYbX2lRsaccPIchTLErmhxmRdgRskGw2q9xY0n6DTiwKlP9Qk3CEb9nfGHeL0nFSUeNGM7AIAz8Dzk3Z6yVxi1bZrBZVUOW/an2UVrPhptVAFy1x55DJ4JKUNxiO6KvKIxuLa5hEpAiXjHCJBbrTeWGntt6b0jAmZqO9ZuJTKJL26Ib62jJJ+7ZfEn0NqLHuy7CKX9Yq4r2iWK+KRYddw7aUBH2FTGtPfmvYqSeSkEZYMdJ3Rnth5YdqTuAavtHeZtKcpEtqTbYgxu0KgOFvbKu0lrDR9SLln+jW7/HaRJGTXBkUFCckg0MZOXGmJJWcWEBb6kGyuvTLRZTIRN5eOQSBhoq5WnMlhKJ5W0JCKKqXR9GGsYuKZPkySj6+H0NdD6Bv+kAuwVxFeMxl2GQPRamo5iQ1lMOxMkqniDjbChn9GyEXh+kqIF0uIRWkmGY9KPSKdCTNVXIHL+cQvkC6Gtc3W0HHVmXiSssU4OTzkyhUXyhUWt6Gxr+k12aKVeXopDkW2cLe+P+Mogwty5+fQg8kumGA645c25uClJbYEI7zndf7NtyrW3AxQvm2xtD0fXWFPR+r5BfZVnuHzr6+XOzn194SIswKi3q7V7gFhtw2I5NGmx4EMFc6ZqFoGh67TzveQfnfzbdHX7PycOsCqgc8mdFSyS6hXuV+81lE3EhRX0dVRFA87QbEIU7bwLYNpvWX6bSFWFSdZDof7iQtlz7CavqZQKVsM1Hi9fF8ZqIpRXPDFJspOtGxf52czijl0t5xefVd94HlVUb1OMDu3QrW1HBWpA1XXz0hGo5pkxJy/b92nmhykTa37LlV94z71d0IzobI8msleoHeI5tYPjDvNlSiuKLgCrARgtmbmAIYBp56TLtl8yfnF22kAE1cCv3+JVboh4xV7EjXL5NbUt6KwzIEyLGY7MM6lsdI/s3BW0mje9uXHlTZv+sUBW/WpDrnxqFUYkKa8V0JtJzCDUZMZujmrVIQmN6wbcYO6kv3STQWZzZ1h2vIRpnLkM2LI2GwOyb5W/ryuxQp44j6o98Nwbe66Vi2ruB2Wnd52qnugmKl9ti3Z7JySN/NkaY2YgyPGwwl/XwR6VM9Dj22celM5A/xGNNeUnqxRsV
Ho1Scy88n7qn4Gl206wZlXZ5vVb0ho6dlCbRCazXHP8DQ+S3dEWZ1oNPlJm6LubvG0bfqnVy78qO3me+A0bvZ/KDmdqa1ztXEw+4tmB9hkfxhOv/0/</diagram></mxfile>
Binary file modified docs/figures/neofox_model.png
2 changes: 1 addition & 1 deletion docs/requirements.txt
@@ -6,4 +6,4 @@ ipython==7.16.1
ipykernel==5.3.4
#sphinx_rtd_theme==0.5.0
#insipid_sphinx_theme==0.2.1
pydata-sphinx-theme==0.4.1
pydata-sphinx-theme==0.4.1
Binary file added docs/resources/column_description.xlsx
Binary file not shown.
55 changes: 50 additions & 5 deletions docs/source/01_overview.md
@@ -1,14 +1,59 @@
# Overview

**TODO**
Welcome to the documentation of **NeoFox**!

Write here an overview of Neofox
[![DOI](https://zenodo.org/badge/294667387.svg)](https://zenodo.org/badge/latestdoi/294667387)
[![PyPI version](https://badge.fury.io/py/neofox.svg)](https://badge.fury.io/py/neofox)

## About NeoFox

Neoantigens are tumor-specific antigens encoded by somatic mutations. Their breakdown products (neoepitopes) are presented by the Major Histocompatibility Complex (MHC) on the surface of tumor cells, enabling T-cells to recognize these neoepitope sequences as foreign. This neoantigen-specific T-cell recognition may induce a potent anti-tumoral response, which makes neoantigens highly interesting targets for cancer immunotherapy. Conventionally, neoantigen candidates are predicted by calling mutations from tumor and normal genome sequencing data, selecting the non-synonymous mutations, and translating them into small peptides or amino acid sequences.
The final step requires algorithms that predict the likelihood that a neoantigen candidate sequence is indeed a true neoantigen.
Several neoantigen features that describe the ability of a neoantigen candidate sequence to induce a T-cell response have been published in recent years.

## Data models
**NeoFox** (**NEO**antigen **F**eature toolb**OX**) is a Python package that annotates a given set of neoantigen candidate sequences derived from point mutations with relevant neoantigen features.
NeoFox covers neoepitope prediction by MHC binding and ligand prediction, similarity/foreignness of a neoepitope candidate sequence, combinatorial features and machine learning approaches. The implemented features and their references are listed in Table 1.

Protocol buffers is employed to model Neofox's input and output data: neoantigens, Major Histocompatibility Complex (MHC) alleles and annotations.
![Neofox model](../figures/neofox_model.png)
**Table 1**

| Name | Reference | DOI |
|---------------------------------------------------------|--------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|
| MHC I binding affinity/rank score (netMHCpan-v4.0) | Jurtz et al., 2017, The Journal of Immunology | https://doi.org/10.4049/jimmunol.1700893 |
| MHC II binding affinity/rank score (netMHCIIpan-v3.2) | Jensen et al., 2018, Immunology | https://doi.org/10.1111/imm.12889 |
| MixMHCpred score v2.1 | Bassani-Sternberg et al., 2017, PLoS Comp Bio; Gfeller, 2018, J Immunol. | https://doi.org/10.1371/journal.pcbi.1005725 , https://doi.org/10.4049/jimmunol.1800914 |
| MixMHC2pred score v1.2                                  | Racle et al., 2019, Nat. Biotech.                                        | https://doi.org/10.1038/s41587-019-0289-6                                                 |
| Differential Agretopicity Index (DAI) | Duan et al., 2014, JEM; Ghorani et al., 2018, Ann Oncol. | https://doi.org/10.1084/jem.20141308 |
| Self-Similarity | Bjerregaard et al., 2017, Front Immunol. | https://doi.org/10.3389/fimmu.2017.01566 |
| IEDB immunogenicity | Calis et al., 2013, PLoS Comput Biol. | https://doi.org/10.1371/journal.pcbi.1003266 |
| Neoantigen dissimilarity | Richman et al., 2019, Cell Systems | https://doi.org/10.1016/j.cels.2019.08.009 |
| PHBR-I | Marty et al., 2017, Cell | https://doi.org/10.1016/j.cell.2017.09.050 |
| PHBR-II | Marty Pyke et al., 2018, Cell | https://doi.org/10.1016/j.cell.2018.08.048 |
| Generator rate | Rech et al., 2018, Cancer Immunology Research | https://doi.org/10.1158/2326-6066.CIR-17-0559 |
| Recognition potential                                   | Łuksza et al., 2017, Nature; Balachandran et al., 2017, Nature           | https://doi.org/10.1038/nature24473 , https://doi.org/10.1038/nature24462                 |
| Vaxrank | Rubinsteyn, 2017, Front Immunol | https://doi.org/10.3389/fimmu.2017.01807 |
| Priority score | Bjerregaard et al., 2017, Cancer Immunol Immunother. | https://doi.org/10.1007/s00262-017-2001-3 |
| Tcell predictor | Besser et al., 2019, Journal for ImmunoTherapy of Cancer | https://doi.org/10.1186/s40425-019-0595-z |
| neoag | Smith et al., 2019, Cancer Immunology Research | https://doi.org/10.1158/2326-6066.CIR-19-0155 |


Besides comprehensive annotation of neoantigen candidates, NeoFox creates biologically meaningful representations of
neoantigens and related biological entities as programmatic models. For this purpose, Protocol Buffers is employed to
model NeoFox's input and output data: neoantigens, patients, MHC alleles and neoantigen feature annotations (Figure 1).
Of note, this modelling allows users to extend NeoFox with custom neoantigen features, e.g. for benchmarking studies.


**Figure 1**

![Neofox model](../figures/figure1_v3.png)

For detailed information about the required input data, output data and usage please refer to the [User guide](03_user_guide.rst).

The data models are described in more detail [here](05_models.md).

Happy annotation and modelling!

## Contact information
For questions, please contact Franziska Lang ([[email protected]](mailto:[email protected])) or Pablo Riesgo Ferreiro ([[email protected]](mailto:[email protected])).

## How to cite
Franziska Lang, & Pablo Riesgo Ferreiro. (2020). TRON-Bioinformatics/neofox: Neofox v0.4.0. Zenodo. http://doi.org/10.5281/zenodo.4090421
21 changes: 20 additions & 1 deletion docs/source/02_installation.md
@@ -72,6 +72,7 @@ export NEOFOX_MIXMHCPRED=`pwd`/MixMHCpred-2.1/MixMHCpred
Configure MixMHCpred as explained in the file `MixMHCpred-2.0.1/README`

### Install MixMHC2pred 1.2 (recommended but optional)

```
wget https://github.com/GfellerLab/MixMHC2pred/archive/v1.2.tar.gz
tar -xvf v1.2.tar.gz
@@ -101,4 +102,22 @@ caret
Peptides
doParallel
gbm
```
```

Export the path to the reference folder as an environment variable:
```
export NEOFOX_REFERENCE_FOLDER=path/to/reference/folder
```
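
Before running NeoFox, it can help to confirm that the variable exported above is actually visible to the Python process. The following is a minimal sketch rather than part of NeoFox itself: the helper `check_reference_folder` is a hypothetical name, and the only thing taken from the documentation is the variable name `NEOFOX_REFERENCE_FOLDER`.

```python
import os
from pathlib import Path


def check_reference_folder(env_var: str = "NEOFOX_REFERENCE_FOLDER") -> Path:
    """Return the configured reference folder, or raise if it is unusable.

    Hypothetical helper for illustration; NeoFox may perform its own checks.
    """
    value = os.environ.get(env_var)
    if not value:
        # The variable was never exported in this shell session
        raise EnvironmentError(f"{env_var} is not set")
    folder = Path(value).expanduser()
    if not folder.is_dir():
        # The variable is set but does not point to an existing directory
        raise NotADirectoryError(f"{env_var} points to a missing folder: {folder}")
    return folder
```

Calling `check_reference_folder()` in the same shell session that ran the `export` should return the folder path; an exception suggests the variable was not exported or the path is wrong.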

## Test installation

To verify that all installation steps succeeded, run NeoFox on the provided test data, which can be downloaded here:
:download:`test_data.txt <test_data.txt>`
:download:`test_patients.txt <test_patients.txt>`

````commandline
neofox --model-file /path/to/test_data.txt --patient-data /path/to/test_patients.txt --output-folder /path/to/outputfolder --with-short-wide-table --with-tall-skinny-table --with-json --output-prefix test
````

The resulting output files can be compared to the following test output file:
:download:`test_neoantigen_candidates_annotated.tsv <test_neoantigen_candidates_annotated.tsv>`
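
One way to carry out that comparison is a cell-by-cell check of the two tab-separated files. The sketch below uses only the Python standard library; the function names are hypothetical, and the name of the file NeoFox writes for the `test` prefix is an assumption that should be checked against the actual output folder. Values are compared as exact strings, so small floating-point differences between tool versions would be reported as mismatches.

```python
import csv


def read_tsv(path):
    """Read a tab-separated file into a list of row dictionaries."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def diff_tsv(actual_path, expected_path):
    """Return human-readable differences between two TSV files."""
    actual, expected = read_tsv(actual_path), read_tsv(expected_path)
    differences = []
    if len(actual) != len(expected):
        differences.append(f"row count: {len(actual)} != {len(expected)}")
    # Compare rows pairwise; rows beyond the shorter file are covered by the count check
    for index, (row_a, row_e) in enumerate(zip(actual, expected)):
        for column in sorted(set(row_a) | set(row_e)):
            if row_a.get(column) != row_e.get(column):
                differences.append(
                    f"row {index}, column {column!r}: "
                    f"{row_a.get(column)!r} != {row_e.get(column)!r}"
                )
    return differences


# Example usage (both paths are placeholders):
# for line in diff_tsv("/path/to/outputfolder/<your output>.tsv",
#                      "/path/to/test_neoantigen_candidates_annotated.tsv"):
#     print(line)
```

An empty list means the two files agree on every cell; rows are matched by position, so the check assumes both files list the candidates in the same order.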