Merge branch 'develop' into 'master'
Release v1.2.0

See merge request tron/addannot!281
franla23 committed Jul 11, 2024
2 parents a1512f3 + 82522b9 commit c0b5282
Showing 76 changed files with 2,118 additions and 1,137 deletions.
7 changes: 4 additions & 3 deletions .github/workflows/release.yml
@@ -14,11 +14,12 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel twine
pip install poetry==1.8.2 twine
- name: Build and publish
env:
TWINE_USERNAME: __token__
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: |
python setup.py sdist bdist_wheel
twine upload dist/*
poetry install
poetry build
twine upload dist/*.whl
8 changes: 5 additions & 3 deletions .gitlab-ci.yml
@@ -18,7 +18,7 @@
## along with this program. If not, see <http://www.gnu.org/licenses/>.##

# this image contains multiple Python interpreters
image: python:3.8.14-buster
image: python:3.11.9-bookworm

# Change pip's cache directory to be inside the project directory since we can
# only cache local items.
@@ -37,6 +37,7 @@ before_script:
- apt-get update
- apt-get --assume-yes install gcc gfortran build-essential wget libfreetype6-dev libpng-dev libopenblas-dev
- python -V
- pip install poetry==1.8.2

stages:
- validation
@@ -48,7 +49,7 @@ check_version_changes:
script:
# if the version number does not change between this branch and develop it fails
- git fetch origin develop
- if git diff origin/develop -- neofox/__init__.py | grep VERSION; then exit 0; else echo "Version needs to be increased!"; exit -1; fi
- if git diff origin/develop -- pyproject.toml | grep version; then exit 0; else echo "Version needs to be increased!"; exit -1; fi
except:
- develop
- master
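
Note on the check above: `grep version` passes on any line of the diff output that contains the word, including unchanged context lines, so it is a loose test. A stricter comparison could look roughly like the sketch below. It assumes the version is declared on its own line as `version = "..."` in `pyproject.toml` (the usual Poetry layout, not confirmed by this diff); the function names and the sample version strings are hypothetical.

```python
import re

# Assumption: the package version sits on its own line, e.g. version = "x.y.z".
VERSION_LINE = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)


def declared_version(pyproject_text: str) -> str:
    """Return the first top-level version string found in a pyproject.toml text."""
    match = VERSION_LINE.search(pyproject_text)
    if match is None:
        raise ValueError("no version line found")
    return match.group(1)


def version_differs(base_text: str, head_text: str) -> bool:
    """True when the declared version differs between the two file contents."""
    return declared_version(base_text) != declared_version(head_text)


# Made-up file contents for illustration only.
base = '[tool.poetry]\nname = "example"\nversion = "0.1.0"\n'
head = '[tool.poetry]\nname = "example"\nversion = "0.2.0"\n'
print(version_differs(base, head))
```

This only checks that the two strings differ; it does not verify that the new version is higher than the old one.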
@@ -63,7 +64,8 @@ publish_package:
stage: deploy
script:
- pip install twine
- python3 setup.py sdist bdist_wheel
- poetry install
- poetry build
- TWINE_PASSWORD=${CI_JOB_TOKEN} TWINE_USERNAME=gitlab-ci-token python -m twine upload --repository-url https://gitlab.rlp.net/api/v4/projects/${CI_PROJECT_ID}/packages/pypi dist/*
only:
# deploys in private gitlab package repository only the develop branch, the master branch is published in PyPI
12 changes: 0 additions & 12 deletions MANIFEST.in

This file was deleted.

47 changes: 28 additions & 19 deletions README.md
@@ -8,7 +8,7 @@

NeoFox annotates neoantigen candidate sequences with published neoantigen features.

For a detailed documentation, please check out [https://neofox.readthedocs.io](https://neofox.readthedocs.io/)
**For a detailed documentation, please check out [https://neofox.readthedocs.io](https://neofox.readthedocs.io/)**

If you use NeoFox, please cite the following publication:
Franziska Lang, Pablo Riesgo-Ferreiro, Martin Löwer, Ugur Sahin, Barbara Schrörs, **NeoFox: annotating neoantigen candidates with neoantigen features**, Bioinformatics, Volume 37, Issue 22, 15 November 2021, Pages 4246–4247, https://doi.org/10.1093/bioinformatics/btab344
@@ -28,7 +28,7 @@ NeoFox covers the following neoantigen features and prediction algorithms:
| Name | Reference | DOI |
|---------------------------------------------------------|--------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|
| MHC I binding affinity/rank score (netMHCpan-v4.1) | Reynisson et al, 2020, Nucleic Acids Research | https://doi.org/10.4049/jimmunol.1700893 |
| MHC II binding affinity/rank score (netMHCIIpan-v4.0) | Reynisson et al, 2020, Nucleic Acids Research | https://doi.org/10.1111/imm.12889 |
| MHC II binding affinity/rank score (netMHCIIpan-v4.3) | Nilsson et al, 2023, Science Adv. | https://doi.org/10.1126/sciadv.adj6367 |
| MixMHCpred score v2.2 § | Bassani-Sternberg et al., 2017, PLoS Comp Bio; Gfeller, 2018, J Immunol. | https://doi.org/10.1371/journal.pcbi.1005725 , https://doi.org/10.4049/jimmunol.1800914 |
| MixMHC2pred score v2.0.2 § | Racle et al, 2019, Nat. Biotech. 2019 | https://doi.org/10.1038/s41587-019-0289-6 |
| Differential Agretopicity Index (DAI) | Duan et al, 2014, JEM; Ghorani et al., 2018, Ann Oncol. | https://doi.org/10.1084/jem.20141308 |
@@ -41,7 +41,6 @@ NeoFox covers the following neoantigen features and prediction algorithms:
| Recognition potential § | Łuksza et al, 2017, Nature; Balachandran et al, 2017, Nature | https://doi.org/10.1038/nature24473 , https://doi.org/10.1038/nature24462 |
| Vaxrank | Rubinsteyn, 2017, Front Immunol | https://doi.org/10.3389/fimmu.2017.01807 |
| Priority score | Bjerregaard et al, 2017, Cancer Immunol Immunother. | https://doi.org/10.1007/s00262-017-2001-3 |
| Tcell predictor | Besser et al, 2019, Journal for ImmunoTherapy of Cancer | https://doi.org/10.1186/s40425-019-0595-z |
| PRIME § | Schmidt et al., 2021, Cell Reports Medicine | https://doi.org/10.1016/j.xcrm.2021.100194 |
| HEX § | Chiaro et al., 2021, Cancer Immunology Research | https://doi.org/10.1158/2326-6066.CIR-20-0814 |

@@ -52,10 +51,9 @@ NeoFox covers the following neoantigen features and prediction algorithms:
NeoFox depends on the following tools:

- Python >=3.7, <=3.8
- R 3.6.0
- BLAST 2.10.1
- netMHCpan 4.1
- netMHCIIpan 4.0
- netMHCIIpan 4.3
- MixMHCpred 2.2 (optional)
- MixMHC2pred 2.0.2 (optional)
- PRIME 2.0 (optional)
@@ -76,30 +74,41 @@ conda install bioconda::neofox
NeoFox can be used from the command line as shown below or programmatically (see [https://neofox.readthedocs.io](https://neofox.readthedocs.io/) for more information).

````commandline
neofox --candidate-file/--json-file neoantigens_candidates.tab/neoantigens_candidates.json --patient-data/--patient-data-json patient_data.txt/patient_data.json --output-folder /path/to/out --output-prefix out_prefix [--patient-id] [--with-table] [--with-json] [--num-cpus] [--affinity-threshold]
neofox --input-file neoantigens_candidates.tsv \
--patient-data patient_data.txt \
--output-folder /path/to/out \
[--output-prefix out_prefix] \
[--organism human|mouse] \
[--rank-mhci-threshold 2.0] \
[--rank-mhcii-threshold 5.0] \
[--num-cpus] \
[--config] \
[--patient-id] \
[--with-all-neoepitopes] \
[--verbose]
````
- `--candidate-file`: tab-separated values table with neoantigen candidates represented by long mutated peptide sequences as described [here](#41-neoantigen-candidates-in-tabular-format)
- `--json-file`: JSON file neoantigens in NeoFox model format as described [here](#42-neoantigen-candidates-in-json-format)
- `--patient-id`: patient identifier (*optional*, this will be used if the column `patientIdentifier` is missing in the candidate input file)
- `--patient-data`: a table of tab separated values containing metadata on the patient as described [here](#43-patient-data-format)
- `--input-file`: tab-separated values table with neoantigen candidates represented by long mutated peptide sequences
as described [here](03_01_input_data.md#tabular-file-format) (extensions .txt and .tsv) or JSON file neoantigens in
NeoFox model format as described [here](03_01_input_data.md#json-file-format) (extension .json)
- `--patient-data`: a table of tab separated values containing metadata on the patient as described [here](03_01_input_data.md#file-with-patient-information)
- `--output-folder`: path to the folder to which the output files should be written
- `--output-prefix`: prefix for the output files (*optional*)
- `--with-table`: output file in tab-separated format (*default*)
- `--with-json`: output file in JSON format (*optional*)
- `--with-all-neoepitopes`: output annotations for all MHC-I and MHC-II neoepitopes on all HLA alleles (*optional*)
- `--rank-mhci-threshold`: MHC-I epitopes with a netMHCpan predicted rank greater than or equal to this threshold will be filtered out (*optional*)
- `--rank-mhcii-threshold`: MHC-II epitopes with a netMHCIIpan predicted rank greater than or equal to this threshold will be filtered out (*optional*)
- `--organism`: the organism to which the data corresponds. Possible values: [human, mouse]. Default value: human
- `--num-cpus`: number of CPUs to use (*optional*)
- `--config`: a config file with the paths to dependencies as shown below (*optional*)
- `--organism`: the organism to which the data corresponds. Possible values: [human, mouse]. Default value: human
- `--affinity-threshold`: a affinity value (*optional*) neoantigen candidates with a best predicted affinity greater than or equal than this threshold will be not annotated with features that specifically model
neoepitope recognition. A threshold that is commonly used is 500 nM.
- `--patient-id`: patient identifier (*optional*, this is only relevant if the column `patientIdentifier` is missing in the candidate input file)
- `--verbose`: get detailed logs
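
For scripted runs, the options listed above can be assembled into an argument list before the tool is called. The following is a minimal sketch, not part of NeoFox: `build_neofox_command` is a hypothetical helper, it only uses flags whose form is shown in the usage text, and the file names are the placeholders from that usage text.

```python
import shlex


def build_neofox_command(
    input_file,
    patient_data,
    output_folder,
    output_prefix=None,
    organism=None,
    rank_mhci_threshold=None,
    rank_mhcii_threshold=None,
    with_all_neoepitopes=False,
    verbose=False,
):
    """Assemble a neofox argument list (hypothetical helper, not a NeoFox API)."""
    command = [
        "neofox",
        "--input-file", input_file,
        "--patient-data", patient_data,
        "--output-folder", output_folder,
    ]
    # Options that take a value are only added when one is given.
    optional_values = {
        "--output-prefix": output_prefix,
        "--organism": organism,
        "--rank-mhci-threshold": rank_mhci_threshold,
        "--rank-mhcii-threshold": rank_mhcii_threshold,
    }
    for flag, value in optional_values.items():
        if value is not None:
            command += [flag, str(value)]
    # Boolean switches carry no value.
    if with_all_neoepitopes:
        command.append("--with-all-neoepitopes")
    if verbose:
        command.append("--verbose")
    return command


command = build_neofox_command(
    "neoantigens_candidates.tsv",
    "patient_data.txt",
    "/path/to/out",
    organism="mouse",
    rank_mhci_threshold=2.0,
    with_all_neoepitopes=True,
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where `neofox` and its dependencies are installed.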


The optional config file with the paths to the dependencies can look like this:
````commandline
NEOFOX_REFERENCE_FOLDER=path/to/reference/folder
NEOFOX_RSCRIPT=`which Rscript`
NEOFOX_BLASTP=path/to/ncbi-blast-2.10.1+/bin/blastp
NEOFOX_NETMHCPAN=path/to/netMHCpan-4.1/netMHCpan
NEOFOX_NETMHC2PAN=path/to/netMHCIIpan-4.0/netMHCIIpan
NEOFOX_NETMHC2PAN=path/to/netMHCIIpan-4.3/netMHCIIpan
NEOFOX_MIXMHCPRED=path/to/MixMHCpred-2.2/MixMHCpred
NEOFOX_MIXMHC2PRED=path/to/MixMHC2pred-2.0.1/MixMHC2pred_unix
NEOFOX_MAKEBLASTDB=path/to/ncbi-blast-2.8.1+/bin/makeblastdb
@@ -158,9 +167,9 @@ where:
- `identifier`: the patient identifier
- `mhcIAlleles`: comma separated MHC I alleles of the patient for HLA-A, HLA-B and HLA-C. If homozygous, the allele should be added twice.
- `mhcIIAlleles`: comma separated MHC II alleles of the patient for HLA-DRB1, HLA-DQA1, HLA-DQB1, HLA-DPA1 and HLA-DPB1. If homozygous, the allele should be added twice.
- `tumorType`: tumour entity in TCGA study abbreviation format (https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations). This field is required for expression imputation and at the moment the following tumor types are supported:
- `tumorType`: tumour entity in TCGA study abbreviation format (https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations). This field is required for expression imputation. The supported tumor types are listed under "Input data" in the [documentation](https://neofox.readthedocs.io/en/latest/03_01_input_data.html).
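
As an illustration of the patient fields described above, the sketch below writes a one-row tab-separated table. It assumes the column headers match the field names in the list; the allele names, the patient identifier and the tumor type are illustrative values only, and whether a given tumor type is supported has to be checked in the documentation.

```python
import csv
import io

# Assumption: column headers equal the field names listed above.
FIELDS = ["identifier", "mhcIAlleles", "mhcIIAlleles", "tumorType"]


def patient_row(identifier, mhc_i_alleles, mhc_ii_alleles, tumor_type):
    """Build one table row; allele lists are joined with commas."""
    return {
        "identifier": identifier,
        "mhcIAlleles": ",".join(mhc_i_alleles),
        "mhcIIAlleles": ",".join(mhc_ii_alleles),
        "tumorType": tumor_type,
    }


def write_patient_table(rows, handle):
    """Write the rows as tab-separated values with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Illustrative patient: homozygous for HLA-A, so that allele is listed twice.
row = patient_row(
    "patient_1",
    ["HLA-A*01:01", "HLA-A*01:01", "HLA-B*07:02", "HLA-B*08:01", "HLA-C*07:01", "HLA-C*07:02"],
    ["HLA-DRB1*01:01", "HLA-DRB1*15:01"],
    "SKCM",
)
buffer = io.StringIO()
write_patient_table([row], buffer)
print(buffer.getvalue())
```

In practice the handle would be a file opened with `open("patient_data.txt", "w", newline="")` instead of an in-memory buffer.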


## 5 Output data

The output data is returned by default in a short wide tab separated values file (`--with-table`). Optionally, it can be provided in JSON format (`--with-json`).
The output data is returned by default in tsv and json format. With the command line flag `--with-all-neoepitopes`, two additional files are generated containing the epitope candidates for MHCI and MHCII with NetMHCpan predictions below the given thresholds.
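
The threshold rule stated for `--rank-mhci-threshold` and `--rank-mhcii-threshold` (ranks greater than or equal to the threshold are filtered out) can be sketched as follows. This is only an illustration of the boundary behavior, not NeoFox code; the dictionary keys and peptide strings are made up.

```python
def keep_epitope(rank, threshold):
    """An epitope is kept only when its predicted rank is strictly below the threshold."""
    return rank < threshold


def filter_epitopes(epitopes, threshold):
    """Return the epitopes that survive the rank filter."""
    return [epitope for epitope in epitopes if keep_epitope(epitope["rank"], threshold)]


# Placeholder candidates; "peptide" and "rank" are hypothetical keys.
candidates = [
    {"peptide": "AAAAAAAAA", "rank": 0.5},
    {"peptide": "CCCCCCCCC", "rank": 2.0},
    {"peptide": "DDDDDDDDD", "rank": 7.3},
]
kept = filter_epitopes(candidates, threshold=2.0)
print([epitope["peptide"] for epitope in kept])
```

Note that a rank exactly equal to the threshold is dropped, which follows from the "greater than or equal" wording.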
3 changes: 2 additions & 1 deletion docs/requirements.txt
@@ -8,4 +8,5 @@ ipykernel==6.4.1
#insipid_sphinx_theme==0.2.1
pydata-sphinx-theme==0.11.0
Jinja2==3.1.2
markupsafe==2.1.1
markupsafe==2.1.1
lxml-html-clean==0.1.1
7 changes: 3 additions & 4 deletions docs/source/01_overview.md
@@ -22,14 +22,14 @@ in the last years.
NeoFox supports annotation of neoantigen candidates derived from SNVs (single nucleotide variant) and alternative mutation classes such as INDELs or fusion genes. Furthermore, NeoFox supports both human and mouse derived neoantigen candidates.

NeoFox covers neoepitope prediction by MHC binding and ligand prediction, similarity/foreignness of a neoepitope candidate sequence, combinatorial features and machine learning approaches.
A list of implemented features and their references are given in Table 1. Please not that some features are currently not available for mouse.
A list of implemented features and their references are given in Table 1. Please note that some features are currently not available for mouse.

**Table 1**: Neoantigen features and prioritization algorithms (*§ currently not supported for mouse*)

| Name | Reference | DOI |
|---------------------------------------------------------|--------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|
| MHC I binding affinity/rank score (netMHCpan-v4.1) | Reynisson et al., 2020, Nucleic Acids Res. | https://doi.org/10.1093/nar/gkaa379 |
| MHC II binding affinity/rank score (netMHCIIpan-v4.0) | Reynisson et al., 2020, Nucleic Acids Res. | https://doi.org/10.1093/nar/gkaa379 |
| MHC II binding affinity/rank score (netMHCIIpan-v4.3) | Nilsson et al., 2023, Science Adv. | https://doi.org/10.1126/sciadv.adj6367 |
| MixMHCpred score v2.2 § | Bassani-Sternberg et al., 2017, PLoS Comp Bio; Gfeller, 2018, J Immunol. | https://doi.org/10.1371/journal.pcbi.1005725 , https://doi.org/10.4049/jimmunol.1800914 |
| MixMHC2pred score v2.0.2 § | Racle et al., 2019, Nat. Biotech. 2019 | https://doi.org/10.1038/s41587-019-0289-6 |
| Differential Agretopicity Index (DAI) | Duan et al., 2014, JEM; Ghorani et al., 2018, Ann Oncol. | https://doi.org/10.1084/jem.20141308 |
@@ -42,7 +42,6 @@ A list of implemented features and their references are given in Table 1. Please
| Recognition potential § | Łuksza et al., 2017, Nature; Balachandran et al, 2017, Nature | https://doi.org/10.1038/nature24473 , https://doi.org/10.1038/nature24462 |
| Vaxrank | Rubinsteyn, 2017, Front Immunol | https://doi.org/10.3389/fimmu.2017.01807 |
| Priority score | Bjerregaard et al., 2017, Cancer Immunol Immunother. | https://doi.org/10.1007/s00262-017-2001-3 |
| Tcell predictor | Besser et al., 2019, Journal for ImmunoTherapy of Cancer | https://doi.org/10.1186/s40425-019-0595-z |
| PRIME v2.0 § | Schmidt et al., 2021, Cell Reports Medicine | https://doi.org/10.1016/j.xcrm.2021.100194 |
| HEX § | Chiaro et al., 2021, Cancer Immunology Research | https://doi.org/10.1158/2326-6066.CIR-20-0814 |

@@ -67,4 +66,4 @@ Happy annotation and modelling!
For questions, please contact Franziska Lang ([[email protected]](mailto:[email protected])) or Pablo Riesgo Ferreiro ([[email protected]](mailto:[email protected])).

## How to cite
Franziska Lang, Pablo Riesgo-Ferreiro, Martin Löwer, Ugur Sahin, Barbara Schrörs, **NeoFox: annotating neoantigen candidates with neoantigen features**, Bioinformatics, Volume 37, Issue 22, 15 November 2021, Pages 4246–4247, [https://doi.org/10.1093/bioinformatics/btab344](https://doi.org/10.1093/bioinformatics/btab344)
Franziska Lang, Pablo Riesgo-Ferreiro, Martin Löwer, Ugur Sahin, Barbara Schrörs, **NeoFox: annotating neoantigen candidates with neoantigen features**, Bioinformatics, Volume 37, Issue 22, 15 November 2021, Pages 4246–4247, [https://doi.org/10.1093/bioinformatics/btab344](https://doi.org/10.1093/bioinformatics/btab344)