Making virus base config from demo configs
- We already provided demo config files for several viruses
- They can now be included as a virus base config (`virus_base_config`)
DrYak committed Jun 7, 2024
1 parent d840e80 commit fdf00be
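
As a rough illustration of what the commit message describes, a user configuration might select one of the former demo configs as its virus base config. This is only a sketch: the `virus_base_config` key under `general` and the `input` keys appear in the files changed below, but the file name `config/config.yaml` and the assumption that the value is the YAML file's base name (here `rsvb`) are not confirmed by this commit.

```yaml
# config/config.yaml (hypothetical user configuration, not part of this commit)
general:
  # Assumption: the value is the base name of a file in config/, e.g. rsvb.yaml
  virus_base_config: rsvb

input:
  # Project-specific settings stay in the user's own file
  datadir: samples/
  samples_file: samples.tsv
```

Settings such as the reference sequence and aligner would then presumably come from the selected base config, which is why this commit strips the project-specific keys from the demo files.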
Showing 6 changed files with 24 additions and 106 deletions.
6 changes: 6 additions & 0 deletions config/README.md
@@ -52,6 +52,12 @@ Currently, the following _virus base config_ are available:
- [hiv](hiv.yaml): provides HXB2 as a reference sequence for HIV, and sets the default aligner to _ngshmmalign_.
- [sars-cov-2](sars-cov-2.yaml): provides NC\_045512.2 as a reference sequence for SARS-CoV-2, sets the default aligner to _bwa_ and sets the variant calling to be done against the reference instead of the cohort's consensus.
In addition, a look-up for the recent versions of ARTIC protocol is provided; this makes it possible to set per-sample protocol in the sample table, and to turn on amplicon trimming (see [amplicon protocols](#amplicon-protocols)).
- [rsvb](rsvb.yaml): config file for human respiratory syncytial virus B (RSV-B), used to process Illumina RSV samples.
- [h3n2_ha](h3n2_ha.yaml): config file used for the analysis of H3N2 segment HA from wastewater Illumina data, available on SRA through the accession [SRP385331](https://www.ebi.ac.uk/ena/browser/view/PRJNA856656).
- [drosophila_c_virus](drosophila_c_virus.yaml): config file used for the analysis of Drosophila C virus (DCV) Illumina samples in Lezcano et al., _Virus Evolution_, 2023, doi:[10.1093/ve/vead074](https://doi.org/10.1093/ve/vead074); NCBI BioProject accession number [PRJNA993483](https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA993483).
- [herpes_simplex_virus_2](herpes_simplex_virus_2.yaml): config file used for the analysis of herpes simplex virus 2 (HSV-2) Illumina samples in Lezcano et al., _Virus Evolution_, 2023, doi:[10.1093/ve/vead074](https://doi.org/10.1093/ve/vead074). The analysed sample is from López-Muñoz AD, Rastrojo A, Kropp KA, Viejo-Borbolla A, Alcamí A. "Combination of long- and short-read sequencing fully resolves complex repeats of herpes simplex virus 2 strain MS complete genome". _Microb Genom._ 2021 Jun;7(6). Sample accession number: [ERR3278849](https://www.ebi.ac.uk/ena/browser/view/ERR3278849).
- [polio](polio.yaml): config file used for the analysis of poliovirus MinION samples. Sample accession number: [ERR4027774](https://www.ebi.ac.uk/ena/browser/view/ERR4027774) (Shaw et al., 2020, doi:[10.1128/jcm.00920-20](https://doi.org/10.1128/jcm.00920-20)).
- [mpxv](mpxv.yaml): monkeypox virus.
### configuration manual
19 changes: 5 additions & 14 deletions config/drosophila_c_virus.yaml
@@ -1,3 +1,5 @@
---
name: Drosophila C virus
# config-file used for the analysis of Drosophila C virus Illumina samples
# Lezcano et al., Virus Evolution, 2023, https://doi.org/10.1093/ve/vead074
# NCBI BioProject accession number PRJNA993483
@@ -8,25 +10,14 @@ general:

input:
  reference: resources/drosphila_c_virus/NC_001834.1.fasta
  datadir: resources/samples/
  read_length: 100
  samples_file: samples.tsv
  paired: true

consensus_bcftools:
  max_coverage: 150000

lofreq:
  consensus: false

snv:
  consensus: false
  disk_mb: 1250
  mem_mb: 35000
  time_min: 6000
  threads: 64

output:
  snv: true
  local: true
  global: false
  visualization: false
  diversity: false
  QA: false
56 changes: 2 additions & 54 deletions config/h3n2_ha.yaml
@@ -1,75 +1,23 @@
---
name: Influenza A virus subtype H3N2
# config-file used for the analysis of H3N2 segment HA
# config file was used to analyse wastewater Illumina data from SRA, available
# through the SRA accession: SRP385331

general:
  aligner: bwa
  primers_trimmer: samtools
  threads: 6
  snv_caller: lofreq
  temp_prefix: ./temp
  preprocessor: skip

input:
  datadir: samples/
  samples_file: samples.tsv
  reference: "{VPIPE_BASEDIR}/../resources/h3n2_ha/h3n2_ha.fasta"
  genes_gff: "{VPIPE_BASEDIR}/../resources/h3n2_ha/gffs/h3n2_ha.gff3"
  paired: true
  read_length: 151

output:
  datadir: results/
  snv: True
  local: True
  global: False
  visualization: False
  QA: False
  diversity: False

gunzip:
  mem: 100000

extract:
  mem: 100000

preprocessing:
  mem: 10000

sam2bam:
  mem: 5000

ref_bwa_index:
  mem: 65536

bwa_align:
  mem: 40690
  threads: 8

bowtie_align:
  mem: 12288
  threads: 6

coverage:
  mem: 131072
  threads: 32
  time: 60

minor_variants:
  mem: 16384
  threads: 64

coverage_intervals:
  coverage: 0
  mem: 2000
  threads: 1

lofreq:
  consensus: false

snv:
  consensus: false
  localscratch: $TMPDIR
  time: 240
  mem: 1024
  threads: 64
15 changes: 2 additions & 13 deletions config/herpes_simplex_virus_2.yaml
@@ -1,3 +1,5 @@
---
name: Herpes simplex Virus 2
# config-file used for the analysis of herpes_simplex_virus_2 Illumina samples
# Lezcano et al., Virus Evolution, 2023, https://doi.org/10.1093/ve/vead074
# deletion analysis
@@ -12,23 +14,10 @@ input:
  reference: resources/herpes_simplex_virus_2/MK855052.1.fasta
  datadir: resources/samples/
  read_length: 250
  samples_file: samples.tsv
  paired: true

consensus_bcftools:
  max_coverage: 150000

snv:
  consensus: false
  disk_mb: 1250
  mem_mb: 35000
  time_min: 6000
  threads: 64

output:
  snv: true
  local: true
  global: false
  visualization: false
  diversity: false
  QA: false
14 changes: 2 additions & 12 deletions config/polio.yaml
@@ -1,25 +1,15 @@
---
name: Poliovirus (MinION)
# config-file used for the analysis of poliovirus MinION samples
# sample accession number: ERR4027774 (Shaw et al., 2020, DOI: https://doi.org/10.1128/jcm.00920-20)

general:
  virus_base_config: ""
  aligner: minimap
  preprocessor: skip

input:
  reference: resources/polio/AY560657.1.fasta
  datadir: resources/samples/
  samples_file: config/samples.tsv
  paired: false

output:
  trim_primers: false
  snv: false
  local: false
  global: false
  visualization: false
  QA: false
  diversity: false

minimap_align:
  preset: map-ont
20 changes: 7 additions & 13 deletions config/rsvb.yaml
@@ -1,27 +1,21 @@
---
name: Respiratory Syncytial Virus B
# config file for Human respiratory syncytial virus B
# config file is used to process Illumina RSV samples

general:
  virus_base_config: ""
  preprocessor: "prinseq"
  aligner: "bwa"
  primers_trimmer: "samtools"

input:
  datadir: "samples/"
  samples_file: "samples.tsv"
  read_length: 251
  reference: "{VPIPE_BASEDIR}/../resources/rsvb/MT107528.1.fasta"
  primers_bedfile: "{VPIPE_BASEDIR}/../resources/rsvb/RSVB_primers_400_V2.1.bed"
  inserts_bedfile: "{VPIPE_BASEDIR}/../resources/rsvb/RSVB_inserts_400_V2.1.bed"
  reference: "{VPIPE_BASEDIR}/../resources/rsvb/MT107528.1.fasta"
  read_length: 251

output:
  datadir: "results"
  trim_primers: true
  snv: false
  local: false
  global: false
snv:
  consensus: false
lofreq:
  consensus: false

snv:
  consensus: false
