[hicap] expose optional parameters (#724)
* Enable optional parameters in hicap

* Remove "--full_sequence" parameter and other comments that are no longer needed.

* Update md5sum for wf_merlin_magic in illumina PE and SE test workflows

* Updated documentation to reflect additional optional parameters in the merlin_magic workflow

* Update md5sum for wf_merlin_magic in test workflows
MrTheronJ authored Jan 17, 2025
1 parent 2a8c304 commit 204aff5
Showing 5 changed files with 27 additions and 22 deletions.
4 changes: 4 additions & 0 deletions docs/workflows/genomic_characterization/theiaprok.md
@@ -330,7 +330,11 @@ All input reads are processed through "[core tasks](#core-tasks-performed-for-al
  | merlin_magic | **ectyper_verify** | Boolean | Set to true to enable E. coli species verification | False | Optional | FASTA, ONT, PE, SE |
  | merlin_magic | **emmtypingtool_docker_image** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/emmtypingtool:0.0.1 | Optional | FASTA, ONT, PE, SE |
  | merlin_magic | **genotyphi_docker_image** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/mykrobe:0.11.0 | Optional | FASTA, ONT, PE, SE |
+ | merlin_magic | **hicap_broken_gene_identity** | Float | Minimum percentage identity to consider a broken gene | 0.80 | Optional | FASTA, ONT, PE, SE |
+ | merlin_magic | **hicap_broken_gene_length** | Int | Minimum length to consider a broken gene | 60 | Optional | FASTA, ONT, PE, SE |
  | merlin_magic | **hicap_docker_image** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/biocontainers/hicap:1.0.3--py_0 | Optional | FASTA, ONT, PE, SE |
+ | merlin_magic | **hicap_gene_coverage** | Float | Minimum percentage coverage to consider a single gene complete | 0.80 | Optional | FASTA, ONT, PE, SE |
+ | merlin_magic | **hicap_gene_identity** | Float | Minimum percentage identity to consider a single gene complete | 0.70 | Optional | FASTA, ONT, PE, SE |
  | merlin_magic | **kaptive_docker_image** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/kaptive:2.0.3 | Optional | FASTA, ONT, PE, SE |
  | merlin_magic | **kaptive_low_gene_id** | Float | Percent identity threshold for what counts as a low identity match in the gene BLAST search | 95 | Optional | FASTA, ONT, PE, SE |
  | merlin_magic | **kaptive_min_coverage** | Float | Minimum required percent identity for the gene BLAST search via tBLASTn | 80 | Optional | FASTA, ONT, PE, SE |
30 changes: 11 additions & 19 deletions tasks/species_typing/haemophilus/task_hicap.wdl
@@ -7,27 +7,14 @@ task hicap {
  input {
  File assembly
  String samplename
+ Float gene_coverage = 0.80 #Minimum percentage coverage to consider a single gene complete. [default: 0.80]
+ Float gene_identity = 0.70 #Minimum percentage identity to consider a single gene complete. [default: 0.70]
+ Float broken_gene_identity = 0.80 #Minimum percentage identity to consider a broken gene. [default: 0.80]
+ Int broken_gene_length = 60 #Minimum length to consider a broken gene. [default: 60]
  String docker = "us-docker.pkg.dev/general-theiagen/biocontainers/hicap:1.0.3--py_0"
  Int cpu = 2
  Int memory = 8
  Int disk_size = 50
-
- #Parameters
- #-q QUERY_FP, --query_fp QUERY_FP Input FASTA query
- #-o OUTPUT_DIR, --output_dir OUTPUT_DIR Output directory
- #-d DATABASE_DIR, --database_dir DATABASE_DIR Directory containing locus database. [default: /usr/local/lib/python3.6/site-
- # packages/hicap/database]
- #-m MODEL_FP, --model_fp MODEL_FP Path to prodigal model. [default: /usr/local/lib/python3.6/site-
- # packages/hicap/model/prodigal_hi.bin]
- #-s, --full_sequence Write the full input sequence out to the genbank file rather than just the region
- # surrounding and including the locus
- #--gene_coverage GENE_COVERAGE Minimum percentage coverage to consider a single gene complete. [default: 0.80]
- #--gene_identity GENE_IDENTITY Minimum percentage identity to consider a single gene complete. [default: 0.70]
- #--broken_gene_length BROKEN_GENE_LENGTH Minimum length to consider a broken gene. [default: 60]
- #--broken_gene_identity BROKEN_GENE_IDENTITY Minimum percentage identity to consider a broken gene. [default: 0.80]
- #--threads THREADS Threads to use for BLAST+. [default: 1]
- #--log_fp LOG_FP Record logging messages to file
- #--debug Print debug messages
  }
  command <<<
  echo $(hicap --version 2>&1) | sed 's/^hicap //' | tee VERSION
@@ -36,8 +23,13 @@ task hicap {

  hicap \
  -q ~{assembly} \
- -o output_dir
-
+ -o output_dir \
+ --gene_coverage ~{gene_coverage} \
+ --gene_identity ~{gene_identity} \
+ --broken_gene_length ~{broken_gene_length} \
+ --broken_gene_identity ~{broken_gene_identity} \
+ --threads ~{cpu}
+
  filename=$(basename ~{assembly})

  # if there are no hits for a cap locus, no file is produced
@@ -516,7 +516,7 @@
  - path: miniwdl_run/wdl/workflows/theiaprok/wf_theiaprok_illumina_pe.wdl
  md5sum: 9b8e2da62c8572a369c786a9bbc3a36e
  - path: miniwdl_run/wdl/workflows/utilities/wf_merlin_magic.wdl
- md5sum: 3547638e39dcf408234a6c1df57d198d
+ md5sum: 1bcb64a5e28d26e3603f456e99bfa7b6
  - path: miniwdl_run/wdl/workflows/utilities/wf_read_QC_trim_pe.wdl
  contains: ["version", "QC", "output"]
  - path: miniwdl_run/workflow.log
@@ -487,7 +487,7 @@
  - path: miniwdl_run/wdl/workflows/theiaprok/wf_theiaprok_illumina_se.wdl
  md5sum: 02dc0075bf28d557d7b81aa2dc61feab
  - path: miniwdl_run/wdl/workflows/utilities/wf_merlin_magic.wdl
- md5sum: 3547638e39dcf408234a6c1df57d198d
+ md5sum: 1bcb64a5e28d26e3603f456e99bfa7b6
  - path: miniwdl_run/wdl/workflows/utilities/wf_read_QC_trim_se.wdl
  md5sum: 09d9f68b9ca8bf94b6145ff9bed2edd1
  - path: miniwdl_run/workflow.log
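
Note on the md5sum updates: the expected checksum for wf_merlin_magic.wdl changes in both test files because the workflow file itself was edited in this commit. A minimal sketch of recomputing such a checksum locally is below. The helper name `file_md5` and the chunk size are illustrative assumptions, and it also assumes the copy under miniwdl_run/wdl/ is byte-identical to the file in the repository.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python md5_check.py workflows/utilities/wf_merlin_magic.wdl
    for target in sys.argv[1:]:
        print(f"{file_md5(target)}  {target}")
```

The printed digest is what would go into the `md5sum:` field next to the matching `path:` entry.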
11 changes: 10 additions & 1 deletion workflows/utilities/wf_merlin_magic.wdl
@@ -131,6 +131,11 @@ workflow merlin_magic {
  Int? emmtyper_min_perfect
  Int? emmtyper_min_good
  Int? emmtyper_max_size
+ #hicap options
+ Float? hicap_gene_coverage
+ Float? hicap_gene_identity
+ Float? hicap_broken_gene_identity
+ Int? hicap_broken_gene_length
  # kaptive options
  Int? kaptive_start_end_margin
  Float? kaptive_min_identity
@@ -600,7 +605,11 @@ workflow merlin_magic {
  input:
  assembly = assembly,
  samplename = samplename,
- docker = hicap_docker_image
+ docker = hicap_docker_image,
+ gene_coverage = hicap_gene_coverage,
+ gene_identity = hicap_gene_identity,
+ broken_gene_identity = hicap_broken_gene_identity,
+ broken_gene_length = hicap_broken_gene_length
  }
  }
  if (merlin_tag == "Vibrio" || merlin_tag == "Vibrio cholerae") {
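
Note on how the new inputs flow through: the workflow declares the four hicap thresholds as optional (`Float?` / `Int?`) and passes them to the task, which carries the defaults. As I understand WDL's handling of optional values, an unset workflow input leaves the task default in place. The Python sketch below mirrors that resolution and the resulting command line. The function name `build_hicap_command` and the `HICAP_DEFAULTS` table are illustrative assumptions. The flags and default values are copied from the task in this commit.

```python
from typing import List, Optional

# Defaults copied from the task inputs in task_hicap.wdl.
HICAP_DEFAULTS = {
    "gene_coverage": 0.80,
    "gene_identity": 0.70,
    "broken_gene_identity": 0.80,
    "broken_gene_length": 60,
}


def build_hicap_command(
    assembly: str,
    output_dir: str = "output_dir",
    cpu: int = 2,
    gene_coverage: Optional[float] = None,
    gene_identity: Optional[float] = None,
    broken_gene_identity: Optional[float] = None,
    broken_gene_length: Optional[int] = None,
) -> List[str]:
    """Build the hicap argument list; None means 'use the task default'."""
    overrides = {
        "gene_coverage": gene_coverage,
        "gene_identity": gene_identity,
        "broken_gene_identity": broken_gene_identity,
        "broken_gene_length": broken_gene_length,
    }
    # An unset optional falls back to the default declared by the task.
    resolved = {
        name: HICAP_DEFAULTS[name] if value is None else value
        for name, value in overrides.items()
    }
    # Same flag order as the command block in the task.
    return [
        "hicap",
        "-q", assembly,
        "-o", output_dir,
        "--gene_coverage", str(resolved["gene_coverage"]),
        "--gene_identity", str(resolved["gene_identity"]),
        "--broken_gene_length", str(resolved["broken_gene_length"]),
        "--broken_gene_identity", str(resolved["broken_gene_identity"]),
        "--threads", str(cpu),
    ]


if __name__ == "__main__":
    # Only gene_identity is overridden; the other thresholds keep their defaults.
    print(" ".join(build_hicap_command("sample.fasta", gene_identity=0.9)))
```

Because `--threads` is now tied to `cpu`, raising the task's CPU allocation also raises the thread count passed to hicap, where previously the tool's own default of 1 applied.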