feat: add wrapper for rbt vcf_fix_iupac_alleles #1209

Open
wants to merge 6 commits into
base: master
8 changes: 8 additions & 0 deletions bio/rbt/vcf_fix_iupac_alleles/environment.yaml
@@ -0,0 +1,8 @@
channels:
  - conda-forge
  - bioconda
  - nodefaults
dependencies:
  - rust-bio-tools =0.42.0
  - bcftools =1.17
  - snakemake-wrapper-utils =0.5.3
10 changes: 10 additions & 0 deletions bio/rbt/vcf_fix_iupac_alleles/meta.yaml
@@ -0,0 +1,10 @@
name: rbt vcf-fix-iupac-alleles
description: |
  Convert any IUPAC codes in alleles into Ns (in order to comply with VCF 4 specs).
url: https://github.com/rust-bio/rust-bio-tools
authors:
  - Filipe G. Vieira
input:
  - VCF
output:
  - VCF/BCF
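
As a rough illustration of what the description above means, the sketch below replaces IUPAC ambiguity codes in a single allele string with N. It is not the rbt implementation: the function name `mask_iupac`, the case-preserving behaviour, and the decision to leave symbolic, breakend and `*` alleles untouched are all assumptions made for this example.

```python
# Illustrative sketch only; NOT the rbt implementation.
# IUPAC ambiguity codes, i.e. every IUPAC nucleotide letter except A, C, G, T, N.
IUPAC_AMBIGUITY_CODES = set("RYSWKMBDHV")


def mask_iupac(allele: str) -> str:
    """Replace IUPAC ambiguity codes in a plain sequence allele with N.

    Assumptions (hypothetical, chosen for this example):
    - symbolic alleles ("<DEL>"), breakends (containing "[" or "]") and the
      "*" allele are returned unchanged;
    - the case of each base is preserved ("y" becomes "n").
    """
    if allele.startswith("<") or allele == "*" or "[" in allele or "]" in allele:
        return allele
    return "".join(
        ("n" if base.islower() else "N")
        if base.upper() in IUPAC_AMBIGUITY_CODES
        else base
        for base in allele
    )
```

For instance, `mask_iupac("ARG")` gives `"ANG"`, while an allele that already uses only A, C, G, T or N is returned as is.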
22 changes: 22 additions & 0 deletions bio/rbt/vcf_fix_iupac_alleles/test/Snakefile
@@ -0,0 +1,22 @@

rule vcf_fix_iupac_alleles_vcf:
    input:
        "{file}.vcf.gz",
    output:
        vcf="results/{file}.vcf.gz",
    log:
        "logs/{file}.vcf.log",
    params:
        extra="",
    threads: 1
    resources:
        mem_mb=1024,
    wrapper:
        "master/bio/rbt/vcf_fix_iupac_alleles"


use rule vcf_fix_iupac_alleles_vcf as vcf_fix_iupac_alleles_bcf with:
    output:
        bcf="results/{file}.bcf",
    log:
        "logs/{file}.bcf.log",
Binary file not shown.
18 changes: 18 additions & 0 deletions bio/rbt/vcf_fix_iupac_alleles/wrapper.py
@@ -0,0 +1,18 @@
__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2023, Filipe G. Vieira"
__license__ = "MIT"

import tempfile
from snakemake.shell import shell
from snakemake_wrapper_utils.bcftools import get_bcftools_opts


bcftools_opts = get_bcftools_opts(snakemake, parse_ref=False)
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")


with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "(rbt vcf-fix-iupac-alleles {extra} < {snakemake.input[0]} | bcftools sort --temp-dir {tmpdir} {bcftools_opts}) {log}"
Contributor


Why the sort? rbt vcf-fix-iupac-alleles does not change the order of the records.

Collaborator Author


The main idea was to allow automatic output in any format (bcf, vcf, vcf.gz).
Since I was using it to standardize resources (known variation from Ensembl), I thought that sorting would be a nice-to-have (just to be sure the output was sorted), and the extra runtime would be OK since it would only be run once.

But I can also remove it (or switch to view), if you prefer to keep the wrapper more general.

    )
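
The trade-off raised in the review thread (keep `bcftools sort`, or switch to `bcftools view` and only convert the output format) can be sketched as a small command builder. This is a sketch under stated assumptions, not the wrapper's code: `infer_output_type` and `build_command` are hypothetical helpers written for this example, and the extension-to-format mapping stands in for whatever `get_bcftools_opts` actually derives from the output file name.

```python
import shlex


# Hypothetical helpers for illustration; NOT part of the wrapper or of
# snakemake-wrapper-utils.
def infer_output_type(path: str) -> str:
    """Map an output file name to a bcftools -O (output type) letter."""
    if path.endswith(".bcf"):
        return "b"  # compressed BCF
    if path.endswith(".vcf.gz"):
        return "z"  # compressed VCF
    if path.endswith(".vcf"):
        return "v"  # uncompressed VCF
    raise ValueError(f"cannot infer output format from {path!r}")


def build_command(input_path, output_path, extra="", sort=True, tmpdir="."):
    """Build the shell pipeline: rbt fixes the alleles, bcftools writes the output.

    sort=True  -> bcftools sort (re-sorts records, needs a temp dir)
    sort=False -> bcftools view (format conversion only, keeps record order)
    """
    out_type = infer_output_type(output_path)
    out = shlex.quote(output_path)
    if sort:
        writer = f"bcftools sort --temp-dir {shlex.quote(tmpdir)} -O {out_type} -o {out}"
    else:
        writer = f"bcftools view -O {out_type} -o {out}"
    # Drop `extra` when empty so the command has no stray double space.
    reader = " ".join(part for part in ("rbt vcf-fix-iupac-alleles", extra) if part)
    return f"{reader} < {shlex.quote(input_path)} | {writer}"
```

With `sort=False`, the pipeline only converts the format and keeps the record order produced by `rbt vcf-fix-iupac-alleles`, which matches the reviewer's point that the tool does not reorder records. With `sort=True`, it mirrors the command in the diff at the cost of an extra sorting pass.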
15 changes: 14 additions & 1 deletion test.py
@@ -6121,13 +6121,26 @@ def test_verifybamid2():


 @skip_if_not_modified
-def test_collapse_reads_to_fragments_bam():
+def test_rbt_collapse_reads_to_fragments_bam():
     run(
         "bio/rbt/collapse_reads_to_fragments-bam",
         ["snakemake", "--cores", "1", "--use-conda", "-F"],
     )


+@skip_if_not_modified
+def test_rbt_vcf_fix_iupac_alleles():
+    run(
+        "bio/rbt/vcf_fix_iupac_alleles",
+        ["snakemake", "--cores", "1", "--use-conda", "-F", "results/homo_sapiens-chrMT.vcf.gz"],
+    )
+
+    run(
+        "bio/rbt/vcf_fix_iupac_alleles",
+        ["snakemake", "--cores", "1", "--use-conda", "-F", "results/homo_sapiens-chrMT.bcf"],
+    )
+
+
 @skip_if_not_modified
 def test_gatk_mutect2_calling_meta():
     run(