Merge branch 'release/1.3.3'
rhshah committed Aug 8, 2023
2 parents c8e3cad + 0e52034 commit cf66cd5
Showing 9 changed files with 232 additions and 32 deletions.
File renamed without changes.
@@ -39,7 +39,7 @@ outputs:
 label: general_stats_parse
 requirements:
   - class: DockerRequirement
-    dockerPull: 'ghcr.io/msk-access/cci_utils:0.3.0'
+    dockerPull: 'ghcr.io/msk-access/cci_utils:0.3.1'
   - class: InitialWorkDirRequirement
     listing:
       - entry: $(inputs.directory)
8 changes: 3 additions & 5 deletions docs/SUMMARY.md
@@ -112,16 +112,16 @@
* [vardict_filter_single-sample 0.1.5](postprocessing_variant_calls/vardict_filter_single-sample_0.1.5.md)
* [maf_annotated_by_bed_0.2.2](postprocessing_variant_calls/maf_annotated_by_bed_0.2.2.md)

* [SnpSift](snpsift/README.md)
* [sequence_qc](sequence_qc/README.md)
* [v0.2.4](sequence_qc/sequence_qc_v0.2.24.md)

* [SnpSift](snpsift/README.md)
* [v5.0](snpsift/snpsift_5.0.md)

* [Trim Galore](trim-galore/README.md)

* [v0.6.2](trim-galore/trim_galore_0.6.2.md)

* [Ubuntu utilites](ubuntu-utilites/README.md)

* [v18.04](ubuntu-utilites/utilities_ubuntu_18.04.md)

* [VarDictJava](vardictjava/README.md)
@@ -133,5 +133,3 @@
* [Waltz](waltz/README.md)
* [CountReads v3.1.1](waltz/waltz_count_reads_3.1.1.md)
* [PileupMetrics v3.1.1](waltz/waltz_pileupmatrices_3.1.1.md)


@@ -4,7 +4,7 @@

 | Tool | Version | Location |
 |--- |--- |--- |
-| cci_utils | 0.3.0 | <https://github.com/msk-access/cci_utils> |
+| cci_utils | 0.3.1 | <https://github.com/msk-access/cci_utils> |

 ## CWL

2 changes: 2 additions & 0 deletions docs/sequence_qc/README.md
@@ -0,0 +1,2 @@
# sequence_qc

61 changes: 61 additions & 0 deletions docs/sequence_qc/sequence_qc_v0.2.24.md
@@ -0,0 +1,61 @@
# CWL and Dockerfile for running sequence_qc

## Version of tools in docker image (/container/Dockerfile)

| Tool | Version | Location |
|--- |--- |--- |
| sequence_qc | 0.2.24 | <https://github.org/msk-access/sequence_qc/> |

## CWL

- CWL specification 1.0
- Use example_inputs.yaml to see the inputs to the cwl
- Example Command using [toil](https://toil.readthedocs.io):

```bash
> toil-cwl-runner sequence_qc_0.2.24.cwl example_inputs.yaml
```

**If at MSK, using the JUNO cluster having installed toil version 3.19 and manually modifying [lsf.py](https://github.com/DataBiosphere/toil/blob/releases/3.19.0/src/toil/batchSystems/lsf.py#L170) by removing `type==X86_64 &&` you can use the following command**

```bash
#Using CWLTOOL
> cwltool --singularity --non-strict /path/to/sequence_qc/0.2.24/sequence_qc_0.2.24.cwl /path/to/inputs.yaml

#Using toil-cwl-runner
> mkdir tool_toil_log
> toil-cwl-runner --singularity --logFile /path/to/tool_toil_log/cwltoil.log --jobStore /path/to/tool_jobStore --batchSystem lsf --workDir /path/to/tool_toil_log --outdir . --writeLogs /path/to/tool_toil_log --logLevel DEBUG --stats --retryCount 2 --disableCaching --maxLogFileSize 20000000000 /path/to/sequence_qc/0.2.24/sequence_qc_0.2.24.cwl /path/to/inputs.yaml > tool_toil.stdout 2> tool_toil.stderr &
```

### Usage

```bash
toil-cwl-runner sequence_qc_0.2.24.cwl -h

usage: sequence_qc_0.2.24.cwl [-h] --reference REFERENCE --bam_file BAM_FILE
                              --bed_file BED_FILE --sample_id SAMPLE_ID
                              [--threshold THRESHOLD] [--truncate TRUNCATE]
                              [--min_mapq MIN_MAPQ] [--min_basq MIN_BASQ]
                              [job_order]

positional arguments:
  job_order             Job input json file

optional arguments:
  -h, --help            show this help message and exit
  --reference REFERENCE
                        Path to reference fasta, containing all regions in
                        bed_file
  --bam_file BAM_FILE   Path to BAM file for calculating noise [required]
  --bed_file BED_FILE   Path to BED file containing regions over which to
                        calculate noise [required]
  --sample_id SAMPLE_ID
                        Prefix to include in all output file names
  --threshold THRESHOLD
                        Alt allele frequency past which to ignore positions
                        from the calculation.
  --truncate TRUNCATE   Whether to exclude trailing bases from reads that only
                        partially overlap the bed file (0 or 1)
  --min_mapq MIN_MAPQ   Exclude reads with a lower mapping quality
  --min_basq MIN_BASQ   Exclude bases with a lower base quality
```
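
For reference, a job order matching the inputs listed in the usage text above could be generated along these lines. This is a minimal sketch and not part of the commit: the helper name `build_job_order` and the example paths are hypothetical, and it assumes the runner accepts a JSON job order (the usage text describes `job_order` as a "Job input json file").

```python
import json

# Optional inputs named in the usage text; anything else is rejected.
OPTIONAL_INPUTS = {"threshold", "truncate", "min_mapq", "min_basq"}


def build_job_order(reference, bam_file, bed_file, sample_id, **optional):
    """Return a CWL job order (as a dict) for the sequence_qc tool.

    File inputs use the standard CWL File object form
    ({"class": "File", "path": ...}); sample_id is a plain string.
    """
    unknown = set(optional) - OPTIONAL_INPUTS
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    job = {
        "reference": {"class": "File", "path": reference},
        "bam_file": {"class": "File", "path": bam_file},
        "bed_file": {"class": "File", "path": bed_file},
        "sample_id": sample_id,
    }
    # Optional numeric inputs are passed through unchanged.
    job.update(optional)
    return job


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    job = build_job_order(
        "/path/to/reference.fasta",
        "/path/to/sample.bam",
        "/path/to/regions.bed",
        "sample1",
        min_mapq=20,
    )
    print(json.dumps(job, indent=2))
```

The printed JSON can be saved to a file and passed to the runner in place of `example_inputs.yaml`.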
27 changes: 8 additions & 19 deletions requirements.txt
@@ -1,20 +1,9 @@
-toil-ionox0[cwl]==0.0.7
+toil[cwl]
 pytz
-typing==3.7.4
-
-# From fixing pkg_resources.ContextualVersionConflict:
-ruamel.yaml==0.15.77
-
-# From requirements_dev
-pip>=21.1
-bumpversion==0.5.3
-wheel==0.32.1
-watchdog==0.9.0
-flake8==3.5.0
-tox==3.5.2
-coverage==4.5.1
-Sphinx==1.8.1
-twine==1.12.1
-pytest==3.8.2
-pytest-runner==4.2
-coloredlogs==10.0.0
+bumpversion
+flake8
+tox
+twine
+pytest
+pytest-runner
+coloredlogs
150 changes: 150 additions & 0 deletions sequence_qc/0.2.4/sequence_qc_0.2.4.cwl
@@ -0,0 +1,150 @@
class: CommandLineTool
cwlVersion: v1.0
$namespaces:
  dct: 'http://purl.org/dc/terms/'
  doap: 'http://usefulinc.com/ns/doap#'
  foaf: 'http://xmlns.com/foaf/0.1/'
  sbg: 'https://www.sevenbridges.com/'
id: calculate_noise_0_2_4
baseCommand:
  - calculate_noise
inputs:
  - id: reference
    type: File
    inputBinding:
      position: 0
      prefix: --ref_fasta
    secondaryFiles:
      - ^.fasta.fai
    doc: >-
      Path to reference fasta, containing all regions in bed_file
  - id: bam_file
    type: File
    inputBinding:
      position: 0
      prefix: --bam_file
    secondaryFiles:
      - ^.bai
    doc: >-
      Path to BAM file for calculating noise [required]
  - id: bed_file
    type: File
    inputBinding:
      position: 0
      prefix: --bed_file
    doc: >-
      Path to BED file containing regions over which to calculate noise [required]
  - id: sample_id
    type: string
    inputBinding:
      position: 0
      prefix: --sample_id
    doc: >-
      Prefix to include in all output file names
  - id: threshold
    type: float?
    inputBinding:
      position: 0
      prefix: --threshold
    doc: >-
      Alt allele frequency past which to ignore positions from the calculation.
  - id: truncate
    type: int?
    inputBinding:
      position: 0
      prefix: --truncate
    doc: >-
      Whether to exclude trailing bases from reads that only partially overlap the bed file (0 or 1)
  - id: min_mapq
    type: int?
    inputBinding:
      position: 0
      prefix: --min_mapq
    doc: >-
      Exclude reads with a lower mapping quality
  - id: min_basq
    type: int?
    inputBinding:
      position: 0
      prefix: --min_basq
    doc: >-
      Exclude bases with a lower base quality
outputs:
  - id: sequence_qc_pileup
    type: File
    outputBinding:
      glob: |-
        ${
            return inputs.sample_id + '_pileup.tsv'
        }
  - id: sequence_qc_noise_positions
    type: File
    outputBinding:
      glob: |-
        ${
            return inputs.sample_id + '_noise_positions.tsv'
        }
  - id: sequence_qc_noise_by_substitution
    type: File
    outputBinding:
      glob: |-
        ${
            return inputs.sample_id + '_noise_by_substitution.tsv'
        }
  - id: sequence_qc_noise_acgt
    type: File
    outputBinding:
      glob: |-
        ${
            return inputs.sample_id + '_noise_acgt.tsv'
        }
  - id: sequence_qc_noise_n
    type: File
    outputBinding:
      glob: |-
        ${
            return inputs.sample_id + '_noise_n.tsv'
        }
  - id: sequence_qc_noise_del
    type: File
    outputBinding:
      glob: |-
        ${
            return inputs.sample_id + '_noise_del.tsv'
        }
  - id: sequence_qc_figures
    type: File
    outputBinding:
      glob: |-
        ${
            return inputs.sample_id + '_noise.html'
        }
requirements:
  - class: ResourceRequirement
    ramMin: 8000
    coresMin: 1
  - class: DockerRequirement
    dockerPull: 'ghcr.io/msk-access/sequence_qc:0.2.4'
  - class: InlineJavascriptRequirement
  - class: EnvVarRequirement
    envDef:
      LC_ALL: en_US.utf-8
      LANG: en_US.utf-8
'dct:contributor':
  - class: 'foaf:Organization'
    'foaf:member':
      - class: 'foaf:Person'
        'foaf:mbox': 'mailto:shahr2@mskcc.org'
        'foaf:name': Ronak Shah
    'foaf:name': Memorial Sloan Kettering Cancer Center
'dct:creator':
  - class: 'foaf:Organization'
    'foaf:member':
      - class: 'foaf:Person'
        'foaf:mbox': 'mailto:murphyc4@mskcc.org'
        'foaf:name': Charlie Murphy
    'foaf:name': Memorial Sloan Kettering Cancer Center
'doap:release':
  - class: 'doap:Version'
    'doap:name': sesquence_qc
    'doap:revision': 0.2.4
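
The `secondaryFiles` patterns and `glob` expressions in the tool definition above decide which index files must sit next to the inputs and which files the runner collects afterwards. The following is a minimal sketch of both rules, not code from the commit: the helper names `resolve_secondary` and `expected_outputs` are hypothetical, and the caret handling follows the CWL v1.0 rule that each leading `^` removes one extension from the primary path before the rest of the pattern is appended.

```python
import posixpath

# Output suffixes copied from the glob expressions in sequence_qc_0.2.4.cwl.
OUTPUT_SUFFIXES = {
    "sequence_qc_pileup": "_pileup.tsv",
    "sequence_qc_noise_positions": "_noise_positions.tsv",
    "sequence_qc_noise_by_substitution": "_noise_by_substitution.tsv",
    "sequence_qc_noise_acgt": "_noise_acgt.tsv",
    "sequence_qc_noise_n": "_noise_n.tsv",
    "sequence_qc_noise_del": "_noise_del.tsv",
    "sequence_qc_figures": "_noise.html",
}


def resolve_secondary(primary_path, pattern):
    """Apply a CWL secondaryFiles pattern to a primary file path."""
    path = primary_path
    # Each leading caret strips one extension from the primary path.
    while pattern.startswith("^"):
        path, _ext = posixpath.splitext(path)
        pattern = pattern[1:]
    # The remainder of the pattern is appended as-is.
    return path + pattern


def expected_outputs(sample_id):
    """Map each output id to the file name its glob expression returns."""
    return {
        out_id: sample_id + suffix
        for out_id, suffix in OUTPUT_SUFFIXES.items()
    }


if __name__ == "__main__":
    print(resolve_secondary("/data/ref.fasta", "^.fasta.fai"))
    print(resolve_secondary("/data/sample1.bam", "^.bai"))
    for out_id, name in expected_outputs("sample1").items():
        print(out_id, name)
```

Under this rule a reference named `genome.fa` would need its index at `genome.fasta.fai`, so the `^.fasta.fai` pattern in effect assumes the reference file name ends in `.fasta`. Likewise `^.bai` expects `sample1.bai` next to `sample1.bam`, not `sample1.bam.bai`.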
12 changes: 6 additions & 6 deletions sequence_qc/README.md
@@ -4,7 +4,7 @@

 | Tool | Version | Location |
 |--- |--- |--- |
-| sequence_qc | 0.1.19 | <https://pypi.org/project/sequence-qc/> |
+| sequence_qc | 0.2.24 | <https://github.org/msk-access/sequence_qc/> |

 ## CWL

@@ -13,26 +13,26 @@
 - Example Command using [toil](https://toil.readthedocs.io):

 ```bash
-> toil-cwl-runner sequence_qc_0.1.19.cwl example_inputs.yaml
+> toil-cwl-runner sequence_qc_0.2.24.cwl example_inputs.yaml
 ```

 **If at MSK, using the JUNO cluster having installed toil version 3.19 and manually modifying [lsf.py](https://github.com/DataBiosphere/toil/blob/releases/3.19.0/src/toil/batchSystems/lsf.py#L170) by removing `type==X86_64 &&` you can use the following command**

 ```bash
 #Using CWLTOOL
-> cwltool --singularity --non-strict /path/to/sequence_qc_0.1.19/sequence_qc_0.1.19.cwl /path/to/inputs.yaml
+> cwltool --singularity --non-strict /path/to/sequence_qc/0.2.24/sequence_qc_0.2.24.cwl /path/to/inputs.yaml

 #Using toil-cwl-runner
 > mkdir tool_toil_log
-> toil-cwl-runner --singularity --logFile /path/to/tool_toil_log/cwltoil.log --jobStore /path/to/tool_jobStore --batchSystem lsf --workDir /path/to/tool_toil_log --outdir . --writeLogs /path/to/tool_toil_log --logLevel DEBUG --stats --retryCount 2 --disableCaching --maxLogFileSize 20000000000 /path/to/sequence_qc_0.1.19/sequence_qc_0.1.19.cwl /path/to/inputs.yaml > tool_toil.stdout 2> tool_toil.stderr &
+> toil-cwl-runner --singularity --logFile /path/to/tool_toil_log/cwltoil.log --jobStore /path/to/tool_jobStore --batchSystem lsf --workDir /path/to/tool_toil_log --outdir . --writeLogs /path/to/tool_toil_log --logLevel DEBUG --stats --retryCount 2 --disableCaching --maxLogFileSize 20000000000 /path/to/sequence_qc/0.2.24/sequence_qc_0.2.24.cwl /path/to/inputs.yaml > tool_toil.stdout 2> tool_toil.stderr &
 ```

 ### Usage

 ```bash
-toil-cwl-runner sequence_qc_0.1.19.cwl -h
+toil-cwl-runner sequence_qc_0.2.24.cwl -h

-usage: sequence_qc_0.1.19.cwl [-h] --reference REFERENCE --bam_file BAM_FILE
+usage: sequence_qc_0.2.24.cwl [-h] --reference REFERENCE --bam_file BAM_FILE
                               --bed_file BED_FILE --sample_id SAMPLE_ID
                               [--threshold THRESHOLD] [--truncate TRUNCATE]
                               [--min_mapq MIN_MAPQ] [--min_basq MIN_BASQ]
