Feature/test modify cwls (#57)
* 🔧 📘  Fixing AddOrReplaceReadgroup

☑️ Commenting `ramMin`
☑️ Commenting `coresMin`
☑️ Adding static `ramMin`
☑️ Adding static `coresMin`
☑️ reformatting the javascript for `Xmx`
☑️ Adding things to README

* 🔧 Remove the extra doap cwl-wrapper tag

* 🔧 Fixing Javascript

* 🔧 Indentation issue

* 🔧 Adding secondary file to input for AddOrReplaceReadGroup

* 📘 Adding more usage information for AddOrReplaceReadGroups

* 📘  Update README for AddOrReplaceReadGroup

* 📘 Update README AddOrReplaceReadGroups

✔️ Removing Typo
✔️ Adding lsf.py url to line

* 🔧 📘  Modifying Picard_Fix_Mate 1.96

✔️ Commented Dynamic `ramMin` and `coresMin` and added static values.
✔️ better representation of javascript in cwl
✔️ better documentation

* 🔧 Fixing Picard FixMate 1.96

✔️ Javascript indentation issue

* 🔧 Fixing Picard Fixmate 1.96

✔️ Adding secondary file requirement

* 🔧 adding option to have dynamic output name

* 🔧 fixing glob in picard fixmate 1.96

* 🔧 Picard FixMate 1.96

Glob does not have valueFrom

* 🔧 Picard FixMate 1.96

Make output optional

* 🔧 Picard FixMate

* 🔧 Fix PicardFixMate

* 🔧 Picard FixMate 1.96

* 🔧 Picard FixMate 1.96

* 🔧 Fixmate 1.96

* Update picard_fix_mate_information_1.96.cwl

* Update picard_fix_mate_information_1.96.cwl

* Update picard_fix_mate_information_1.96.cwl

* Update picard_fix_mate_information_1.96.cwl

* Update picard_fix_mate_information_1.96.cwl

* 🔧 Picard FixMate 1.96

* Update picard_fix_mate_information_1.96.cwl

* Update picard_fix_mate_information_1.96.cwl

* 🔧 Hopefully Final Fix

* 🔧 📘 Changes to picard for output file name issues

* Update README.md

* Update README.md

* Update README.md

* 🔧 TrimGalore

* 📘

* Update abra2_2.17.cwl
rhshah authored Jul 31, 2019
1 parent 1bd1728 commit a8a3eb5
Showing 10 changed files with 386 additions and 49 deletions.
69 changes: 68 additions & 1 deletion abra2_2.17/README.md
@@ -18,4 +18,71 @@
```bash
> toil-cwl-runner abra2_2.17.cwl example_inputs.yaml
```


**If you are at MSK and using the JUNO cluster with toil version 3.19 installed, and you have manually modified [lsf.py](https://github.com/DataBiosphere/toil/blob/releases/3.19.0/src/toil/batchSystems/lsf.py#L170) by removing `type==X86_64 &&`, you can use the following commands:**

```bash
#Using CWLTOOL
> cwltool --singularity --non-strict /path/to/abra2_2.17.cwl /path/to/inputs.yaml

#Using toil-cwl-runner
> mkdir abra2_toil_log
> toil-cwl-runner --singularity --logFile /path/to/abra2_toil_log/cwltoil.log --jobStore /path/to/abra2_jobStore --batchSystem lsf --workDir /path/to/abra2_toil_log --outdir . --writeLogs /path/to/abra2_toil_log --logLevel DEBUG --stats --retryCount 2 --disableCaching --maxLogFileSize 20000000000 /path/to/abra2_2.17.cwl /path/to/inputs.yaml > abra2_toil.stdout 2> abra2_toil.stderr &
```

### Usage

```
usage: abra2_2.17.cwl [-h]
positional arguments:
job_order Job input json file
optional arguments:
-h, --help show this help message and exit
--memory_per_job MEMORY_PER_JOB
Memory per job in megabytes
--memory_overhead MEMORY_OVERHEAD
Memory overhead per job in megabytes
--number_of_threads NUMBER_OF_THREADS
--working_directory WORKING_DIRECTORY
Set the temp directory (overrides java.io.tmpdir)
--reference_fasta REFERENCE_FASTA
Genome reference location
--targets TARGETS
--kmer_size KMER_SIZE
Optional assembly kmer size(delimit with commas if
multiple sizes specified)
--maximum_average_depth MAXIMUM_AVERAGE_DEPTH
Regions with average depth exceeding this value will
be downsampled (default: 1000)
--soft_clip_contig SOFT_CLIP_CONTIG
Soft clip contig args [max_contigs,min_base_qual,frac_
high_qual_bases,min_soft_clip_len]
(default:16,13,80,15)
--maximum_mixmatch_rate MAXIMUM_MIXMATCH_RATE
Max allowed mismatch rate when mapping reads back to
contigs (default: 0.05)
--scoring_gap_alignments SCORING_GAP_ALIGNMENTS
Scoring used for contig alignments(match,
mismatch_penalty,gap_open_penalty,gap_extend_penalty)
(default:8,32,48,1)
--contig_anchor CONTIG_ANCHOR
Contig anchor
[M_bases_at_contig_edge,max_mismatches_near_edge]
(default:10,2)
--window_size WINDOW_SIZE
Processing window size and overlap (size,overlap)
(default: 400,200)
--consensus_sequence Use positional consensus sequence when aligning high
quality soft clipping
--ignore_bad_assembly
Use this option to avoid parsing errors for corrupted
assemblies
--bam_index Enable BAM index generation when outputting sorted
alignments (may require additional memory)
--input_vcf INPUT_VCF
VCF containing known (or suspected) variant sites.
Very large files should be avoided.
--no_sort Do not attempt to sort final output
```
9 changes: 4 additions & 5 deletions abra2_2.17/abra2_2.17.cwl
@@ -170,8 +170,10 @@ arguments:
valueFrom: /usr/local/bin/abra2.jar
requirements:
- class: ResourceRequirement
ramMin: "${\r if(inputs.memory_per_job && inputs.memory_overhead) {\r \r return inputs.memory_per_job + inputs.memory_overhead\r }\r else if (inputs.memory_per_job && !inputs.memory_overhead){\r \r \treturn inputs.memory_per_job + 2000\r }\r else if(!inputs.memory_per_job && inputs.memory_overhead){\r \r return 15000 + inputs.memory_overhead\r }\r else {\r \r \treturn 17000 \r }\r}"
coresMin: "${\r if (inputs.number_of_threads) {\r \r \treturn inputs.number_of_threads \r }\r else {\r \r return 4\r }\r}"
ramMin: 48000
coresMin: 4
#ramMin: "${\r if(inputs.memory_per_job && inputs.memory_overhead) {\r \r return inputs.memory_per_job + inputs.memory_overhead\r }\r else if (inputs.memory_per_job && !inputs.memory_overhead){\r \r \treturn inputs.memory_per_job + 2000\r }\r else if(!inputs.memory_per_job && inputs.memory_overhead){\r \r return 15000 + inputs.memory_overhead\r }\r else {\r \r \treturn 17000 \r }\r}"
#coresMin: "${\r if (inputs.number_of_threads) {\r \r \treturn inputs.number_of_threads \r }\r else {\r \r return 4\r }\r}"
- class: DockerRequirement
dockerPull: 'mskcc/abra2:0.1.0'
- class: InlineJavascriptRequirement
@@ -193,6 +195,3 @@ requirements:
- class: 'doap:Version'
'doap:name': abra2
'doap:revision': 2.17
- class: 'doap:Version'
'doap:name': cwl-wrapper
'doap:revision': 1.0.0
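
The hunk above replaces the dynamic `ramMin` and `coresMin` expressions with static values (48000 and 4) and keeps the old expressions as comments. Because those one-line strings are hard to read, here is a minimal sketch of what the commented-out logic appears to compute, written as plain JavaScript. The function names `ramMin` and `coresMin` and the `defaultCores` parameter are hypothetical helpers for illustration; only the constants 15000 and 2000 and the input names come from the diff.

```javascript
// Restatement of the commented-out CWL expressions as standalone functions.
// `inputs` stands in for the CWL `inputs` object; memory values are in megabytes.

// Defaults taken from the original expression: 15000 MB per job, 2000 MB overhead.
const DEFAULT_MEMORY_MB = 15000;
const DEFAULT_OVERHEAD_MB = 2000;

// Hypothetical helper: total memory request, falling back to the defaults
// whenever an input is missing (or otherwise falsy, as in the original).
function ramMin(inputs) {
  const memory = inputs.memory_per_job || DEFAULT_MEMORY_MB;
  const overhead = inputs.memory_overhead || DEFAULT_OVERHEAD_MB;
  return memory + overhead;
}

// Hypothetical helper: core request, falling back to a per-tool default
// (4 for abra2 in the expression above).
function coresMin(inputs, defaultCores) {
  return inputs.number_of_threads || defaultCores;
}
```

With no inputs set, `ramMin({})` gives 17000, which matches the final `else` branch of the original expression. After this commit the runner no longer sees that value, since the static `ramMin: 48000` is used regardless of `memory_per_job` and `memory_overhead`.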
74 changes: 74 additions & 0 deletions picard_add_or_replace_read_groups_1.96/README.md
@@ -8,6 +8,7 @@
| picard | 1.96 | https://sourceforge.net/projects/picard/files/picard-tools/1.96/picard-tools-1.96.zip |
| R | 3.3.3 | r-base for openjdk:8 |

[![](https://images.microbadger.com/badges/image/mskcc/picard_1.96:0.1.0.svg)](https://microbadger.com/images/mskcc/picard_1.96:0.1.0 "Get your own image badge on microbadger.com") [![](https://images.microbadger.com/badges/version/mskcc/picard_1.96:0.1.0.svg)](https://microbadger.com/images/mskcc/picard_1.96:0.1.0 "Get your own version badge on microbadger.com") [![](https://images.microbadger.com/badges/license/mskcc/picard_1.96:0.1.0.svg)](https://microbadger.com/images/mskcc/picard_1.96:0.1.0 "Get your own license badge on microbadger.com")

## CWL

@@ -18,3 +19,76 @@
```bash
> toil-cwl-runner picard_add_or_replace_read_groups_1.96.cwl example_inputs.yaml
```

**If you are at MSK and using the JUNO cluster with toil version 3.19 installed, and you have manually modified [lsf.py](https://github.com/DataBiosphere/toil/blob/releases/3.19.0/src/toil/batchSystems/lsf.py#L170) by removing `type==X86_64 &&`, you can use the following commands:**

```bash
#Using CWLTOOL
> cwltool --singularity --non-strict /path/to/picard_add_or_replace_read_groups_1.96/picard_add_or_replace_read_groups_1.96.cwl /path/to/inputs.yaml

#Using toil-cwl-runner
> mkdir picardAddOrReplaceReadGroup_toil_log
> toil-cwl-runner --singularity --logFile /path/to/picardAddOrReplaceReadGroup_toil_log/cwltoil.log --jobStore /path/to/picardAddOrReplaceReadGroup_jobStore --batchSystem lsf --workDir /path/to/picardAddOrReplaceReadGroup_toil_log --outdir . --writeLogs /path/to/picardAddOrReplaceReadGroup_toil_log --logLevel DEBUG --stats --retryCount 2 --disableCaching --maxLogFileSize 20000000000 /path/to/picard_add_or_replace_read_groups_1.96/picard_add_or_replace_read_groups_1.96.cwl /path/to/inputs.yaml > picardAddOrReplaceReadGroup_toil.stdout 2> picardAddOrReplaceReadGroup_toil.stderr &
```

### Usage

```bash
> toil-cwl-runner picard_add_or_replace_read_groups_1.96.cwl --help
usage: picard_add_or_replace_read_groups_1.96.cwl [-h]

positional arguments:
job_order Job input json file

optional arguments:
-h, --help show this help message and exit
--memory_per_job MEMORY_PER_JOB
Memory per job in megabytes
--memory_overhead MEMORY_OVERHEAD
Memory overhead per job in megabytes
--number_of_threads NUMBER_OF_THREADS
--input INPUT Input file (bam or sam). Required.
--output_file_name OUTPUT_FILE_NAME
Output file name (bam or sam). Not Required
--sort_order SORT_ORDER
Optional sort order to output in. If not supplied
OUTPUT is in the same order as INPUT.Default value:
null. Possible values: {unsorted, queryname,
coordinate}
--read_group_identifier READ_GROUP_IDENTIFIER
Read Group ID Default value: 1. This option can be set
to 'null' to clear the default value Required
--read_group_sequnecing_center READ_GROUP_SEQUNECING_CENTER
Read Group sequencing center name Default value: null.
Required
--read_group_library READ_GROUP_LIBRARY
Read Group Library. Required
--read_group_platform_unit READ_GROUP_PLATFORM_UNIT
Read Group platform unit (eg. run barcode) Required.
--read_group_sample_name READ_GROUP_SAMPLE_NAME
Read Group sample name. Required
--read_group_sequencing_platform READ_GROUP_SEQUENCING_PLATFORM
Read Group platform (e.g. illumina, solid) Required.
--read_group_description READ_GROUP_DESCRIPTION
Read Group description Default value: null.
--read_group_run_date READ_GROUP_RUN_DATE
Read Group run date Default value: null.
--tmp_dir TMP_DIR This option may be specified 0 or more times
--validation_stringency VALIDATION_STRINGENCY
Validation stringency for all SAM files read by this
program. Setting stringency to SILENT can improve
performance when processing a BAM file in which
variable-length data (read, qualities, tags) do not
otherwise need to be decoded. Default value: STRICT.
This option can be set to 'null' to clear the default
value. Possible values: {STRICT,LENIENT, SILENT}
--bam_compression_level BAM_COMPRESSION_LEVEL
Compression level for all compressed files created
(e.g. BAM and GELI). Default value:5. This option can
be set to 'null' to clear the default value.
--create_bam_index Whether to create a BAM index when writing a
coordinate-sorted BAM file. Default value:false. This
option can be set to 'null' to clear the default
value. Possible values:{true, false}
```

2 changes: 1 addition & 1 deletion picard_add_or_replace_read_groups_1.96/example_inputs.yaml
@@ -6,7 +6,7 @@ input:
memory_overhead:
memory_per_job:
number_of_threads:
output: somename_srt.bam
output_file_name: somename_srt.bam
read_group_description:
read_group_identifier: test
read_group_library: 1
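
The rename from `output` to `output_file_name` goes with the CWL change in this commit that makes the output name optional: when `output_file_name` is absent, the `glob` and `O=` expressions derive a name from the input's basename. A minimal sketch of that fallback in plain JavaScript follows. The function names are hypothetical, and `outputNameAnchored` is an assumed stricter variant, not something in the commit. It is included because the pattern as written, `/.sam | .bam/`, contains literal spaces, so it does not match an ordinary name such as `sample.bam`.

```javascript
// Fallback used by the CWL expressions in this commit, copied as written.
// `inputs` stands in for the CWL `inputs` object.
function outputName(inputs) {
  if (inputs.output_file_name) {
    return inputs.output_file_name;
  }
  // Note the literal spaces in the pattern: ".sam " or " .bam".
  return inputs.input.basename.replace(/.sam | .bam/, "_srt.bam");
}

// Assumed alternative (not in the commit): escape the dot, drop the spaces,
// and anchor the match at the end of the file name.
function outputNameAnchored(inputs) {
  if (inputs.output_file_name) {
    return inputs.output_file_name;
  }
  return inputs.input.basename.replace(/\.(sam|bam)$/, "_srt.bam");
}

// Compare the two on a typical input with no explicit output name.
const example = { input: { basename: "sample.bam" } };
console.log(outputName(example));         // name is left unchanged
console.log(outputNameAnchored(example)); // suffix is replaced
```

If the first behaviour holds in the runner as it does here, omitting `output_file_name` would make the tool write to a file with the same basename as its input, so supplying `output_file_name` explicitly, as `example_inputs.yaml` does, sidesteps the question.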
@@ -23,14 +23,11 @@ inputs:
prefix: I=
separate: false
doc: Input file (bam or sam). Required.
- id: output
type: string
inputBinding:
position: 0
prefix: O=
separate: false
valueFrom: '$(inputs.input.basename.replace(/.sam |.bam/, ''_srt.bam''))'
doc: Output file (bam or sam).
secondaryFiles:
- ^.bai
- id: output_file_name
type: string?
doc: Output file name (bam or sam). Not Required
- id: sort_order
type: string?
inputBinding:
@@ -142,20 +139,62 @@ outputs:
- id: bam
type: File
outputBinding:
glob: '$(inputs.input.basename.replace(/.sam |.bam/, ''_srt.bam''))'
glob: |-
${
if(inputs.output_file_name){
return inputs.output_file_name
} else {
return inputs.input.basename.replace(/.sam | .bam/,'_srt.bam')
}
}
secondaryFiles:
- ^.bai
label: picard_add_or_replace_read_groups_1.96
arguments:
- position: 0
valueFrom: "${\n if(inputs.memory_per_job && inputs.memory_overhead) {\n \n if(inputs.memory_per_job % 1000 == 0) {\n \t\n return \"-Xmx\" + (inputs.memory_per_job/1000).toString() + \"G\"\n }\n else {\n \n return \"-Xmx\" + Math.floor((inputs.memory_per_job/1000)).toString() + \"G\" \n }\n }\n else if (inputs.memory_per_job && !inputs.memory_overhead){\n \n if(inputs.memory_per_job % 1000 == 0) {\n \t\n return \"-Xmx\" + (inputs.memory_per_job/1000).toString() + \"G\"\n }\n else {\n \n return \"-Xmx\" + Math.floor((inputs.memory_per_job/1000)).toString() + \"G\" \n }\n }\n else if(!inputs.memory_per_job && inputs.memory_overhead){\n \n return \"-Xmx15G\"\n }\n else {\n \n \treturn \"-Xmx15G\"\n }\n}"
valueFrom: |-
${
if(inputs.memory_per_job && inputs.memory_overhead) {
if(inputs.memory_per_job % 1000 == 0) {
return "-Xmx" + (inputs.memory_per_job/1000).toString() + "G"
}
else {
return "-Xmx" + Math.floor((inputs.memory_per_job/1000)).toString() + "G"
}
}
else if (inputs.memory_per_job && !inputs.memory_overhead){
if(inputs.memory_per_job % 1000 == 0) {
return "-Xmx" + (inputs.memory_per_job/1000).toString() + "G"
}
else {
return "-Xmx" + Math.floor((inputs.memory_per_job/1000)).toString() + "G"
}
}
else if(!inputs.memory_per_job && inputs.memory_overhead){
return "-Xmx15G"
}
else {
return "-Xmx15G"
}
}
- position: 0
prefix: '-jar'
valueFrom: /usr/local/bin/AddOrReplaceReadGroups.jar
- position: 0
prefix: O=
separate: false
valueFrom: |-
${
if(inputs.output_file_name){
return inputs.output_file_name
} else {
return inputs.input.basename.replace(/.sam | .bam/,'_srt.bam')
}
}
requirements:
- class: ResourceRequirement
ramMin: "${\r if(inputs.memory_per_job && inputs.memory_overhead) {\r \r return inputs.memory_per_job + inputs.memory_overhead\r }\r else if (inputs.memory_per_job && !inputs.memory_overhead){\r \r \treturn inputs.memory_per_job + 2000\r }\r else if(!inputs.memory_per_job && inputs.memory_overhead){\r \r return 15000 + inputs.memory_overhead\r }\r else {\r \r \treturn 17000 \r }\r}"
coresMin: "${\r if (inputs.number_of_threads) {\r \r \treturn inputs.number_of_threads \r }\r else {\r \r return 2\r }\r}"
ramMin: 16000
coresMin: 2
- class: DockerRequirement
dockerPull: 'mskcc/picard_1.96:0.1.0'
- class: InlineJavascriptRequirement
@@ -177,6 +216,3 @@ requirements:
- class: 'doap:Version'
'doap:name': picard
'doap:revision': 1.96
- class: 'doap:Version'
'doap:name': cwl-wrapper
'doap:revision': 1.0.0
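
The reformatted `-Xmx` expression in this file has four branches, but two pairs of them return the same thing: `memory_overhead` never changes the result, and the `% 1000 == 0` check is redundant because `Math.floor` leaves an exact quotient untouched. A minimal sketch of an equivalent, shorter form is below. The function name `javaHeapFlag` is hypothetical, and this is a reading of the diff, not a change made in the commit.

```javascript
// Equivalent restatement of the -Xmx expression from the diff above.
// `inputs` stands in for the CWL `inputs` object; memory_per_job is in megabytes.
function javaHeapFlag(inputs) {
  if (inputs.memory_per_job) {
    // Whole gigabytes, rounded down, using 1000 MB per GB as in the original.
    return "-Xmx" + Math.floor(inputs.memory_per_job / 1000).toString() + "G";
  }
  // Default heap size when memory_per_job is not supplied.
  return "-Xmx15G";
}

console.log(javaHeapFlag({ memory_per_job: 8500 }));
console.log(javaHeapFlag({ memory_overhead: 2000 }));
```

One consequence, shared with the original expression: a `memory_per_job` below 1000 rounds down to `-Xmx0G`, so values are best given as at least 1000 megabytes.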
57 changes: 56 additions & 1 deletion picard_fix_mate_information_1.96/README.md
@@ -1,13 +1,14 @@
# CWL and Dockerfile for running Picard - FixMateInformation

## Version of tools in docker image (/container/Dockerfile)
## Version of tools in docker image (../picard_add_or_replace_read_groups_1.96/container/Dockerfile)

| Tool | Version | Location |
|--- |--- |--- |
| java base image | 8 | - |
| picard | 1.96 | https://sourceforge.net/projects/picard/files/picard-tools/1.96/picard-tools-1.96.zip |
| R | 3.3.3 | r-base for openjdk:8 |

[![](https://images.microbadger.com/badges/image/mskcc/picard_1.96:0.1.0.svg)](https://microbadger.com/images/mskcc/picard_1.96:0.1.0 "Get your own image badge on microbadger.com") [![](https://images.microbadger.com/badges/version/mskcc/picard_1.96:0.1.0.svg)](https://microbadger.com/images/mskcc/picard_1.96:0.1.0 "Get your own version badge on microbadger.com") [![](https://images.microbadger.com/badges/license/mskcc/picard_1.96:0.1.0.svg)](https://microbadger.com/images/mskcc/picard_1.96:0.1.0 "Get your own license badge on microbadger.com")

## CWL

@@ -18,3 +19,57 @@
```bash
> toil-cwl-runner picard_fix_mate_information_1.96.cwl example_inputs.yaml
```

**If you are at MSK and using the JUNO cluster with toil version 3.19 installed, and you have manually modified [lsf.py](https://github.com/DataBiosphere/toil/blob/releases/3.19.0/src/toil/batchSystems/lsf.py#L170) by removing `type==X86_64 &&`, you can use the following commands:**

```bash
#Using CWLTOOL
> cwltool --singularity --non-strict /path/to/picard_fix_mate_information_1.96/picard_fix_mate_information_1.96.cwl /path/to/inputs.yaml

#Using toil-cwl-runner
> mkdir picardFixMate_toil_log
> toil-cwl-runner --singularity --logFile /path/to/picardFixMate_toil_log/cwltoil.log --jobStore /path/to/picardFixMate_jobStore --batchSystem lsf --workDir /path/to/picardFixMate_toil_log --outdir . --writeLogs /path/to/picardFixMate_toil_log --logLevel DEBUG --stats --retryCount 2 --disableCaching --maxLogFileSize 20000000000 /path/to/picard_fix_mate_information_1.96/picard_fix_mate_information_1.96.cwl /path/to/inputs.yaml > picardFixMate_toil.stdout 2> picardFixMate_toil.stderr &
```

### Usage

```
usage: picard_fix_mate_information_1.96.cwl [-h]
positional arguments:
job_order Job input json file
optional arguments:
-h, --help show this help message and exit
--memory_per_job MEMORY_PER_JOB
Memory per job in megabytes
--memory_overhead MEMORY_OVERHEAD
Memory overhead per job in megabytes
--number_of_threads NUMBER_OF_THREADS
--input INPUT The input file to fix. This option may be specified 0
or more times
--output_file_name OUTPUT_FILE_NAME
Output file name (bam or sam). Not Required
--sort_order SORT_ORDER
Optional sort order to output in. If not supplied
OUTPUT is in the same order as INPUT.Default value:
null. Possible values: {unsorted, queryname,
coordinate}
--tmp_dir TMP_DIR This option may be specified 0 or more times
--validation_stringency VALIDATION_STRINGENCY
Validation stringency for all SAM files read by this
program. Setting stringency to SILENT can improve
performance when processing a BAM file in which
variable-length data (read, qualities, tags) do not
otherwise need to be decoded. Default value: STRICT.
This option can be set to 'null' to clear the default
value. Possible values: {STRICT,LENIENT, SILENT}
--bam_compression_level BAM_COMPRESSION_LEVEL
Compression level for all compressed files created
(e.g. BAM and GELI). Default value:5. This option can
be set to 'null' to clear the default value.
--create_bam_index Whether to create a BAM index when writing a
coordinate-sorted BAM file. Default value:false. This
option can be set to 'null' to clear the default
value. Possible values:{true, false}
```
2 changes: 1 addition & 1 deletion picard_fix_mate_information_1.96/example_inputs.yaml
@@ -6,7 +6,7 @@ input:
memory_overhead:
memory_per_job:
number_of_threads:
output: somename_fm.bam
output_file_name: somename_fm.bam
sort_order:
tmp_dir:
validation_stringency: