This project manages all the software components related to the Coral Reef Global Search project.
gs_submit <config-file>
Starting point for the pipeline
- Slurm cluster
- STAR (https://github.com/alexdobin/STAR)
- salmon 0.13.1 (https://combine-lab.github.io/salmon/)
- [Kallisto (https://pachterlab.github.io/kallisto/)]
- TrimGalore (https://github.com/FelixKrueger/TrimGalore)
- samtools (http://www.htslib.org/)
- htseq (https://github.com/htseq/htseq)
- [Picard tools (https://broadinstitute.github.io/picard/)]
- Python 3 (==3.10), or Anaconda 3
Python:
$ pip install globalsearch
Within R:
> library('devtools')
> devtools::install_github('https://github.com/baliga-lab/Global_Search.git', ref="main", subdir="code/rpackage")
Configuration files are in JSON format and take the following form:
{
  "organisms": [<organism 1>, ...],
  "input_dir": <input directory>,
  "genome_dir": <genome file directory>,
  "output_dir": <output directory>,
  "postrun_output_dir": "<post-run output directory>",
  "log_dir": <log directory>,
  "genome_gff": <GFF file>,
  "genome_fasta": <FASTA file path>,
  "fastq_patterns": ["*_{{readnum}}.fq.*", "*_{{readnum}}.fastq.*"],
  "includes": [<directory name>],
  "include_file": <path to file containing included directories>,
  "deduplicate_bam_files": false,
  "rnaseq_algorithm": "star_salmon",
  "star_options": {
    "outFilterMismatchNmax": 10,
    "outFilterMismatchNoverLmax": 0.3,
    "outFilterScoreMinOverLread": 0.66,
    "outFilterMatchNmin": 0,
    "twopassMode": false
  },
  "star_index_options": {
    "runThreadN": 32,
    "genomeChrBinNbits": 16,
    "genomeSAindexNbases": 12
  },
  "sbatch_options": {
    <pipeline_step_name>: {
      "options": [
        <sbatch options>
      ],
      "extras": [
        <additional lines for slurm job script>
      ]
    }
  }
}
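
As a rough illustration, a configuration of this form could be generated and sanity-checked from Python before it is passed to gs_submit. This is only a sketch: the organism name, the paths, the helpers `build_config` and `missing_keys`, and the `REQUIRED_KEYS` list are hypothetical and not part of the globalsearch package, and which keys the pipeline actually requires is an assumption. The optional-looking entries ("includes", "include_file", "sbatch_options") are left out for brevity.

```python
import json
import os
import tempfile

# Assumption: keys treated as mandatory for this sketch only; the real
# pipeline may require a different set.
REQUIRED_KEYS = [
    "organisms", "input_dir", "genome_dir", "output_dir",
    "log_dir", "genome_gff", "genome_fasta",
]


def build_config(organism, base_dir):
    """Return a config dict in the form shown above (placeholder paths)."""
    return {
        "organisms": [organism],
        "input_dir": os.path.join(base_dir, "input"),
        "genome_dir": os.path.join(base_dir, "genome"),
        "output_dir": os.path.join(base_dir, "output"),
        "postrun_output_dir": os.path.join(base_dir, "postrun"),
        "log_dir": os.path.join(base_dir, "logs"),
        "genome_gff": os.path.join(base_dir, "genome", "genome.gff"),
        "genome_fasta": os.path.join(base_dir, "genome", "genome.fasta"),
        "fastq_patterns": ["*_{{readnum}}.fq.*", "*_{{readnum}}.fastq.*"],
        "deduplicate_bam_files": False,
        "rnaseq_algorithm": "star_salmon",
        "star_options": {
            "outFilterMismatchNmax": 10,
            "outFilterMismatchNoverLmax": 0.3,
            "outFilterScoreMinOverLread": 0.66,
            "outFilterMatchNmin": 0,
            "twopassMode": False,
        },
        "star_index_options": {
            "runThreadN": 32,
            "genomeChrBinNbits": 16,
            "genomeSAindexNbases": 12,
        },
    }


def missing_keys(config):
    """List the keys from REQUIRED_KEYS that are absent from config."""
    return [key for key in REQUIRED_KEYS if key not in config]


# Hypothetical organism name and project directory.
config = build_config("example_organism", "/path/to/project")

# Write the config to a temporary location as JSON.
config_path = os.path.join(tempfile.mkdtemp(), "gs_config.json")
with open(config_path, "w") as handle:
    json.dump(config, handle, indent=2)

print("missing keys:", missing_keys(config))
print("config written to", config_path)
```

The resulting file could then be passed on the command line as gs_submit <config-file>.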