A pipeline for the detection and mapping of long non-coding RNAs
Make sure you have installed the required dependencies listed below. Clone the git repository on your computer with the following command:
$ git clone [email protected]:cbcrg/piper-nf.git
When done, move in the project root folder just created piper-nf
,
it contains an example dataset in the tutorial
folder.
Launch the pipeline execution by entering the following command on your shell terminal:
$ ./nextflow piper.nf
By default the pipeline is executed against the provided example dataset. Check the Pipeline parameters section below to see how enter your data on the program command line.
#####query
- The query transcripts file in multi-fasta format
- Example:
nextflow piper.nf --query=/some/path/query.fa
#####genomes-file
- The file listing the full paths to the genomes files
- Example:
nextflow piper.nf --genomes-file=my-genomes.txt
#####genomes-db
- The location where the BLAST formatted DB are stored
- Example:
nextflow piper.nf --genomes-db=/my/db/path
#####query-chunk-size
- Number of sequences in each chunck in which is sliced the query file
- Example:
nextflow piper.nf --query-chunk-size=50
#####result-dir
- The location where the result files are stored.
- Please note: if the folder exists, the all existing content will be deleted without further notice
- Example:
nextflow piper.nf --result-dir=./my-result/
#####blast-strategy
- Which BLAST program to be used,
ncbi-blast
(default) orwu-blast
- Example:
nextflow piper.nf --blast-strategy=wu-blast
To avoid having to install all the pipeline dependencies, you may test the pipeline using the provided Vagrant VM, which downloads and configures all the required pieces of software for you. See http://www.vagrantup.com for more details about Vagrant.
The Vagrant environment uses the Ubuntu Precise 64 virtual machine, if you don't have it in your Vagrant boxes list, it will be downloaded automatically.
To launch the VM move to the pipeline root folder grape-nf
and enter the following command:
$ vagrant up
When it boots up and the configuration steps are terminated, login into the VM instance and move to the Grape pipeline folder
$ vagrant ssh
$ cd /vagrant
Now you can launch the pipeline as shown:
$ ./nextflow piper.nf
When finished, stop the VM using the command vagrant halt
or vagrant destroy
, depending if you
want to temporary stop the execution or delete permanently the VM with all its files.
Piper-NF execution relies on Nextflow framework which provides an abstraction between the pipeline functional logic and the underlying processing system.
Thus it is possible to execute it on your computer or any cluster resource manager without modifying it.
Currently the following clusters are supported:
- Oracle Grid Engine (SGE)
- SLURM (beta)
- Platform LSF (beta)
By default the pipeline is parallelized by spanning multiple threads in the machine where the script is launched.
To submit the execution to a SGE cluster create a file named nextflow.config
, in the directory
where the pipeline is going to be launched, with the following content:
task {
processor='sge'
queue='<your queue name>'
}
In doing that, tasks will be executed through the qsub
SGE command, and so your pipeline will behave like any
other SGE job script, with the benefit that Nextflow will automatically and transparently manage the tasks
synchronisation, file(s) staging/un-staging, etc.
Alternatively the same declaration can be defined in the file $HOME/.nextflow/config
.
- NCBI BLAST+ - http://blast.ncbi.nlm.nih.gov/
- T-Coffee - http://tcoffee.org
- Exonerate 2.2 - http://www.ebi.ac.uk/~guy/exonerate/
- chr_subseq - unknown author
- gnu-sed (Mac OSX only)
- gnu-coreutils (Mac OSX only)