This repository has been archived by the owner on Apr 11, 2022. It is now read-only.

GuilleGorines/nfcore-pikavirus-legacy

nf-core/pikavirus

A workflow for metagenomics.

Introduction

nf-core/pikavirus is a bioinformatics best-practice analysis pipeline for metagenomic analysis. It follows a new approach based on eliminatory k-mer analysis, followed by assembly and subsequent contig binning.

The pipeline is built using Nextflow, a workflow tool that runs tasks portably across multiple compute infrastructures. It comes with Docker containers, which make installation trivial and results highly reproducible.

Quick Start

  1. Install Nextflow

  2. Install any of Docker, Singularity or Podman for full pipeline reproducibility (please only use Conda as a last resort; see docs)

  3. Download the pipeline and test it on a minimal dataset with a single command:

    nextflow run nf-core/pikavirus -profile test,<docker/singularity/podman/conda/institute>

    Please check nf-core/configs to see whether a custom config file for running nf-core pipelines already exists for your institute. If so, you can simply use -profile <institute> in your command. This enables either Docker or Singularity and sets the appropriate execution settings for your local compute environment.

  4. Start running your own analysis!

    nextflow run nf-core/pikavirus -profile <docker/singularity/podman/conda> --input '*_R{1,2}.fastq.gz'

See usage docs for all of the available options when running the pipeline.
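
Before launching a run, it can help to check that a container engine is installed and that every sample has both mates matching the '*_R{1,2}.fastq.gz' pattern. The following is a minimal sketch of such a pre-flight check, not part of the pipeline itself; the function names first_available, sample_name and unpaired_samples are hypothetical helpers introduced here for illustration, and the script assumes reads are named exactly <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight helpers; these are NOT shipped with nf-core/pikavirus.

# Print the first command from the argument list that is found on PATH.
# Returns non-zero if none of them is installed.
first_available() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Strip the directory and the _R1/_R2 suffix (assumed naming scheme)
# to obtain the sample name.
sample_name() {
    local base
    base=$(basename "$1")
    base=${base%_R1.fastq.gz}
    base=${base%_R2.fastq.gz}
    printf '%s\n' "$base"
}

# Print, one per line, every sample in a directory that lacks one mate.
unpaired_samples() {
    local dir=$1 f sample
    for f in "$dir"/*_R1.fastq.gz "$dir"/*_R2.fastq.gz; do
        # Skip the literal pattern left behind when a glob matches nothing.
        [ -e "$f" ] || continue
        sample=$(sample_name "$f")
        if [ ! -e "$dir/${sample}_R1.fastq.gz" ] || [ ! -e "$dir/${sample}_R2.fastq.gz" ]; then
            printf '%s\n' "$sample"
        fi
    done | sort -u
}
```

Assuming the reads live in a hypothetical reads/ directory, the helpers could be combined with the command from step 4 like this:

    unpaired_samples reads
    engine=$(first_available docker singularity podman) && nextflow run nf-core/pikavirus -profile "$engine" --input 'reads/*_R{1,2}.fastq.gz'

The engine names double as profile names here only because the profiles listed above happen to match the executables; an institute profile would have to be passed explicitly.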

Pipeline Summary

By default, the pipeline currently performs the following:

  • Sequencing quality control (FastQC)
  • Trimming of low-quality regions in the reads (fastp)
  • Identification and elimination of reads from the host, and isolation of viral, bacterial, fungal and unknown reads (Kraken2)
  • Assembly of unknown reads (MetaQuast) and mapping against databases (Kaiju) to identify possible new pathogens
  • Alignment of viral, bacterial and fungal reads against reference genomes to confirm the presence of certain organisms (Bowtie2)

Documentation

The nf-core/pikavirus pipeline comes with documentation about the pipeline: usage and output.

Credits

nf-core/pikavirus was originally written by Guillermo Jorge Gorines Cordero, under the supervision of the BU-ISCIII team in Madrid, Spain.

We thank the following people for their extensive assistance in the development of this pipeline:

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #pikavirus channel (you can join with this invite).

Citations

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

In addition, references of tools and data used in this pipeline are as follows:

Improved metagenomic analysis with Kraken 2.

Derrick E Wood, Jennifer Lu & Ben Langmead.

Genome Biology. 2019 Nov 28. doi: 10.1186/s13059-019-1891-0

fastp: an ultra-fast all-in-one FASTQ preprocessor.

Shifu Chen, Yanqing Zhou, Yaru Chen, Jia Gu.

Bioinformatics, Volume 34, Issue 17, 01 September 2018, Pages i884–i890. doi: 10.1093/bioinformatics/bty560

Fast and sensitive taxonomic classification for metagenomics with Kaiju

Peter Menzel, Kim Lee Ng & Anders Krogh

Nature Communications volume 7, Article number: 11257 (2016). doi: 10.1038/ncomms11257

QUAST: quality assessment tool for genome assemblies

Alexey Gurevich, Vladislav Saveliev, Nikolay Vyahhi & Glenn Tesler

Bioinformatics, Volume 29, Issue 8, 15 April 2013, Pages 1072–1075. doi: 10.1093/bioinformatics/btt086

Bioconda: sustainable and comprehensive software distribution for the life sciences

Björn Grüning, Ryan Dale, Andreas Sjödin, Brad A. Chapman, Jillian Rowe, Christopher H. Tomkins-Tinch, Renan Valieris, Johannes Köster & The Bioconda Team

Nature Methods volume 15, pages 475–476 (2018). doi: 10.1038/s41592-018-0046-7

Mash: fast genome and metagenome distance estimation using MinHash

Brian D. Ondov, Todd J. Treangen, Páll Melsted, Adam B. Mallonee, Nicholas H. Bergman, Sergey Koren & Adam M. Phillippy

Genome Biology 17, Article number: 132 (2016). doi: 10.1186/s13059-016-0997-x
