Skip to content

Latest commit

 

History

History
84 lines (62 loc) · 3.08 KB

README.md

File metadata and controls

84 lines (62 loc) · 3.08 KB

TF-activity workflow

Pipeline to infer active transcription factors from CAGE-seq data using transcription factor binding motifs.

Nextflow Docker Repository on Dockerhub Packagist

Description

Given a set of CAGE-peaks of interest (e.g. up-regulated in some condition), this pipeline will extract the genomic sequence in a given range around these peaks and look for enrichment of TF motifs in these sequences. As background either the shuffled input sequences will be used or a background extracted from user supplied CAGE-peaks that are not of interest.

The pipeline will annotate the peaks and create TF-gene mappings. Additionally, the CAGE-peaks are overlapped with ChIP-seq peaks extracted from the ENCODE project.

Usage

As minimum input for this pipeline you will need:

  • CAGE peaks of interest in bed format
  • fasta of reference genome
  • gtf of reference genome

--fasta

Used to specify the path to the reference genome.

--gtf

Used to specify path to the GTF file of the reference genome.

--peaks

Your cage peaks of interest in BED file format.

--background

Peaks that are not differentially expressed can be used here as background peaks. They have to be in BED format just like your peaks of interest.

--pfms

A file containing all the TF motifs to use. This file must be in homer format. If you do not have a file in homer format, you can instead specifiy a TF motif file in Jaspar format using the --pfms_jaspar flag. If none of these two flags is set, the pipeline will use all motifs from the Jaspar core collection.

--encode

A directory containing BED peak files from ChIP-seq experiments with transcription factors. They will be used for intersection. If this flag is not set, this step will be let out.

-profile

Which profile to use. Use docker to use the docker container provided. It is best to create your own profile config file. You can look at the files in /conf for examples, and create one yourself. Then you have to reference it in the nextflow.config file like so:

profiles {

    standard
    {
        includeConfig 'conf/base.config'
    }
    docker
    {
        includeConfig 'conf/base.config'
        includeConfig 'conf/docker.config'
    }
    my_profile
    {
        includeConfig 'conf/base.config'
        includeConfig 'conf/my_profile.config'
    }

}

Additional input options:

Example pipeline call:

nextflow run kevinmenden/tf-activity -profile docker --fasta path/to/genome.fa --gtf path/to/gtf/genome.gtf \
--peaks peaks.bed --background background.bed

In the above example, the pipeline will use the docker image from dockerhub to run. Because no motifs are specified, the default Jaspar core motifs will be used.