clustered-phylotree

WDL workflow for split-kmer phylogeny with pre-filtering by hierarchical clustering

This repo contains the workflow used for prototyping a phylogenetic tree workflow that uses SKA split k-mers to create phylogenetic trees for an input set of samples. The pipeline works for complete genomes as well as raw sequences (currently supporting only .fasta inputs).

To run the clustered-phylotree pipeline locally...

First, clone the repo

git clone [email protected]:katrinakalantar/clustered-phylotree.git

Then, build the docker image:

docker build -t clustphylo clustered-phylotree/

We then use the docker image to run the pipeline as follows:

miniwdl run --verbose clustered-phylotree/run.wdl docker_image_id=clustphylo data_directory=full_entero_data.tar.gz cut_height=.14 ska_align_p=.9

note: the pipeline requires as input the data_directory, which is a directory containing .fasta files (one file per sample) which is then tar zipped using the following command:

tar -czf full_entero_data.tar.gz full_entero_data

The optional parameters cut_height and ska_align_p allow you to specify the dendrogram cut height for pre-clustering and the ska alignment proportion parameters, respectively. The default options are shown in the command above.

The /analysis_support directory contains scripts that have been used to support the experimentation and validation of the phylotree pipeline.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
analysis_support		analysis_support
test		test
Dockerfile		Dockerfile
README.md		README.md
run.wdl		run.wdl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

clustered-phylotree

About

Releases

Packages

Languages

katrinakalantar/clustered-phylotree

Folders and files

Latest commit

History

Repository files navigation

clustered-phylotree

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages