CobiontID

This repository provides an overview of the pipelines and tools developed to identify cobionts in the Tree of Life Programme. See https://cobiontid.github.io/ for more information!

Software

Standalone tools

| Tool | Description | Application | Language |
| --- | --- | --- | --- |
| kmer-counter | Fast k-mer counter for large read sets | Get tetranucleotide counts | Rust |
| unique-kmers | Count distinct k-mers in sequences | Calculate k-mer diversity | Rust |
| hexamer | Detect likely coding regions | Estimate coding density | C |
| fastk-medians | Calculate the median number of times each large k-mer in a sequence occurs across the set (modified version of Profex from the original FASTK library) | Approximate k-mer coverage | C |
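
As a rough illustration of the quantities these tools report, the following Python sketch computes tetranucleotide counts, the number of distinct k-mers, and the median number of times a sequence's k-mers occur across a read set. It is a conceptual toy, not the implementation used by kmer-counter, unique-kmers, or fastk-medians (which are written in Rust and C). The function names are hypothetical, and the sketch assumes plain forward-strand k-mers without canonicalisation by reverse complement.

```python
from collections import Counter
from statistics import median


def kmer_counts(seq: str, k: int = 4) -> Counter:
    """Count overlapping k-mers in seq, skipping windows with non-ACGT bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def distinct_kmers(seq: str, k: int) -> int:
    """Number of different k-mers observed in seq (a simple diversity measure)."""
    return len(kmer_counts(seq, k))


def median_kmer_occurrence(seq: str, set_counts: Counter, k: int) -> float:
    """Median, over the k-mers of seq, of their counts across the whole read set."""
    seq = seq.upper()
    occurrences = [set_counts[seq[i:i + k]] for i in range(len(seq) - k + 1)]
    return median(occurrences) if occurrences else 0.0


# Toy read set (made-up sequences, for illustration only).
reads = ["ACGTACGTAC", "ACGTTTTT", "GGGGACGT"]
k = 4

# Pool the k-mer counts of every read in the set.
set_counts = Counter()
for read in reads:
    set_counts.update(kmer_counts(read, k))

tetra = kmer_counts(reads[0], 4)
n_distinct = distinct_kmers(reads[0], 4)
med = median_kmer_occurrence(reads[1], set_counts, k)
```

In this toy setup, a read whose k-mers are common across the set gets a high median, which is the intuition behind using the median as an approximation of k-mer coverage.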

Dashboard

A demo of the interactive dashboard to explore read sets is available here. You can also try running the demo on Gitpod. A Colab notebook with a more limited feature set and instructions is available here.

Associated publications

Citation

Disentangling Cobionts and Contamination in Long-Read Genomic Data using Sequence Composition https://www.biorxiv.org/content/10.1101/2024.05.30.596622v1

Phylogenomic analysis of Wolbachia genomes from the Darwin Tree of Life biodiversity genomics project https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001972

MarkerScan: Separation and assembly of cobionts sequenced alongside target species in biodiversity genomics projects https://doi.org/10.12688/wellcomeopenres.20730.1