jennifer-spillane/seq_contamination

Exploring the problem of sequence contamination with deep learning

To follow my wanderings through other decontamination/metagenomic software, see dataset_explorations.ipynb.
The three "results" directories contain output from the test datasets that I ran through each of those programs.

For a look at what I did to prep/clean the data I use to train the ML model, see data_for_ml.ipynb.
The parser that features prominently in this notebook is "parse_kraken_output.py".
The percent_plots directory has density plots of the percentage of classified k-mers in Kraken2 and KrakenUniq results files.
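
As a rough illustration of what "percentage of classified k-mers" can mean for a single read, here is a minimal sketch. It is not the code in parse_kraken_output.py, and the function name percent_classified_kmers is hypothetical. It assumes Kraken2's standard per-read output (five tab-separated columns, the fifth being space-separated taxid:count pairs, with "|:|" separating mates of a paired read), and it assumes that ambiguous k-mers (taxid "A") are left out of the denominator, which may differ from how the plots here were made.

```python
def percent_classified_kmers(kraken_line: str) -> float:
    """Percentage of k-mers in one Kraken2 per-read output line that hit any taxon.

    Hypothetical helper for illustration only; not taken from this repository.
    """
    fields = kraken_line.rstrip("\n").split("\t")
    lca_field = fields[4]  # e.g. "562:13 561:4 0:3 A:5 |:| 0:10 562:10"

    classified = 0
    total = 0
    for token in lca_field.split():
        if token == "|:|":
            # Separator between the two mates of a paired-end read.
            continue
        taxid, count_text = token.rsplit(":", 1)
        count = int(count_text)
        if taxid == "A":
            # Assumption: k-mers containing ambiguous bases are ignored entirely.
            continue
        total += count
        if taxid != "0":
            # Taxid 0 means the k-mer was not found in the database.
            classified += count

    return 100.0 * classified / total if total else 0.0


if __name__ == "__main__":
    example = "C\tread1\t562\t100\t562:13 561:4 0:3 A:5 |:| 0:10 562:10"
    print(percent_classified_kmers(example))
```

Collecting this value for every read in a results file would give the per-read distribution that a density plot summarizes.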
