Merge pull request #10 from RabbitBio/xiaoming/add_kssd_dist_operation
Add the Kssd sketch strategy for clust-mst via the '--fast' option
XiaomingXu1995 authored Dec 17, 2023
2 parents a22cd79 + 0d408e0 commit 15b57b0
Showing 14 changed files with 2,149 additions and 190 deletions.
9 changes: 7 additions & 2 deletions README.md
@@ -1,12 +1,12 @@
![RabbitTClust](rabbittclust.png)

-# `RabbitTClust v.2.2.1`
+# `RabbitTClust v.2.3.0`
RabbitTClust is a fast and memory-efficient genome clustering tool based on sketch-based distance estimations.
It enables processing of large-scale datasets by combining dimensionality reduction techniques with streaming and parallelization on modern multi-core platforms.
RabbitTClust supports classical single-linkage hierarchical (clust-mst) and greedy incremental clustering (clust-greedy) algorithms for different scenarios.

## Installation
-`RabbitTClust v.2.2.1` can only support 64-bit Linux Systems.
+`RabbitTClust v.2.3.0` can only support 64-bit Linux Systems.

The detailed update information for this version, as well as the version history, can be found in the [`version_history`](version_history/history.md) document.
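
Reviewer note: the README describes clust-mst as single-linkage hierarchical clustering. For readers unfamiliar with the idea, the following is a minimal Python sketch of how single-linkage clusters can be read off a minimum spanning tree. It is an illustration under stated assumptions, not RabbitTClust's implementation: the function names `mst_edges` and `single_linkage_clusters`, the toy one-dimensional points, and the threshold value are all hypothetical.

```python
from itertools import combinations


def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mst_edges(n, dist):
    """Kruskal's algorithm on the complete graph over n items.

    dist(i, j) is any symmetric distance; here it stands in for a
    sketch-based genome distance (an assumption for illustration).
    """
    parent = list(range(n))
    edges = sorted((dist(i, j), i, j) for i, j in combinations(range(n), 2))
    tree = []
    for d, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:  # edge joins two components, so it belongs to the MST
            parent[ri] = rj
            tree.append((d, i, j))
    return tree


def single_linkage_clusters(n, tree, threshold):
    """Keep MST edges with distance <= threshold; components are clusters."""
    parent = list(range(n))
    for d, i, j in tree:
        if d <= threshold:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for x in range(n):
        groups.setdefault(_find(parent, x), []).append(x)
    return sorted(groups.values())


# Toy data: five "genomes" placed on a line, distance = absolute difference.
points = [0.0, 0.1, 0.15, 0.9, 1.0]
tree = mst_edges(len(points), lambda i, j: abs(points[i] - points[j]))
clusters = single_linkage_clusters(len(points), tree, threshold=0.2)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

One consequence of this formulation is that the tree only has to be built once: re-clustering at a different threshold just re-cuts the same edges, which is presumably why a pre-generated MST file (the `--premsted` option) can be reused.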

@@ -41,6 +41,7 @@ Options:
--presketched TEXT clustering by the pre-generated sketch files rather than genomes
--premsted TEXT clustering by the pre-generated mst files rather than genomes for clust-mst
--newick-tree output the newick tree format file for clust-mst
+--fast use the kssd algorithm for sketching and distance computing for clust-mst
--append TEXT Excludes: --input
append genome file or file list with the pre-generated sketch or MST files

@@ -103,6 +104,10 @@ Options:
# v.2.2.1 or later
# output the newick tree format for clust-mst, use the --newick-tree flag.
./clust-mst -l -i bacteria.list --newick-tree -o bacteria.mst.clust
+
+# v.2.3.0 or later
+# use the efficient Kssd sketch strategy for clust-mst, use the --fast flag.
+./clust-mst --fast -l -i bacteria.list -o bacteria.fast.mst.clust
```
## Output
The output file is in a CD-HIT output format and is slightly different when running with or without `-l` input option.
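
Reviewer note: the new `--fast` flag switches clust-mst to a Kssd-based sketch for distance computation. As general background on sketch-based distances, the Python sketch below keeps only a fixed fraction of each sequence's k-mers (chosen by hash value) and compares the reduced sets with Jaccard similarity. This is a generic hash-subsampling stand-in and should not be read as the Kssd algorithm or as RabbitTClust's code: the functions `sketch`, `jaccard`, and `distance`, the parameters `k` and `keep_one_in`, and the random test sequences are all assumptions made for illustration.

```python
import hashlib
import random


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq, k=8, keep_one_in=4):
    """Keep the hashes of roughly 1/keep_one_in of the distinct k-mers.

    The choice depends only on the k-mer itself, so two sequences always
    keep or drop a shared k-mer together, which makes sketches comparable.
    """
    kept = set()
    for kmer in kmers(seq, k):
        digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        if h % keep_one_in == 0:
            kept.add(h)
    return kept


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def distance(a, b):
    """Jaccard distance between two sketches, in [0, 1]."""
    return 1.0 - jaccard(a, b)


# Hypothetical inputs: a random sequence and a lightly mutated copy.
rng = random.Random(0)
genome_a = "".join(rng.choice("ACGT") for _ in range(500))
genome_b = "".join("A" if i % 50 == 0 else c for i, c in enumerate(genome_a))

sketch_a = sketch(genome_a)
sketch_b = sketch(genome_b)
print(len(kmers(genome_a, 8)), len(sketch_a), distance(sketch_a, sketch_b))
```

The sketch is much smaller than the full k-mer set, so pairwise comparison touches far less data, at the cost of an estimated rather than exact similarity.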