DNA Sequence Analysis in Rust

Introduction

This Rust project aims to provide a robust and efficient solution for basic DNA sequence analysis, including sequence alignment and mutation detection, building on Rust's performance and safety features.

Features

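  1. Pairwise Sequence Alignment: Uses Smith-Waterman-style dynamic programming for global, semiglobal, and local pairwise sequence alignments. (Strictly, Smith-Waterman denotes the local variant; global alignment is the Needleman-Wunsch formulation of the same recurrence.)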

  2. Partial Order Alignment (POA): Extends traditional pairwise alignment methods to handle multiple sequences simultaneously. It is particularly useful for sequences that exhibit significant variation or many insertions and deletions (indels).

  3. Mutation Detection: Detects mutations in DNA sequences by reporting the substitutions, insertions, and deletions that distinguish a query sequence from a reference.
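
To make the local-alignment case concrete, here is a minimal sketch of the Smith-Waterman score recurrence. It is not this project's implementation: the function name local_alignment_score and the scoring constants (match +2, mismatch -1, linear gap -2) are assumptions chosen for illustration, and only the score is computed, not the alignment itself.

```rust
// Hypothetical scoring parameters, chosen only for this illustration.
const MATCH: i32 = 2;
const MISMATCH: i32 = -1;
const GAP: i32 = -2; // linear gap penalty (no affine gap model here)

/// Best local alignment score between `a` and `b` using the
/// Smith-Waterman recurrence: every cell is clamped at zero so an
/// alignment may start anywhere, and the maximum cell is the answer.
fn local_alignment_score(a: &[u8], b: &[u8]) -> i32 {
    // Two rolling rows suffice when only the score is needed.
    let mut prev = vec![0i32; b.len() + 1];
    let mut curr = vec![0i32; b.len() + 1];
    let mut best = 0;
    for i in 1..=a.len() {
        curr[0] = 0;
        for j in 1..=b.len() {
            let pair = if a[i - 1] == b[j - 1] { MATCH } else { MISMATCH };
            let diag = prev[j - 1] + pair; // align a[i-1] with b[j-1]
            let up = prev[j] + GAP;        // gap in b
            let left = curr[j - 1] + GAP;  // gap in a
            curr[j] = diag.max(up).max(left).max(0);
            best = best.max(curr[j]);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    best
}

fn main() {
    // The shared core "ACGT" gives four matches under the assumed scores.
    let score = local_alignment_score(b"TTACGTAA", b"GGACGTCC");
    println!("local score: {score}");
}
```

Removing the clamp at zero and initialising the first row and column with cumulative gap penalties turns the same recurrence into a global alignment.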

Getting Started

Prerequisites

  1. Rust: Ensure that Rust is installed on your system. If not, you can install it from https://www.rust-lang.org/tools/install.

Installation

  1. Clone the repository:
     git clone https://github.com/rustdelhi/DNASequenceAnalysis.git
  2. Navigate to the project directory:
     cd DNASequenceAnalysis
  3. Build the project:
     cargo build --release

Usage

  1. Run the project with two FASTA files that each contain exactly one sequence (example files are included in the assets folder):
cargo run --release -- --reference ./assets/SARS-beta.fasta --query ./assets/SARS-delta.fasta --print

Example output:

Score:
+---------+------------+--------------+------------+-----------+-------+
| match   | miss_match | substitution | insertions | deletions | total |
+---------+------------+--------------+------------+-----------+-------+
| 29707   | 156        | 85           | 13         | 58        | 30019 |
+---------+------------+--------------+------------+-----------+-------+
time taken: 11.037773246s
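
In the example output, the miss_match count (156) equals substitution + insertions + deletions (85 + 13 + 58). The sketch below shows one way such counts could be tallied from a finished alignment. It assumes the alignment is available as two equal-length gapped strings with '-' marking a gap, and that an insertion is a base present in the query but absent from the reference; the Tally struct and tally function are hypothetical names, and the project's internal representation may differ.

```rust
/// Per-column counts for a pairwise alignment (hypothetical type).
#[derive(Debug, Default, PartialEq)]
struct Tally {
    matches: usize,
    substitutions: usize,
    insertions: usize,
    deletions: usize,
}

/// Walks two gapped, equal-length alignment rows column by column.
/// Assumption: '-' marks a gap; a gap in the reference row counts as an
/// insertion, a gap in the query row counts as a deletion.
fn tally(reference: &str, query: &str) -> Tally {
    let mut t = Tally::default();
    for (r, q) in reference.bytes().zip(query.bytes()) {
        match (r, q) {
            (b'-', b'-') => {} // gap in both rows carries no information
            (b'-', _) => t.insertions += 1,
            (_, b'-') => t.deletions += 1,
            _ if r == q => t.matches += 1,
            _ => t.substitutions += 1,
        }
    }
    t
}

fn main() {
    let t = tally("ACG-TTA", "ACCGT-A");
    let mismatches = t.substitutions + t.insertions + t.deletions;
    println!("{t:?}, mismatches: {mismatches}");
}
```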

Run cargo run --release -- --help to learn more about CLI usage.
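
The usage above expects each FASTA file to hold exactly one sequence. As an illustration of what that constraint means, here is a minimal standard-library sketch that parses FASTA text and rejects input with zero or several records. The function name parse_single_fasta and its error messages are assumptions for this example, not part of the project's API.

```rust
/// Parses FASTA text that must contain exactly one record.
/// Returns (header without '>', concatenated sequence lines).
fn parse_single_fasta(text: &str) -> Result<(String, String), String> {
    let mut header: Option<String> = None;
    let mut sequence = String::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue; // tolerate blank lines
        }
        if let Some(rest) = line.strip_prefix('>') {
            if header.is_some() {
                return Err("expected exactly one sequence, found more".to_string());
            }
            header = Some(rest.to_string());
        } else if header.is_some() {
            sequence.push_str(line); // sequence may span several lines
        } else {
            return Err("sequence data before any '>' header".to_string());
        }
    }
    match header {
        Some(h) if !sequence.is_empty() => Ok((h, sequence)),
        Some(_) => Err("header without sequence data".to_string()),
        None => Err("no FASTA record found".to_string()),
    }
}

fn main() {
    // In practice the text would come from std::fs::read_to_string(path).
    let text = ">demo sequence\nACGT\nTTGA\n";
    match parse_single_fasta(text) {
        Ok((header, seq)) => println!("{header}: {} bases", seq.len()),
        Err(e) => eprintln!("error: {e}"),
    }
}
```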

Citations

  1. Sequence Alignment
  2. Smith-Waterman Algorithm
  3. NCBI