Skip to content

natir/rustyread

Folders and files

NameName
Last commit message
Last commit date

Latest commit

5b55d70 · Oct 7, 2022
Oct 7, 2022
May 3, 2021
May 21, 2021
Apr 23, 2021
Oct 7, 2022
May 4, 2021
Apr 11, 2021
Oct 7, 2022
Mar 19, 2021
Oct 7, 2022
Oct 7, 2022

Repository files navigation

Test Lints MSRV CodeCov Documentation License

Rustyread

Rustyread is a drop in replacement of badread simulate. Rustyread is very heavily inspired by badread, it reuses the same error and quality model file. But Rustyreads is multi-threaded and benefits from other optimizations.

WARNING:

  • Rustyread has not yet been evaluated or even compared to any other long read generators
  • Rustyread is tested only on Linux
  • Rustyread is still in developpement many thing can change or be break

Usage

If previously you called badread like this:

badread simulate --reference {reference path} --quantity {quantity} > {reads}.fastq

you can now replace badread by rustyread:

rustyread simulate --reference {reference path} --quantity {quantity} > {reads}.fastq

But by default rustyread use all avaible core you can control it with option threads:

rustyread --theads {number of thread} simulate --reference {reference path} --quantity {quantity} > {reads}.fastq

If you have badread installed in your python sys.path rustyread can found error and quality model automatically, but you can still use --error_model and --qscore_model option.

Control memory usage

Rustyread memory usage could be estimated with formula: 2 * reference base + 2 * targeted base + epsilon, to limit memory impact of Rustyread you can use parameter number_base_store it's take an absolute value or a relative depth, if this option is set memory usage became 2 * reference base + 2 number_base_store + epsilon.

Full usage

rustyread 0.4.1 Machamp
Pierre Marijon <[email protected]>
A long read simulator based on badread idea and model

USAGE:
    rustyread [OPTIONS] <SUBCOMMAND>

OPTIONS:
    -h, --help                 Print help information
    -t, --threads <THREADS>    Number of thread use by rustyread, 0 use all avaible core, default
                               value 0
    -v, --verbosity            Verbosity level also control by environment variable RUSTYREAD_LOG if
                               flag is set RUSTYREAD_LOG value is ignored
    -V, --version              Print version information

SUBCOMMANDS:
    help        Print this message or the help of the given subcommand(s)
    simulate    Generate fake long read
rustyread-simulate
Generate fake long read

USAGE:
    rustyread simulate [FLAGS] [OPTIONS] --reference <reference-path> --quantity <quantity>

FLAGS:
    -h, --help                  Prints help information
        --small_plasmid_bias    If set, then small circular plasmids are lost when the fragment
                                length is too high (default: small plasmids are included regardless
                                of fragment length)
    -V, --version               Prints version information

OPTIONS:
        --chimera <chimera>
            Percentage at which separate fragments join together [default: 1]

        --end_adapter <end-adapter>
            Adapter parameters for read ends (rate and amount) [default: 50,20]

        --end_adapter_seq <end-adapter-seq>
            Adapter parameters for read ends [default: GCAATACGTAACTGAACGAAGT]

        --error_model <error-model>
            Path to an error model file [default: nanopore2020]

        --glitches <glitches>
            Read glitch parameters (rate, size and skip) [default: 10000,25,25]

        --identity <identity>
            Sequencing identity distribution (mean, max and stdev) [default: 85,95,5]

        --junk_reads <junk>
            This percentage of reads wil be low complexity junk [default: 1]

        --length <length>
            Fragment length distribution (mean and stdev) [default: 15000,13000]

        --number_base_store <nb-base-store>
            Number of base, rustyread can store in ram before write in output in absolute value
            (e.g. 250M) or a relative depth (e.g. 25x)

        --output <output-path>                     Where read is write
        --qscore_model <qscore-model>
            Path to an quality score model file [default: nanopore2020]

        --quantity <quantity>
            Either an absolute value (e.g. 250M) or a relative depth (e.g. 25x)

        --random_reads <random>
            This percentage of reads wil be random sequence [default: 1]

        --reference <reference-path>               Reference fasta (can be gzipped, bzip2ped, xzped)
        --seed <seed>
            Random number generator seed for deterministic output (default: different output each
            time)

        --start_adapter <start-adapter>
            Adapter parameters for read starts (rate and amount) [default: 90,60]

        --start_adapter_seq <start-adapter-seq>
            Adapter parameters for read starts [default: AATGTACTTCGTTCAGTTACGTATTGCT]

Installation

Bioconda

If you haven't bioconda setup follow this instruction

conda|mamba install rustyread

With rust environment

If you haven't a rust environment you can use rustup or your package manager.

With cargo

cargo install --git https://github.com/natir/rustyread.git --tag 0.4.1

From source

git clone https://github.com/natir/rustyread.git
cd rustyread
git checkout 0.4.1
cargo install --path .

Minimum supported Rust version

Currently the minimum supported Rust version is 1.57.0.

Difference with badread

  • option small_plasmid_bias is silently ignored but small plasmid is 'sequence'