Update gtgseq dataset publications #2332

Open: 1 commit to be merged into base: main
5 changes: 4 additions & 1 deletion datasets/gtgseq.yaml
@@ -1,5 +1,5 @@
Name: Garvan Institute Long Read Sequencing Benchmark Data
-Description: "The dataset contains reference samples that will be useful for benchmarking and comparing bioinformatics tools for genome analysis. Currently, there are two samples, which are NA12878 (HG001) and NA24385 (HG002), sequenced on an Oxford Nanopore Technologies (ONT) PromethION using the latest R10.4.1 flowcells. Raw signal data output by the sequencer is provided for these datasets in BLOW5 format, and can be rebasecalled when basecalling software updates bring accuracy and feature improvements over the years. Raw signal data is not only for rebasecalling, but also can be used for emerging bioinformatics tools that directly analyse raw signal data. We also provide the basecalled data alongside the raw signal data and will continue to provide updated basecalls when there is a major update to the basecalling software. In the future, we plan to extend this open dataset with additional samples, including sequencing runs from vendors other than ONT."
+Description: "The dataset contains reference samples that will be useful for benchmarking and comparing bioinformatics tools for genome analysis. Examples include: NA12878 (HG001) and NA24385 (HG002) sequenced on an Oxford Nanopore Technologies (ONT) PromethION using the latest R10.4.1 flowcells; and, UHR RNA (direct-RNA) on an ONT PromethION using the latest RNA004 flowcells. Raw signal data output by the sequencer is provided for these datasets in BLOW5 format, and can be rebasecalled when basecalling software updates bring accuracy and feature improvements over the years. Raw signal data is not only for rebasecalling, but also can be used for emerging bioinformatics tools that directly analyse raw signal data. We also provide the basecalled data alongside the raw signal data."
Documentation: https://github.com/GenTechGp/gtgseq
Contact: "[gtgseq team](https://github.com/GenTechGp/gtgseq/issues)"
ManagedBy: "Genomic Technologies Group, Garvan Institute of Medical Research (https://www.garvan.org.au/research/labs-groups/genomic-technologies-lab)"
@@ -44,6 +44,9 @@ DataAtWork:
- Title: Fast nanopore sequencing data analysis with SLOW5.
  URL: https://doi.org/10.1038/s41587-021-01147-4
  AuthorName: Gamaarachchi, H., Samarakoon, H., Jenner, S.P. et al.
- Title: Streamlining remote nanopore data access with slow5curl
  URL: https://doi.org/10.1093/gigascience/giae016
  AuthorName: Wong, B., Ferguson, J.M., Do, J.Y. et al.
- Title: Flexible and efficient handling of nanopore sequencing signal data with slow5tools.
  URL: https://doi.org/10.1186/s13059-023-02910-3
  AuthorName: Samarakoon, H., Ferguson, J.M., Jenner, S.P. et al.