Commit

Merge branch 'main' into cjk-assembly-fetch
Michal-Babins authored Feb 4, 2025
2 parents 9036950 + 5a7c9ab commit d138b19
Showing 80 changed files with 1,818 additions and 1,048 deletions.
5 changes: 5 additions & 0 deletions .dockstore.yml
@@ -291,5 +291,10 @@ workflows:
  - name: Concatenate_Illumina_Lanes_PHB
    subclass: WDL
    primaryDescriptorPath: /workflows/utilities/file_handling/wf_concatenate_illumina_lanes.wdl
    testParameterFiles:
      - /tests/inputs/empty.json
  - name: Clair3_Variants_ONT_PHB
    subclass: WDL
    primaryDescriptorPath: /workflows/standalone_modules/wf_clair3_variants_ont.wdl
    testParameterFiles:
      - /tests/inputs/empty.json
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -46,7 +46,7 @@ This PR uses an element that could cause duplicate runs to have different result
- [ ] The CI/CD has been adjusted and tests are passing (Theiagen developers)
- [ ] Code changes follow the [style guide](https://theiagen.notion.site/Style-Guide-WDL-Workflow-Development-51b66a47dde54c798f35d673fff80249)
- [ ] Documentation and/or workflow diagrams have been updated if applicable
- [ ] You have updated the latest version for any affected worklows in the respective workflow documentation page and for every entry in the three `workflows_overview` tables.
- [ ] You have updated the "Last Known Changes" field for any affected workflows in the respective workflow documentation page and for every entry in the three `workflows_overview` tables to be the tag for the next upcoming release. If you do not know the tag, please put "vX.X.X"

## 🎯 Reviewer Checklist
<!-- Indicate NA when not applicable -->
5 changes: 4 additions & 1 deletion README.md
@@ -34,7 +34,7 @@ If you would like to provide feedback, please raise a [GitHub issue](https://git

### Contributing to the PHB workflows

We warmly welcome contributions to this repository! Our style guide may be found [here](https://theiagen.notion.site/Style-Guide-WDL-Workflow-Development-51b66a47dde54c798f35d673fff80249) for convenience of formatting.
We warmly welcome contributions to this repository! Our code style guide may be found [here](https://theiagen.github.io/public_health_bioinformatics/latest/contributing/code_contribution/) for convenience of formatting, and our documentation style guide may be found [here](https://theiagen.github.io/public_health_bioinformatics/latest/contributing/doc_contribution/).

You can expect a careful review of every PR and feedback as needed before merging, just like we do for PRs submitted by the Theiagen team. Our PR template can help prepare you for the review process. As always, reach out with any questions! We love receiving feedback and contributions from the community. When your PR is merged, we'll add your name to the contributors list below!

@@ -55,6 +55,9 @@ You can expect a careful review of every PR and feedback as needed before mergin
* **Michal Babinski** ([@Michal-Babins](https://github.com/Michal-Babins)) - Software, Validation
* **Andrew Lang** ([@AndrewLangVt](https://github.com/AndrewLangVt)) - Software, Supervision
* **Kelsey Kropp** ([@kelseykropp](https://github.com/kelseykropp)) - Validation
* **Theron James** ([@MrTheronJ](https://github.com/MrTheronJ)) - Software, Validation
* **Andrew Hale** ([@awh082834](https://github.com/awh082834)) - Software, Validation
* **Zachary Konkel** ([@xonq](https://github.com/xonq)) - Software, Validation
* **Joel Sevinsky** ([@sevinsky](https://github.com/sevinsky)) - Conceptualization, Project Administration, Supervision

### External Contributors
Binary file removed docs/assets/figures/Freyja_FASTQ.png
Binary file not shown.
Binary file added docs/assets/figures/Freyja_Suite.png
Binary file modified docs/assets/figures/Freyja_figure2.png
Binary file modified docs/assets/figures/Freyja_figure3.png
Binary file added docs/assets/figures/Freyja_figure4.png
Binary file modified docs/assets/figures/TheiaMeta_Illumina_PE.png
Binary file modified docs/assets/figures/example_krona_report.png
6 changes: 6 additions & 0 deletions docs/assets/files/TheiaCoV_Illumina_PE_qc_check_template.tsv
@@ -0,0 +1,6 @@
taxon num_reads_raw1 num_reads_raw2 num_reads_clean1 num_reads_clean2 kraken_human kraken_human_dehosted meanbaseq_trim assembly_mean_coverage number_N number_Degenerate assembly_length_unambiguous_min assembly_length_unambiguous_max percent_reference_coverage vadr_num_alerts
sars-cov-2 100000 100000 100000 100000 20 20 30 100 5000 1 25000 30000 83 0
HIV 100000 100000 100000 100000 20 20 30 100
WNV 100000 100000 100000 100000 20 20 30 100
MPXV 100000 100000 100000 100000 20 20 30 100
flu 100000 100000 100000 100000 20 20 30 100
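The template above can be read as a per-taxon table of QC thresholds. The sketch below is a minimal illustration in Python, not the workflow's actual parser: it assumes the file is tab-delimited, that values line up with the header columns from left to right, and that a blank cell means no threshold is set for that taxon. The function name `load_thresholds` and the reduced set of columns are hypothetical and chosen only to keep the example short.

```python
import csv
import io

# Assumption: tab-delimited template; a blank cell means "no threshold set".
# Only a subset of the columns and two of the rows are reproduced here.
TEMPLATE = (
    "taxon\tnum_reads_raw1\tassembly_mean_coverage\tnumber_N\tvadr_num_alerts\n"
    "sars-cov-2\t100000\t100\t5000\t0\n"
    "HIV\t100000\t100\t\t\n"
)


def load_thresholds(text):
    """Return {taxon: {metric: threshold}}, skipping blank cells (hypothetical helper)."""
    thresholds = {}
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        taxon = row.pop("taxon")
        # Keep only the metrics that actually have a value for this taxon.
        thresholds[taxon] = {
            metric: float(value)
            for metric, value in row.items()
            if value not in (None, "")
        }
    return thresholds


thresholds = load_thresholds(TEMPLATE)
print(thresholds["HIV"])
```

Under these assumptions, the rows with fewer values (HIV, WNV, MPXV, flu) simply carry no threshold for the assembly-length, reference-coverage, and VADR columns, so those checks would be skipped rather than failed.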
Binary file not shown.
3 changes: 3 additions & 0 deletions docs/contributing/code_contribution.md
@@ -8,6 +8,9 @@ Style guide inspired by Scott Frazer’s [WDL Best Practices Style Guide](http

## General Guidelines

!!! tip "Please ensure your code adheres to our philosophy of failures"
At Theiagen, we believe our workflows should only fail because of technical issues, not because of poor quality data. Our goal is to create workflows that can handle data in any condition and still provide meaningful results, especially if that data isn’t perfect. For more information, see our [Workflow Failure Philosophy](../getting_started/philosophy.md).

***Modularity and Metadata***

- **Best Practice:** Place tasks and workflows in separate files to maintain modularity and clarity.
35 changes: 35 additions & 0 deletions docs/getting_started/philosophy.md
@@ -0,0 +1,35 @@
# Workflow Failure Philosophy

## Our Approach to Workflow Failures

At Theiagen, **we believe our workflows should only fail because of technical issues, not because of poor quality data**. Our goal is to create workflows that can handle data in any condition and still provide meaningful results, _especially_ if that data isn’t perfect.

### What You Can Expect

- **Your workflow will keep running, even with imperfect data**

_**Data quality shouldn't cause failures**_. Poor or incomplete data should never stop your workflow. Instead, the workflow will process it and (hopefully!) provide meaningful outputs, which can include blank results or messages indicating the value could not be generated.

- **Your workflow will provide useful feedback, not errors**

If an issue arises in your data, such as missing or invalid data in a template control, you can be confident that any _**workflow failures are due to underlying programmatic issues**_, not your data.

- **You’ll gain a better understanding of your data**

Since poor-quality data will not cause workflow failures, the _**relevant QC results will be available as output**_, so you can understand what's happened and make any needed adjustments moving forward.
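
The expectations above can be pictured with a small sketch. This is only an illustration of the general pattern in Python, not code from the PHB workflows: the function name `mean_base_quality` and its messages are hypothetical. The point is that poor or missing data produces a blank result plus an explanatory message instead of an exception.

```python
def mean_base_quality(quality_scores):
    """Return (value, message) without raising on poor data (hypothetical helper)."""
    if not quality_scores:
        # Poor data: report a blank result and a message rather than failing.
        return None, "no reads available; mean quality could not be calculated"
    return sum(quality_scores) / len(quality_scores), "PASS"


good_result = mean_base_quality([30, 32, 34])
empty_result = mean_base_quality([])
print(good_result)
print(empty_result)
```

A caller can then write both the value and the message to the outputs, so the QC information is still available for a sample that yielded nothing usable.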

### Ongoing Improvements

While we’ve made a lot of progress, we’re still working on fully implementing this philosophy across all of our workflows. If you encounter an issue where poor data quality leads to a failure, please let us know. Your feedback helps us make continuous improvements.

---

**Thanks for being part of the process!** We’re always working to improve and your feedback plays a huge role in making that happen. Together, we’ll keep making things run smoother and easier for everyone.

If you experience a workflow failure related to data quality, we want to hear from you! Please reach out to us at <[email protected]> with the following details:

- The type of data involved
- The error messages or failures encountered
- The steps that led to the issue
- if this error was generated on the command-line, please include the full command used
- if this error was generated when running the workflow with Terra.bio, please provide a link to the specific workflow's job history page
5 changes: 4 additions & 1 deletion docs/index.md
@@ -50,7 +50,7 @@ We continuously work to improve our codebase and usability of our workflows by t

## Contributing to the PHB Repository

We warmly welcome contributions to this repository! Our style guide may be found [here](contributing/code_contribution.md) for convenience of formatting.
We warmly welcome contributions to this repository! Our code style guide may be found [here](contributing/code_contribution.md) for convenience of formatting and our documentation style guide may be found [here](contributing/doc_contribution.md).

If you would like to submit suggested code changes to our workflows, you may add or modify the WDL files and submit pull requests to the [PHB GitHub](https://github.com/theiagen/public_health_bioinformatics) repository.

@@ -73,6 +73,9 @@ You can expect a careful review of every PR and feedback as needed before mergin
- **Michal Babinski** ([@Michal-Babins](https://github.com/Michal-Babins)) - Software, Validation
- **Andrew Lang** ([@AndrewLangVt](https://github.com/AndrewLangVt)) - Software, Supervision
- **Kelsey Kropp** ([@kelseykropp](https://github.com/kelseykropp)) - Validation
- **Theron James** ([@MrTheronJ](https://github.com/MrTheronJ)) - Software, Validation
- **Andrew Hale** ([@awh082834](https://github.com/awh082834)) - Software, Validation
- **Zachary Konkel** ([@xonq](https://github.com/xonq)) - Software, Validation
- **Joel Sevinsky** ([@sevinsky](https://github.com/sevinsky)) - Conceptualization, Project Administration, Supervision

### External Contributors
2 changes: 2 additions & 0 deletions docs/stylesheets/extra.css
@@ -173,6 +173,7 @@
table {
overflow-y: scroll;
max-height: 500px;
max-width: 100vw;
display: block;
}
th {
@@ -183,6 +184,7 @@ th {
}
td {
word-break: break-all;
overflow-wrap: anywhere;
}
/* Base styles for the search box */
div.searchable-table input.table-search-input {
4 changes: 4 additions & 0 deletions docs/workflows/data_import/sra_fetch.md
@@ -26,6 +26,8 @@ This workflow runs on the sample level.
| fetch_sra_to_fastq | **docker_image** | String | The Docker container to use for the task | "us-docker.pkg.dev/general-theiagen/biocontainers/fastq-dl:2.0.4--pyhdfd78af_0" | Optional |
| fetch_sra_to_fastq | **fastq_dl_options** | String | Additional parameters to pass to fastq_dl from [here](https://github.com/rpetit3/fastq-dl?tab=readme-ov-file#usage) | "--provider sra" | Optional |
| fetch_sra_to_fastq | **memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
| version_capture | **docker** | String | The Docker container to use for the task | "us-docker.pkg.dev/general-theiagen/theiagen/alpine-plus-bash:3.20.0" | Optional |
| version_capture | **timezone** | String | Set the time zone to get an accurate date of analysis (uses UTC by default) | | Optional |

</div>

@@ -49,6 +51,8 @@ Given the lack of usefulness of SRA Lite formatted FASTQ files, we try to avoid

| **Variable** | **Type** | **Description** | **Production Status** |
|---|---|---|---|
| sra_fetch_version | String | The version of the repository SRA_Fetch is run in | Always produced |
| sra_fetch_analysis_date | String | Date of SRA_Fetch download | Always produced |
| read1 | File | File containing the forward reads | Always produced |
| read2 | File | File containing the reverse reads (not available for single-end or ONT data) | Produced only for paired-end data |
| fastq_dl_date | String | The date of download | Always produced |