[Merlin_magic; Sonneityping] harden workflow against failures due to sonnei typing disagreement #747
This PR closes #600
🗑️ This dev branch should be deleted after merging to main.
🧠 Summary
Samples that returned percent_coverage < 90 did not produce the indices required by the parser parse_mykrobe_predict.py housed within the mykrobe container. This led to mislabeled or dubiously characterized samples failing at task_sonneityping. Commit 7d18a7c of sonneityping, committed by Sage, solves this issue along with a typo correction. This PR incorporates that commit into a new docker container, us-docker.pkg.dev/general-theiagen/staphb/mykrobe:0.12.1-parser-7d18a7c, along with error handling for the possibility of missing output from mykrobe. I do not foresee this occurring with the new fix to the parser in place, as it will report "not sonnei", but I have kept the error handling present.
⚡ Impacted Workflows/Tasks
task_sonneityping.wdl
wf_merlin_magic.wdl
TheiaProk_PE
TheiaProk_SE
TheiaProk_ONT
This PR may lead to different results in pre-existing outputs: No
Samples already classified as sonnei by task_sonneityping will not change; only samples that previously failed will now return a classification of "not sonnei".
This PR uses an element that could cause duplicate runs to have different results: No
🛠️ Changes
Docker image changed to updated parser version.
Error Handling:
A check was added for sonneityping_predictResults.tsv, with debug messages indicating whether the file was found.
If the file is not found, the necessary output text files are returned empty, per the Slack discussion, and the Python code block is skipped.
File outputs have been made optional to also handle the event of a missing output.
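The error handling above can be sketched roughly as follows. This is a minimal illustration in Python, not the code in task_sonneityping.wdl: the function name `handle_sonneityping_results` and the output file names in `OUTPUT_FILES` are hypothetical placeholders, and the actual task may perform this check in its shell command instead.

```python
import sys
from pathlib import Path

# Hypothetical output file names, for illustration only; the real task may differ.
OUTPUT_FILES = ("SPECIES.txt", "FINAL_GENOTYPE.txt")


def handle_sonneityping_results(workdir, results_name="sonneityping_predictResults.tsv"):
    """Return True if the results file exists; otherwise write empty outputs.

    A True result means the downstream parsing step can run; a False
    result means it should be skipped.
    """
    workdir = Path(workdir)
    results = workdir / results_name
    if results.is_file():
        # Debug message hinting that the file was found.
        print(f"DEBUG: found {results_name}; parsing results", file=sys.stderr)
        return True
    # Debug message hinting that the file is missing.
    print(f"DEBUG: {results_name} not found; writing empty outputs", file=sys.stderr)
    for name in OUTPUT_FILES:
        # Empty placeholders keep the expected output files present.
        (workdir / name).write_text("")
    return False
```

A caller would run the parsing block only when the function returns `True`, so a missing results file yields empty outputs rather than a failed task.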
⚙️ Algorithm
Docker image changed to pull newest commit of parse_mykrobe_predict.py
➡️ Inputs
us-docker.pkg.dev/general-theiagen/staphb/mykrobe:0.12.1 -> us-docker.pkg.dev/general-theiagen/staphb/mykrobe:0.12.1-parser-7d18a7c
⬅️ Outputs
File outputs are now optional outputs.
🧪 Testing
Each of the affected workflows was tested using both sonnei and non-sonnei sequences.
TheiaProk_PE
TheiaProk_SE
TheiaProk_ONT
TheiaProk_PE; Group of samples cited as issues in #600
Instead of failing, samples display "not sonnei" in sonneityper output.
Suggested Scenarios for Reviewer to Test
🔬 Final Developer Checklist
workflows_overview
tables to be the tag for the next upcoming release. If you do not know the tag, please put "vX.X.X"
🎯 Reviewer Checklist