Add new scripts to ASCC for 0.5.0 #56

Closed
DLBPointon opened this issue Aug 7, 2024 · 0 comments · Fixed by #48
Labels
enhancement New feature or request
Milestone

Comments

@DLBPointon
Contributor

Description of feature

From Eerik
There are two scripts in the old non-nf-core ASCC repository that seem to be missing from the sanger-tol/ascc version:
https://github.com/sanger-tol/cobiontcheck/blob/compleasm2_fcs_test/filter_fasta_by_length.py
This optionally filters the input assembly to remove sequences longer than a given length. It runs before anything else is done with the assembly. The purpose is to stop the pipeline choking on huge FASTA sequences (it was Jo Wood's idea to simply leave huge sequences out of runs). James Torrance does his runs with sequences longer than 1.9 Gb left out. FCS-GX is hardcoded not to work with sequences longer than 1.9 Gb anyway.
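For illustration only, the length filter described above could look roughly like the sketch below. This is not the code from the linked `filter_fasta_by_length.py`; the function names `read_fasta` and `filter_fasta_by_length`, the `max_length` and `line_width` parameters, and the choice to return the names of removed sequences are all assumptions made for the example. It also holds one full record in memory at a time, which may not be acceptable for multi-gigabase sequences.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA handle (hypothetical helper)."""
    header = None
    chunks: List[str] = []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta_by_length(
    in_handle: TextIO, out_handle: TextIO, max_length: int, line_width: int = 80
) -> List[str]:
    """Copy records with at most max_length bases; return names of removed records."""
    removed = []
    for header, seq in read_fasta(in_handle):
        if len(seq) > max_length:
            # Record the sequence name (first word of the header) and skip it.
            fields = header.split()
            removed.append(fields[0] if fields else header)
            continue
        out_handle.write(f">{header}\n")
        # Re-wrap the sequence at a fixed line width (an assumption of this sketch).
        for i in range(0, len(seq), line_width):
            out_handle.write(seq[i:i + line_width] + "\n")
    return removed
```

In a real run the handles would be opened files and `max_length` would be set to something like 1,900,000,000 to match the 1.9 Gb limit mentioned above.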
https://github.com/sanger-tol/cobiontcheck/blob/compleasm2_fcs_test/find_taxid_in_taxdump.py
This checks whether the taxID given by the user exists in the NCBI taxdump file. This script is also run at the start of the pipeline run. The taxID may be missing from the taxdump either because the taxdump is out of date or because the user has provided a faulty taxID number. The check at the start of the run is there to catch the error early.
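A minimal sketch of such a check, assuming the lookup is done against `nodes.dmp` from the NCBI taxdump, where each line starts with the tax_id followed by a `|` delimiter. The function name `taxid_in_nodes_dmp` is hypothetical, and the linked `find_taxid_in_taxdump.py` may well consult a different file or handle merged taxIDs differently.

```python
import io
from typing import TextIO


def taxid_in_nodes_dmp(handle: TextIO, taxid: int) -> bool:
    """Return True if taxid appears as the first field of any nodes.dmp line."""
    target = str(taxid)
    for line in handle:
        # Fields are separated by "|" with surrounding tabs; the first is tax_id.
        first_field = line.split("|", 1)[0].strip()
        if first_field == target:
            return True
    return False
```

A pipeline step built on this could exit with a non-zero status and a clear message when the function returns `False`, so that a stale taxdump or a mistyped taxID is reported before any expensive step starts.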
So I think these two scripts should be included in the sanger-tol/ascc pipeline. I can try to make Nextflow modules of them myself if you're working on other things.

These should be relatively easy additions

@DLBPointon DLBPointon added the enhancement New feature or request label Aug 7, 2024
@DLBPointon DLBPointon added this to the Release 1 milestone Aug 7, 2024
DLBPointon added a commit that referenced this issue Aug 8, 2024
DLBPointon added a commit that referenced this issue Aug 8, 2024
@DLBPointon DLBPointon linked a pull request Aug 9, 2024 that will close this issue