consensus_qc task not compatible with lowercase seqs in FASTAs #149

Open
kapsakcj opened this issue May 13, 2022 · 0 comments

kapsakcj commented May 13, 2022

When you have a nucleotide FASTA that is all lowercase:

>fasta-header
atacgagcgcgagcgcgcacnnnacncnancgcgattg.......

The task and workflow (TheiaCov_FASTA and others) complete successfully, but every output metric is 0: number_N, number_ATCG, number_Degenerate, number_Total, and percent_reference_coverage. The grep patterns in the task match uppercase letters only, so lowercase bases are never counted.
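
For anyone who wants to see the zero counts outside the workflow, here is a minimal sketch. It assumes a hypothetical file named lowercase.fasta holding a shortened stand-in sequence (not real data) and runs the task's N-counting pattern on it:

```shell
# Hypothetical test file; the sequence is a shortened stand-in, not real data
printf '>fasta-header\natacgagcnnnacn\n' > lowercase.fasta

# Same pattern the task uses for N: it matches uppercase 'N' only,
# so the four lowercase n's in the file are not counted
num_N=$( grep -v ">" lowercase.fasta | grep -o 'N' | wc -l )
if [ -z "$num_N" ] ; then num_N="0" ; fi
echo "case-sensitive N count: $num_N"
```

The same happens for the ACTG, degenerate, and total counts, since their patterns are uppercase-only as well.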

Adding the case-insensitive flag (grep -i [...]) should resolve this. Alternatively, or in addition, the patterns could list the lowercase letters explicitly, as in grep -E "C|c|A|a|T|t|G|g".

command <<<
# capture date and version
date | tee DATE
num_N=$( grep -v ">" ~{assembly_fasta} | grep -o 'N' | wc -l )
if [ -z "$num_N" ] ; then num_N="0" ; fi
echo $num_N | tee NUM_N
num_ACTG=$( grep -v ">" ~{assembly_fasta} | grep -o -E "C|A|T|G" | wc -l )
if [ -z "$num_ACTG" ] ; then num_ACTG="0" ; fi
echo $num_ACTG | tee NUM_ACTG
# calculate percent coverage (Wu Han-1 genome length: 29903bp)
python3 -c "print ( round( ($num_ACTG / 29903 ) * 100, 2 ) )" | tee PERCENT_REF_COVERAGE
num_degenerate=$( grep -v ">" ~{assembly_fasta} | grep -o -E "B|D|E|F|H|I|J|K|L|M|O|P|Q|R|S|U|V|W|X|Y|Z" | wc -l )
if [ -z "$num_degenerate" ] ; then num_degenerate="0" ; fi
echo $num_degenerate | tee NUM_DEGENERATE
num_total=$( grep -v ">" ~{assembly_fasta} | grep -o -E '[A-Z]' | wc -l )
if [ -z "$num_total" ] ; then num_total="0" ; fi
echo $num_total | tee NUM_TOTAL
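
The -i suggestion could look roughly like the sketch below. This is not a tested patch to the task: the file name mixed_case.fasta and the shell variable assembly_fasta are hypothetical stand-ins for the WDL input ~{assembly_fasta}, and the sequence is made up for illustration.

```shell
# Hypothetical stand-in for ~{assembly_fasta}; mixed-case sequence for illustration
assembly_fasta="mixed_case.fasta"
printf '>fasta-header\natacgNNnacgRy\nACGT\n' > "$assembly_fasta"

# -i makes each pattern match both uppercase and lowercase letters
num_N=$( grep -v ">" "$assembly_fasta" | grep -o -i 'N' | wc -l )
num_ACTG=$( grep -v ">" "$assembly_fasta" | grep -o -i -E "C|A|T|G" | wc -l )
num_degenerate=$( grep -v ">" "$assembly_fasta" | grep -o -i -E "B|D|E|F|H|I|J|K|L|M|O|P|Q|R|S|U|V|W|X|Y|Z" | wc -l )
num_total=$( grep -v ">" "$assembly_fasta" | grep -o -i -E '[A-Z]' | wc -l )

echo "N=$num_N ACTG=$num_ACTG degenerate=$num_degenerate total=$num_total"
```

With this made-up input the counts should come out as 3 N, 12 ACTG, 2 degenerate, and 17 total, regardless of case. The rest of the command block (the tee calls and the percent coverage calculation) would stay as it is.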
