No SSU rRNA sequences found in trusted contigs by Barrnap #192

Open
AbigailJTH opened this issue Jul 1, 2024 · 5 comments

@AbigailJTH

AbigailJTH commented Jul 1, 2024

Hi, I am trying to assemble chloroplast 16S rRNA from endolithic green algae in a coral metatranscriptome. I have reference sequences for multiple strains, and I built a custom database from them following section 4.3, "Set up a custom database with your own sequences", of the instructions at https://hrgv.github.io/phyloFlash/install.html

However, the run failed with the following errors:
[09:20:27] Extracting SSU rRNA from trusted contigs
/data3/Meta_Os_rRNA/SILVA_OsDB.fasta...
[09:20:27] running subcommand:
/root/miniconda3/envs/phyloflash/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV
--evalue 1e-100 --reject 0.6 --kingdom bac --gene ssu --threads
20 /data3/Meta_Os_rRNA/SILVA_OsDB.fasta
>G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.bac.gff
2>G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.barrnap.log
[09:20:27] running subcommand:
/root/miniconda3/envs/phyloflash/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV
--evalue 1e-100 --reject 0.6 --kingdom arch --gene ssu
--threads 20 /data3/Meta_Os_rRNA/SILVA_OsDB.fasta
>G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.arch.gff
2>G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.barrnap.log
[09:20:28] running subcommand:
/root/miniconda3/envs/phyloflash/lib/phyloFlash/barrnap-HGV/bin/barrnap_HGV
--evalue 1e-100 --reject 0.6 --kingdom euk --gene ssu --threads
20 /data3/Meta_Os_rRNA/SILVA_OsDB.fasta
>G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.euk.gff
2>G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.barrnap.log
[09:20:29] no SSU rRNA sequences found in trusted contigs by Barrnap
[09:20:29] mapping extracted SSU reads back on trusted SSU sequences
[09:20:29] running subcommand:
/root/miniconda3/envs/phyloflash/bin/bbmap.sh fast=t
minidentity=0.98 -Xmx20g threads=20 po=f outputunmapped=t
ref=G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.all.fasta
nodisk
in=G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.G5-176-C2-T0-OsA-LFK11691_L2_paired.rrna.1.fq.SSU.1.fq
out=G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.bbmap.sam
noheader=t overwrite=t
in2=G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.G5-176-C2-T0-OsA-LFK11691_L2_paired.rrna.1.fq.SSU.2.fq
pairlen=1200
outu=G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.bbmap.outu.fwd.fastq
outu2=G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.bbmap.outu.rev.fastq
2>G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.bbmap.out
[09:20:29] FATAL: Tool execution failed!.
Error was '' and return code '256'
Check error log file
G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.bbmap.out
Aborting.
[09:20:29] Saving log to file phyloFlash_log_on_error
Processing complete for folder: G5-176-C2-T0-OsA-LFK11691_L2_paired

My database sequences are quite short, about 250 bp. Is that the reason for the failure? Could you please give me some guidance?
Thanks a lot!
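
To see what Barrnap actually reported before the bbmap step failed, one option is to count the feature lines in the three GFF files named in the log above. This is only a minimal sketch: `count_gff_hits` is a hypothetical helper, not part of phyloFlash, and it assumes the GFF files are still in the working directory.

```shell
# Hypothetical helper (not part of phyloFlash): report how many feature
# lines each GFF file contains. Lines starting with '#' are GFF comments
# or directives, so they are not counted, and neither are blank lines.
count_gff_hits() {
    local f
    for f in "$@"; do
        if [ -f "$f" ]; then
            printf '%s\t%s\n' "$f" "$(grep -v -c -e '^#' -e '^[[:space:]]*$' "$f")"
        else
            printf '%s\tmissing\n' "$f"
        fi
    done
}
```

For the run above it could be called as `count_gff_hits G5-176-C2-T0-OsA-LFK11691_L2_paired_almost_everything.trusted.*.gff`. A count of 0 for all three files would match the "no SSU rRNA sequences found" message in the log.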

@kbseah
Contributor

kbseah commented Jul 1, 2024

Hello, thanks for your report. It looks like you used the -trusted option, but that does not work with a custom database, because the trusted contigs are screened with the default SSU rRNA models.

Could you please supply the full command line you used?

@AbigailJTH
Author

Hi, thanks so much for your reply!

I was using a .sh script because I have a lot of samples.

#!/bin/bash
# Note: no spaces around "=" in a bash assignment. This variable is not used further below.
output_dir="/data3/Meta_Os_rRNA/output_phyloflash"
for file in *rrna.1.fq.gz; do
    echo "Processing sample: ${file}"

    # Derive the sample name by stripping the read-1 suffix
    sample_name=$(basename "${file}" .rrna.1.fq.gz)
    # Set input file names
    echo "Sample is: ${sample_name}"
    input_r1="${sample_name}.rrna.1.fq.gz"
    input_r2="${sample_name}.rrna.2.fq.gz"
    log="${sample_name}.log"
    # Run phyloFlash on the paired reads for the current sample
    phyloFlash.pl -lib "${sample_name}_almost_everything" -read1 "${input_r1}" -read2 "${input_r2}" -almosteverything -CPUs 20 -readlength 145 -dbhome /data/software/SILVA_db/138.1/ -log -zip -taxlevel 10 -readlimit 1000000 -trusted /data3/Meta_Os_rRNA/SILVA_OsDB_v2.fasta

    echo "Processing complete for folder: ${sample_name}"
done

I am not sure if this matters but my 16S rRNA sequences are quite short.

@kbseah
Contributor

kbseah commented Jul 1, 2024

Thanks for the details. Could you try running phyloFlash without the -trusted option?

The idea behind -trusted is to let users supply full-length SSU rRNA sequences for organisms that are known to be in the libraries, so that reads matching them can be mapped out before the remaining reads are assembled. This improves the assembly of lower-abundance SSU rRNA sequences in some cases. If you don't have full-length sequences, the option would not be useful here.

Hope that this helps
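
A quick way to check whether a FASTA file contains full-length sequences before passing it to -trusted is to list the length of every record. The sketch below is illustrative only: `fasta_lengths` and `count_short` are hypothetical helper names, and the 1200 bp default cutoff is an assumed rule of thumb, not a phyloFlash setting.

```shell
# Hypothetical helpers (not part of phyloFlash).

# Print "<sequence id><TAB><length>" for every record in a FASTA file.
# Sequences wrapped over several lines are summed correctly.
fasta_lengths() {
    awk '
        /^>/ {
            if (id != "") print id "\t" len
            id = substr($1, 2); len = 0
            next
        }
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END { if (id != "") print id "\t" len }
    ' "$1"
}

# Count records shorter than a cutoff.
# The default of 1200 bp is an assumption; adjust it as needed.
count_short() {
    local fasta="$1" min_len="${2:-1200}"
    fasta_lengths "$fasta" |
        awk -F'\t' -v min="$min_len" '$2 < min { n++ } END { print n + 0 }'
}
```

For example, `count_short /data3/Meta_Os_rRNA/SILVA_OsDB_v2.fasta` would print the number of records below the cutoff. With sequences of about 250 bp, as described in this thread, every record would be counted as short.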

@AbigailJTH
Author

Thanks for your suggestions. I tried it without the -trusted option and it worked.

My aim was to assemble some SSU rRNA sequences that are not in the SILVA database, but unfortunately we don't have full-length versions of them.

Thanks a lot!
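
For readers who land here with the same error, the invocation without -trusted can be sketched as below. The flags and values are copied from the script posted earlier in this thread, with only -trusted removed. `phyloflash_cmd` is a hypothetical wrapper that prints the command rather than running it, and it assumes file names contain no whitespace.

```shell
# Hypothetical wrapper (not part of phyloFlash): print the phyloFlash
# command for one sample, given its read-1 file name.
# Flags and values are taken from the script in this thread; -trusted
# is left out on purpose.
phyloflash_cmd() {
    local r1="$1"
    # Strip the read-1 suffix to get the sample name
    local sample="${r1%.rrna.1.fq.gz}"
    printf 'phyloFlash.pl -lib %s -read1 %s -read2 %s -almosteverything -CPUs 20 -readlength 145 -dbhome /data/software/SILVA_db/138.1/ -log -zip -taxlevel 10 -readlimit 1000000\n' \
        "${sample}_almost_everything" "$r1" "${sample}.rrna.2.fq.gz"
}
```

Running `for f in *.rrna.1.fq.gz; do phyloflash_cmd "$f"; done` lists one command per sample for review. Piping that output to `bash` would execute them.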

@kbseah
Contributor

kbseah commented Jul 2, 2024

You could consider trying the PR2 database, which does include plastid rRNA sequences, and adapt it for phyloFlash: https://pr2-database.org/

Good luck with your project!


2 participants