Removing host_genome using FASTQ sample from MGnify #2

Open
marcoreverenna opened this issue Mar 21, 2024 · 1 comment

marcoreverenna commented Mar 21, 2024

The pipeline was run with the following command:
nextflow run main.nf -profile az_test -w az://orange -ansi-log false -resume -with-dag dag.png


This error occurs after removing the path host_genome input from the QC process and index_ch from the workflow, and after removing --reference-db host_genome from the kneaddata command.
The pipeline seemed to be running correctly; it did not fail from the start.

Command executed:
kneaddata -i1 ERR1713346_1.fastq.gz -i2 ERR1713346_2.fastq.gz --threads 8 --output . --bypass-trim
  
mkdir -p kneaddata_logs
mv ERR1713346_1_kneaddata.log kneaddata_logs/

Command error:
ERROR: Unable to write file: /mnt/batch/tasks/workitems/job-101f51bdea810a457fef-QC/job-1/nf-02b3c0b0a2d436eb29a216b10ec57dd0/wd/reformatted_identifierskxgtnfmc_decompressed_7533av6e_ERR1713346_1

apalleja commented

Hi Marco,

I think the problem is caused by the space in the read header:
@ERR1713338.1 J00138:63:HCNWCBBXX:1:1101:3772:1103/1

This may prevent the identifiers from being reformatted correctly. A quick fix is to replace the space with an underscore in the header, e.g. sed 's/ /_/g' ERR1713338_1.fastq > ERR1713338fixed_1.fastq
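
Since the reads in this run are gzipped, the same quick fix could be sketched as below. This is only a sketch under a few assumptions: the files are strict 4-line FASTQ records (no wrapped sequence lines), and the helper names fix_fastq_headers and count_spaced_headers are hypothetical, not part of KneadData or the pipeline. It restricts the substitution to header lines and adds a small check to confirm whether any header still contains a space.

```shell
# Sketch: replace spaces with underscores on FASTQ header lines only.
# Assumes strict 4-line records; helper names are hypothetical.
fix_fastq_headers() {
  src="$1"
  dst="$2"
  # Header lines are lines 1, 5, 9, ... of the decompressed file.
  gzip -dc "$src" \
    | awk 'NR % 4 == 1 { gsub(/ /, "_") } { print }' \
    | gzip > "$dst"
}

# Print the number of header lines that contain a space (0 if none).
count_spaced_headers() {
  gzip -dc "$1" | awk 'NR % 4 == 1 && / / { n++ } END { print n + 0 }'
}
```

For paired-end data the function would need to be run on both mates, e.g. fix_fastq_headers ERR1713346_1.fastq.gz ERR1713346fixed_1.fastq.gz and likewise for the _2 file (the output names here are just placeholders). Whether kneaddata then handles the reformatted identifiers correctly still has to be checked on a real run.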

A longer-term solution might be to create a module that replaces the space, or to replace KneadData with other software that gives us more flexibility and lets us separate the tasks (adapter removal, trimming, host removal, ...). Thinking about ...
