Add alternative long-read (nanopore) preprocessing tools #145

Closed
sofstam opened this issue Oct 13, 2022 · 17 comments
Assignees
Labels
enhancement Improvement for existing functionality

Comments

@sofstam
Collaborator

sofstam commented Oct 13, 2022

Description of feature

Since Porechop is no longer supported, it may be useful to investigate Porechop_ABI as an alternative.

Or https://github.com/epi2me-labs/pychopper

@sofstam sofstam added the enhancement Improvement for existing functionality label Oct 13, 2022
@jfy133
Member

jfy133 commented Oct 14, 2022

Would need to investigate putting it on conda...

@sofstam
Collaborator Author

sofstam commented Oct 17, 2022

https://anaconda.org/bioconda/porechop_abi

There is a conda package on Bioconda. I think this can be added in the next release. What do you think?

@jfy133
Member

jfy133 commented Oct 17, 2022

Oh my bad, I misunderstood their installation instructions 🤦‍♀️

This is fine for inclusion in the first release while we wait for the final Bracken/KrakenUniq!

@jfy133 jfy133 modified the milestones: 1.1, First Release Oct 17, 2022
@sofstam
Collaborator Author

sofstam commented Nov 1, 2022

I was wondering whether we should support only porechop_abi and drop porechop altogether.

@jfy133
Member

jfy133 commented Nov 1, 2022

Is it equivalent? I don't have any feeling either way as I don't use the data, so I'm happy to let you make the call :)

@sofstam
Collaborator Author

sofstam commented Nov 1, 2022

According to their GitHub page: "Note that Porechop_ABI is not designed to handle barcoded sequences adapters. Demultiplexing should be done using standard Porechop commands or other appropriate tools."

So it is not equivalent, as it does not perform demultiplexing. However, demultiplexing is supported by Guppy, which is currently preferred. @Midnighter have you worked with Nanopore data?

@jfy133
Member

jfy133 commented Nov 1, 2022

That's fine, we don't support demultiplexing either.

@qbonenfant

qbonenfant commented Nov 1, 2022

Hi, I'm the main developer of porechop_abi.
If you have any questions about the project, feel free to ask me directly; I will be glad to answer.

Regarding the "equivalent or not" part:

We "only" added a step between the creation of the adapter database object and the adapter search in the reads.

The code base of porechop was left as close to the original as possible, and all commands that used to run on porechop are unchanged. The two programs behave identically as long as you don't use the -abi or -go options.
What porechop can do, porechop_abi can too.
The part we can't handle (at least for now) is inferring barcoded adapter sequences from the reads alone (hence the quoted sentence in your previous comment).

What could make a difference is the name of the executable.
We had to change it to avoid installation conflicts and it may need to be changed in pipelines if you want to use our version.

TL;DR: It can work the same, but the name is different.
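
To illustrate the point about the executable name, here is a minimal sketch of how a pipeline wrapper might switch between the two tools. The function name `build_trim_command` and the file names are hypothetical; the `-i`/`-o` flags are assumed to be the usual Porechop input/output options and `-abi` is the option mentioned above, so check `--help` for the installed version before relying on them.

```python
import shlex

# Executable names as discussed in this thread. Only the name differs
# between the two tools; the shared options are assumed to be identical.
SUPPORTED_TOOLS = ("porechop", "porechop_abi")


def build_trim_command(tool, reads_in, reads_out, ab_initio=False):
    """Return the argument list for an adapter-trimming call (hypothetical helper)."""
    if tool not in SUPPORTED_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    if ab_initio and tool != "porechop_abi":
        # Ab initio adapter inference only exists in porechop_abi.
        raise ValueError("-abi is only available in porechop_abi")

    cmd = [tool]
    if ab_initio:
        cmd.append("-abi")
    # Assumed standard Porechop input/output flags.
    cmd += ["-i", reads_in, "-o", reads_out]
    return cmd


if __name__ == "__main__":
    cmd = build_trim_command(
        "porechop_abi", "reads.fastq.gz", "trimmed.fastq.gz", ab_initio=True
    )
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

Without `ab_initio=True`, the two commands differ only in their first element, which is the whole change a pipeline would need.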

On the Demultiplexing part:
Using porechop for demultiplexing is pretty much obsolete.
Even the dedicated tool (Deepbinner) developed by Ryan Wick (original author of porechop) is now deemed too old.
Guppy (the Nanopore basecaller) seems to be the "current standard" for demultiplexing.

@Midnighter
Collaborator

However, demultiplexing is supported by Guppy and it is currently preferred. @Midnighter have you worked with Nanopore data?

I've only worked with FASTQ files that are the result of running guppy so far. I agree with @jfy133 that we should not include demultiplexing in this pipeline. We don't do it for short reads either.

@sofstam
Collaborator Author

sofstam commented Nov 1, 2022

I've only worked with FASTQ files that are the result of running guppy so far. I agree with @jfy133 that we should not include demultiplexing in this pipeline. We don't do it for short reads either.

I agree. I was just trying to list the differences between porechop and porechop_abi, not proposing to add demultiplexing steps.

@sofstam
Collaborator Author

sofstam commented Nov 1, 2022

@qbonenfant Thank you for the detailed description 👍

@jfy133 jfy133 modified the milestones: First Release, 1.1 Nov 22, 2022
@LilyAnderssonLee LilyAnderssonLee self-assigned this Mar 20, 2023
@github-project-automation github-project-automation bot moved this to Todo (Hackathon General) in nf-core Hackathon March 2023 Mar 23, 2023
@jfy133
Member

jfy133 commented Apr 20, 2023

@jfy133 jfy133 changed the title Alternative to Porechop Add alternative long-read (nanopore) preprocessing tools Apr 20, 2023
@LilyAnderssonLee
Contributor

It seems that Pychopper supports ONT long-read sequencing, whereas chopper is compatible with both PacBio and ONT sequences. It might therefore be more advantageous to use chopper here.

@sofstam
Collaborator Author

sofstam commented Nov 22, 2023

Pychopper is for cDNA reads, so it is not useful for us.

I will have to test first, but to my understanding we can use porechop_abi as an optional step for adapter trimming and drop porechop.
Regarding chopper, it can be added as an alternative to filtlong.

What do you think?

@LilyAnderssonLee
Contributor

So I am going to add porechop_abi to taxprofiler now as an alternative tool for adapter trimming of long reads.

5 participants