address adapter “confusion” #3

Open
3 of 11 tasks
sreichl opened this issue Aug 26, 2023 · 2 comments

Comments

@sreichl
Collaborator

sreichl commented Aug 26, 2023

  • briefly discuss the pipeline and its configuration with MS
  • double check BE's original pipeline
    • config → using the same?
    • logic/task/rule: does it decide between them at some point? Fastp command different?
  • run pipeline with all potential different settings
    • no adapter info
    • “original” nextera.fa file and config adapter sequence
    • “original” nextera.fa file w/o config adapter sequence
    • “corrected” nextera.fa file and config adapter sequence
    • “corrected” nextera.fa file w/o config adapter sequence
  • share all four MultiQC reports w/ all ATAC-seq users e.g., BSF, RtH, Team-Titan, MS
    • the observation of how the fastp command currently trims adapters (sequential, and redundant with the short adapter sequence in the config file)
    • the nextera.fa file seems to be incorrect/incomplete. Ask for opinions and/or a meeting on what we should change.
      • 8 adapter sequences in the nextera.fa file seem to have one nucleotide too many
    • attach reports for all 4 different settings
      • no adapter info
      • old adapter info
      • new adapter file and sequence
      • only new adapter file
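
The settings matrix above could be enumerated with a small script before wiring it into the pipeline. This is only a sketch: the sample and output file names and `nextera_corrected.fa` are hypothetical, and the config adapter sequence is assumed to be the short Nextera trimming sequence, which may not match the actual config. The fastp options used (`--adapter_fasta`, `--adapter_sequence`, `--in1`/`--in2`, `--out1`/`--out2`, `--json`, `--html`) are documented fastp flags; how the real pipeline rule combines them still needs to be checked.

```python
import shlex

# Assumption: the short adapter sequence in the pipeline config is the Nextera
# trimming sequence; replace with the value actually used in the config.
CONFIG_ADAPTER_SEQUENCE = "CTGTCTCTTATACACATCT"

# Hypothetical file names for the "original" and "corrected" adapter FASTA.
ORIGINAL_FASTA = "nextera.fa"
CORRECTED_FASTA = "nextera_corrected.fa"

# One entry per setting in the task list.
SETTINGS = {
    "no_adapter_info": {"adapter_fasta": None, "adapter_sequence": None},
    "original_fasta_with_sequence": {"adapter_fasta": ORIGINAL_FASTA, "adapter_sequence": CONFIG_ADAPTER_SEQUENCE},
    "original_fasta_only": {"adapter_fasta": ORIGINAL_FASTA, "adapter_sequence": None},
    "corrected_fasta_with_sequence": {"adapter_fasta": CORRECTED_FASTA, "adapter_sequence": CONFIG_ADAPTER_SEQUENCE},
    "corrected_fasta_only": {"adapter_fasta": CORRECTED_FASTA, "adapter_sequence": None},
}


def build_fastp_command(r1, r2, out_dir, name, adapter_fasta=None, adapter_sequence=None):
    """Return a paired-end fastp command (as an argument list) for one setting."""
    cmd = [
        "fastp",
        "--in1", r1,
        "--in2", r2,
        "--out1", f"{out_dir}/{name}_R1.trimmed.fastq.gz",
        "--out2", f"{out_dir}/{name}_R2.trimmed.fastq.gz",
        "--json", f"{out_dir}/{name}.fastp.json",
        "--html", f"{out_dir}/{name}.fastp.html",
    ]
    if adapter_fasta is not None:
        cmd += ["--adapter_fasta", adapter_fasta]
    if adapter_sequence is not None:
        cmd += ["--adapter_sequence", adapter_sequence]
    return cmd


if __name__ == "__main__":
    for name, options in SETTINGS.items():
        command = build_fastp_command(
            "sample_R1.fastq.gz", "sample_R2.fastq.gz", "results", name, **options
        )
        print(shlex.join(command))
```

Printing the commands rather than running them makes it easy to compare them with what the pipeline currently emits; the per-setting JSON/HTML outputs can then be collected into one MultiQC report per setting.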
@sreichl sreichl self-assigned this Aug 26, 2023
@sreichl sreichl added the documentation Improvements or additions to documentation label Aug 26, 2023
@sreichl sreichl changed the title from address adapter “confusion” (last when pipeline is leaner and faster) to address adapter “confusion” Aug 26, 2023
@sreichl
Collaborator Author

sreichl commented Nov 29, 2023

ATAC-seq: Nextera adapter explanation by FD

Let's just focus on the color code, since the orientation of the pieces is quite confusing. I assume you know the steps of adding the Nextera adapter with the transposase, followed by a PCR to align and amplify the adapter with barcode and sequencing primer information.

The Nextera sequence for trimming is given as*:

Nextera_transposase_adapter_trimming

CTGTCTCTTATACACATCT

Nextera_transposase_adapter_trimming_reverse_complement

AGATGTGTATAAGAGACAG

What confused me was that the trimming sequence (gray) is only a substring (underlined) of the adapter sequence for PCR amplification and indexing. I was looking for the whole trimming sequence and could not find it in the adapter:

Adapter_sequence

CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGGAGATGT

Index 1 PCR primer read

Index (variable)

Transposase adapter specific (Part 1)

Transposase adapter specific (Part 2)

The missing piece of information was that the Nextera transposase adapter for trimming is not the complete transposable sequence that gets aligned by the Tn5 transposase. The complete sequence looks like this:

Nextera_transposase_adapter

TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG

Nextera_transposase_adapter_reverse_complement

GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG

In conclusion, the Nextera trimming sequence is only a substring of the whole transposable DNA element which gets aligned to the DNA fragments by transposase. What I don't understand is why Illumina only uses the substring for trimming.
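
The relationships between the sequences quoted in this comment can be checked with a few lines of Python. This is a minimal sketch that uses only the sequences as written above; the variable names and the helper `longest_suffix_prefix_overlap` are made up for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_suffix_prefix_overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


# Sequences copied from the comment above.
TRIM = "CTGTCTCTTATACACATCT"
TRIM_RC = "AGATGTGTATAAGAGACAG"
PCR_ADAPTER = "CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGGAGATGT"
FULL_ADAPTER = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
FULL_ADAPTER_LABELED_RC = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

checks = {
    # The two trimming sequences are reverse complements of each other.
    "trim_rc_matches": reverse_complement(TRIM) == TRIM_RC,
    # The trimming sequence (reverse complement) ends both full adapters.
    "full_adapter_ends_with_trim_rc": FULL_ADAPTER.endswith(TRIM_RC),
    "labeled_rc_ends_with_trim_rc": FULL_ADAPTER_LABELED_RC.endswith(TRIM_RC),
    # The whole trimming sequence is not contained in the PCR adapter ...
    "trim_rc_in_pcr_adapter": TRIM_RC in PCR_ADAPTER,
    # ... only its first few bases overlap the end of the PCR adapter.
    "pcr_adapter_vs_trim_rc_overlap": longest_suffix_prefix_overlap(PCR_ADAPTER, TRIM_RC),
    # The PCR adapter's 3' end overlaps the start of the second full adapter.
    "pcr_adapter_vs_labeled_rc_overlap": longest_suffix_prefix_overlap(
        PCR_ADAPTER, FULL_ADAPTER_LABELED_RC
    ),
    # Literal reverse-complement test between the two full adapter sequences.
    "labeled_rc_is_literal_rc": reverse_complement(FULL_ADAPTER) == FULL_ADAPTER_LABELED_RC,
}

for name, value in checks.items():
    print(f"{name}: {value}")
```

With the sequences as written, the last check comes out `False`: the two full adapter sequences share the same 19-nt ending rather than being literal reverse complements of each other, so the `_reverse_complement` label may be worth double-checking against the Illumina document.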

*https://support-docs.illumina.com/SHARE/AdapterSeq/illumina-adapter-sequences.pdf

PS: Transposases are such an interesting class of enzymes. One of my favorite papers during my Master's was about viral transposable elements in the human genome which get shuffled around when the epigenetic marks for repression are lifted during embryogenesis (https://www.nature.com/articles/nrg2072).

@sreichl
Collaborator Author

sreichl commented Dec 12, 2023
