-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Does the calculation of sval in encode_task_macs2_signal_track_chip.py take into account the difference between single-end and paired-end data? #298
Comments
We process SE and PE reads as SE with MACS2. i.e. for PE, we use both ends
of the fragment as separate reads. For SE, just the one end. MACS2 is not
run in paired-end mode for PE reads. Both ends of the fragment are present
in the tagAlign and presented to MACS2 in SE mode.
So sval for SE and PE data would based on counting all reads (for PE both
ends of each frag and for SE one end).
sval = min(no. of chip reads, no. of control reads) / 1e-6
…On Fri, Sep 1, 2023 at 1:57 AM ezioljj ***@***.***> wrote:
I am in the process of using your innovative scripts for ChIP-seq data
processing (many datasets containing both single-end and paired-end). While
going through the pipeline, I noticed that there doesn't appear to be a
clear distinction made between single-end ChIP-seq data and paired-end
ChIP-seq data during sval calculation
(encode_task_macs2_signal_track_chip.py, line 145). In my understanding,
the tagAlign files generated by this pipeline might differ between
single-end and paired-end data.
For single-end tagAlign files, the read count should be equal to the total
number of lines in the file, while for paired-end tagAlign files, the read
count should be half of the total number of lines in the file. However, I'm
not entirely certain if my interpretation is correct. I would greatly
appreciate your assistance in clarifying this matter. Thank you once again
for creating such valuable tools.
—
Reply to this email directly, view it on GitHub
<#298>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABDWEMQ2PZDM6VQKA3MKB3XYGPQDANCNFSM6AAAAAA4HIANUY>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Thank you for your explanation. I have two additional questions I would like to ask:
|
I am in the process of using your innovative scripts for ChIP-seq data processing (many datasets containing both single-end and paired-end). While going through the pipeline, I noticed that there doesn't appear to be a clear distinction made between single-end ChIP-seq data and paired-end ChIP-seq data during sval calculation (encode_task_macs2_signal_track_chip.py, line 145). In my understanding, the tagAlign files generated by this pipeline might differ between single-end and paired-end data.
For single-end tagAlign files, the read count should be equal to the total number of lines in the file, while for paired-end tagAlign files, the read count should be half of the total number of lines in the file. However, I'm not entirely certain if my interpretation is correct. I would greatly appreciate your assistance in clarifying this matter. Thank you once again for creating such valuable tools.
The text was updated successfully, but these errors were encountered: