genomicAnnotationPriority ChIPseeker v1.36.0 #223

Open
HAOXUANmogu opened this issue Oct 23, 2023 · 6 comments

Comments

@HAOXUANmogu

Hi,

I recently ran into two problems with ChIPseeker.

The first concerns the region priority set by "genomicAnnotationPriority".

My question is:
When I use genomicAnnotationPriority = c("3UTR", "5UTR", "Promoter", "Exon", "Intron", "Downstream", "Intergenic"), the annotation output contains both 3' UTR and 5' UTR regions.

When I use genomicAnnotationPriority = c("Exon", "Intron", "3UTR", "5UTR", "Promoter", "Downstream", "Intergenic"), the annotation output contains neither 3' UTR nor 5' UTR regions.

The second concerns strand: "sameStrand = TRUE" does not seem to work.

Here is my code:

library(ChIPseeker)

library(GenomicFeatures)

tair_10 <- makeTxDbFromGFF("TAIR10.release55.gtf")

peak <-readPeakFile("test.tsv")

peakAnno <-annotatePeak(peak, tssRegion=c(-3000,3000),TxDb = tair_10,
                        assignGenomicAnnotation = TRUE,
                        genomicAnnotationPriority = c("3UTR","5UTR","Promoter","Exon", "Intron","Downstream", "Intergenic"),
                        annoDb = NULL,
                        addFlankGeneInfo = FALSE,
                        flankDistance = 5000,
                        sameStrand = TRUE,
                        #ignoreOverlap = FALSE,
                        #ignoreUpstream = FALSE,
                        #ignoreDownstream = FALSE,
                        overlap = "all",
                        verbose = TRUE)

peakAnno_cluster <-as.data.frame(peakAnno)

# View the summary: where the peaks fall in the genome
peakAnno
plotAnnoPie(peakAnno)

test.tsv.zip
TAIR10.release55.gtf.zip

@HAOXUANmogu HAOXUANmogu changed the title genomicAnnotationPriority genomicAnnotationPriority ChIPseeker v1.36.0 Oct 23, 2023
@MingLi-929
Contributor

Thank you for reaching out!
It seems that there is something wrong with your sample test.tsv file.

image

Your code also fails at peak <- readPeakFile("test.tsv"), because the TSV file is not in the expected format.

image

@HAOXUANmogu
Author

OK, I should have moved the first column to the end. Please try the new file; I have just tested the new format and it works.
testnew.tsv.zip

@MingLi-929
Contributor

Thank you for your feedback!
There is still something wrong with your file, so I corrected it according to my understanding. Please check whether this file still represents your data.
I corrected the format according to the BED file standard (https://genome.ucsc.edu/FAQ/FAQformat.html#format1).
image
test.bed.txt

It would help me if you could provide some information about your file. It appears to be methylation output, but it differs a little from regular methylation output. If it comes from something like methylation sequencing, where each peak is a single base, the file should look like this:
image

Because ChIPseeker analyzes data based on the BED file structure, a correct input that matches your actual needs is important.
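As a rough illustration of the BED layout referred to here, the following sketch builds a six-column BED table for single-base sites. The `sites` table, its column names, and all values are invented for illustration and are not taken from the attached files. BED coordinates are 0-based and half-open, so a 1-based position `pos` becomes `chromStart = pos - 1`, `chromEnd = pos`.

```r
# Hypothetical single-base sites with 1-based positions (values are invented)
sites <- data.frame(
  strand = c("+", "-", "+"),
  chr    = c("1", "1", "2"),
  pos    = c(1001L, 2500L, 300L)
)

# BED6 column order: chrom, chromStart, chromEnd, name, score, strand
bed <- data.frame(
  chrom      = sites$chr,
  chromStart = sites$pos - 1L,  # 0-based start
  chromEnd   = sites$pos,       # end is exclusive, so the width is 1
  name       = paste0("site", seq_len(nrow(sites))),
  score      = 0L,
  strand     = sites$strand
)

# Write a tab-separated file without a header or row names
out <- tempfile(fileext = ".bed")
write.table(bed, out, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

The strand sits in the sixth column, which is where the BED specification expects it.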

@HAOXUANmogu
Author

Yes, it is methylation output. This is just a demo of the input file, in the form I need to use; it is not the real output data. You can adjust it to any format you need, and I will adjust my data to match.

@HAOXUANmogu
Author

I have tried your BED file, in which you moved the strand to the sixth column, but it still does not work: the strand still shows "*".

This is the annotated output I got:

anno_test.bed.txt

@MingLi-929
Contributor

Thank you for your feedback!
Regarding your first question: genomicAnnotationPriority means that each region receives only one annotation, chosen according to the priority order you specify. A region that overlaps both a 5' UTR and an exon is therefore reported as either 5' UTR or exon, not both. You can check the other annotations like this:

peakAnno <-annotatePeak(peak, tssRegion=c(-3000,3000),TxDb = tair_10,
                        assignGenomicAnnotation = TRUE,
                        genomicAnnotationPriority = c("Exon", "Intron", "3UTR", "5UTR", "Promoter", "Downstream", "Intergenic"),
                        annoDb = NULL,
                        addFlankGeneInfo = FALSE,
                        flankDistance = 5000,
                        sameStrand = FALSE,
                        #ignoreOverlap = FALSE,
                        #ignoreUpstream = FALSE,
                        #ignoreDownstream = FALSE,
                        overlap = "all",
                        verbose = TRUE)

detail <- peakAnno@detailGenomicAnnotation
table(detail$fiveUTR)
#r$> table(detail$fiveUTR)
#
#FALSE  TRUE 
#15392  1164 
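The priority rule can be illustrated with a small base R sketch. This is not ChIPseeker's implementation; the `detail` table, the helper `assign_by_priority`, and all values are hypothetical. Each peak takes the first feature type in the priority vector that it overlaps. Because UTRs are parts of exons, putting "Exon" ahead of the UTR entries means the UTR labels never appear.

```r
# Mock overlap table: one row per peak, TRUE where the peak overlaps a feature
detail <- data.frame(
  Exon   = c(TRUE,  TRUE,  FALSE),
  Intron = c(FALSE, FALSE, TRUE),
  `5UTR` = c(TRUE,  FALSE, FALSE),
  `3UTR` = c(FALSE, TRUE,  FALSE),
  check.names = FALSE
)

# Hypothetical helper: pick the first overlapping feature in priority order
assign_by_priority <- function(detail, priority) {
  hits <- as.matrix(detail[, priority, drop = FALSE])
  unname(apply(hits, 1, function(hit) {
    if (any(hit)) priority[which(hit)[1]] else "Intergenic"
  }))
}

utr_first  <- assign_by_priority(detail, c("3UTR", "5UTR", "Exon", "Intron"))
exon_first <- assign_by_priority(detail, c("Exon", "Intron", "3UTR", "5UTR"))

table(utr_first)   # UTR labels are present
table(exon_first)  # only Exon and Intron remain
```

The overlaps themselves are identical in both calls; only the reported label changes, which is why the UTR overlaps remain visible in the detailed annotation table.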

As for the strand information, we will update the function in the near future.
In the meantime, you can add strand information with:

# df is the data.frame obtained from bed file
# column x is the column containing strand information
strand(peak) <- df[,x]

Then you can perform your analysis with strand information, and sameStrand will work.
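The workaround can be tried on a toy example. This sketch assumes the Bioconductor package GenomicRanges is installed; the data frame `df` and its values are invented for illustration, and the peaks are built directly with `GRanges()` rather than read from a file.

```r
library(GenomicRanges)

# Hypothetical BED-like table (0-based starts); values are invented
df <- data.frame(
  chr    = c("1", "1", "2"),
  start  = c(100L, 500L, 900L),
  end    = c(101L, 501L, 901L),
  strand = c("+", "-", "+")
)

# Build peaks without strand; GRanges uses 1-based starts, hence the + 1
peak <- GRanges(
  seqnames = df$chr,
  ranges   = IRanges(start = df$start + 1L, end = df$end)
)
strand_before <- as.character(strand(peak))  # every peak is unstranded ("*")

# Attach the strand column, as suggested in the comment above
strand(peak) <- df$strand
strand_after <- as.character(strand(peak))
```

With a real peak object, the order of rows in `df` must match the order of ranges in `peak`, otherwise strands are assigned to the wrong peaks.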
