"Duplicate" kgID when doing annotation. #115
Comments
Hmm, that is an unusual situation with a gene listed on multiple
chromosomes -- interesting catch. Happy to entertain a pull request if you
have a better validation scheme.
On Thu, Jun 23, 2022 at 4:37 PM Linghao Song wrote:
I am trying to run the annotation script (svaba-annotate.R) on the GENCODE db.
However, the UCSC db records pulled by these 2 lines
https://github.com/walaj/svaba/blob/0f60e366c300bbefbba762bcc6d2b661bd2ae74a/R/svaba-annotate.R#L59-L60
contain duplicates. Example:
   kgID               mRNA      geneSymbol spID   refSeq    chrom txStart   txEnd     strand
1: ENST00000244174.11 NM_002186 IL9R       Q01113 NM_002186 chrX  155997695 156010817 +
2: ENST00000244174.11 NM_002186 IL9R       Q01113 NM_002186 chrY  57184215  57197337  +
This makes sense to me: the sex chromosomes have different
positions and share the same mRNA. But it triggers an error at the following
line:
Error: !any(duplicated(genes$kgID)) is not TRUE
<https://github.com/walaj/svaba/blob/0f60e366c300bbefbba762bcc6d2b661bd2ae74a/R/svaba-annotate.R#L67>
Maybe we could have a better validation check here?
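One possible relaxed check, as a rough sketch only and not what svaba-annotate.R currently does, would be to require uniqueness of the (kgID, chrom) pair instead of kgID alone. The toy `genes` table below just mirrors the two rows above, and the helper name `is_unique_by` is hypothetical. It also assumes a plain data.frame; if `genes` is a data.table in the real script, the column subsetting would need to be adapted.

```r
# Toy stand-in for the `genes` table built in svaba-annotate.R
# (assumption: a plain data.frame holding only the columns needed here).
genes <- data.frame(
  kgID       = c("ENST00000244174.11", "ENST00000244174.11"),
  geneSymbol = c("IL9R", "IL9R"),
  chrom      = c("chrX", "chrY"),
  txStart    = c(155997695, 57184215),
  txEnd      = c(156010817, 57197337),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when no two rows share the same values
# in all of the columns named in `key_cols`.
is_unique_by <- function(df, key_cols) {
  !any(duplicated(df[, key_cols, drop = FALSE]))
}

# Equivalent to the existing check on kgID alone; FALSE for this table.
kgid_only_ok <- is_unique_by(genes, "kgID")

# Relaxed check: the same transcript may appear once per chromosome.
per_chrom_ok <- is_unique_by(genes, c("kgID", "chrom"))

stopifnot(per_chrom_ok)
```

With this check the chrX/chrY pair passes, while a true duplicate (same kgID on the same chromosome) would still be rejected. Whether downstream code that looks rows up by kgID alone tolerates two rows per transcript is a separate question that this sketch does not address.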
Because we need both gencode and exonframe information, we loaded