Parsing of CIGAR #101

chlee-tabin · 2020-02-24T19:09:23Z

Dear dropEst maintainers/developers:

I have been using dropEst and noticed that the code does not actually parse the CIGAR string (there is a "TODO" comment in its place). Would this affect the accuracy of UMI mapping?

I see three places where this could matter (if I understood the code correctly):

(1) dropEst only checks the start and end coordinates of an alignment. In some cases aligners erroneously place part of a read in a very distant region of the genome (with a huge gap in between), so the UMI ends up being mapped or discarded erroneously. Could this be fixed by assigning the read to the gene where the bulk of the aligned bases falls?

(2) If I use the -f option to take the tags from 10x cellranger, how does dropEst generate and count intronic-only UMIs (which are typically not tagged in cellranger-generated .bam files)?

(3) Is there a quick remedy for cases where the last coordinate falls just outside an annotated 3'UTR? Instead of categorizing the UMI as HAS_NOT_ANNOTATED and discarding it, could dropEst keep it, or even suggest a modified annotation?
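
To make (1) concrete, here is a minimal Python sketch of what I have in mind. It is not dropEst code: the function names `aligned_blocks` and `best_gene`, the gene names, and all coordinates are made up for illustration. It assumes a 0-based alignment start and a standard SAM CIGAR string, splits the alignment into reference blocks at `N` (skipped region) operations, and picks the gene that overlaps the most aligned bases rather than looking only at the start and end coordinates.

```python
import re

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_blocks(pos, cigar):
    """Return 0-based, half-open reference intervals covered by the read.

    Blocks are split at N (skipped region). M, =, X and D extend the
    current block; I, S, H and P do not consume reference bases.
    """
    blocks = []
    ref = pos          # current reference coordinate
    start = None       # start of the block being built, if any
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            if start is not None:
                blocks.append((start, ref))
                start = None
            ref += length
        elif op in "MD=X":
            if start is None:
                start = ref
            ref += length
    if start is not None:
        blocks.append((start, ref))
    return blocks


def best_gene(blocks, genes):
    """Return the gene overlapping the most aligned bases, or None.

    `genes` is a hypothetical dict: name -> (start, end), half-open.
    """
    best, best_bp = None, 0
    for name, (g_start, g_end) in genes.items():
        bp = sum(max(0, min(e, g_end) - max(s, g_start)) for s, e in blocks)
        if bp > best_bp:
            best, best_bp = name, bp
    return best


# Illustrative read: 50 bases aligned, a 10 kb gap, then 48 more bases.
blocks = aligned_blocks(100, "50M10000N48M2S")
genes = {"geneA": (0, 200), "geneB": (10190, 20000)}
print(blocks, best_gene(blocks, genes))
```

In this made-up example the read's end coordinate lands in `geneB`, but most of the aligned bases sit in `geneA`, so a block-based assignment would keep the UMI with `geneA`.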

Thank you so much for the package!


evanbiederstedt added the bug and enhancement labels · Sep 10, 2020
