mismatch filter #30
base: master
Can you make this opt-in, so that the default is 1 (meaning it has no effect) and extra work is only done if the value is < 1? Also, please make the argument "--mismatch-ratio" instead of "--mismatch_ratio".
Done.
What's with the MC and MP tags?
They store the original mapping chromosome and position, MC:Z: (chromosome) and MP:Z: (position), just so that it's not lost. Do you think it's problematic for downstream analysis?
I guess that's fine. I would prefer not to set CHROM and POS to bad values, just leave the originals and set the flag.
I need CHROM and POS to be set to * and 0 for what I do downstream. Also, I think some tools might be confused if the flag says "unmapped" but there are still mapping positions.
I don't think I want to set POS and CHROM. Can you filter on the flag, rather than on the mapping?
I do filter on the flag, but I also summarize mapping positions, and I want the reads with too many mismatches to be counted as unmapped. To me it makes more sense this way. Do you have any specific concerns about changing chrom and pos? If yes, I might reconsider.
I added a mismatch filter. There is now a new parameter (mismatch_ratio) that specifies the maximum acceptable ratio of mismatches to alignment length. Reads that have too many mismatches are reported as QC-failed (0x200) and unmapped (0x4). Chromosome and mapping position are set to * and 0, respectively, and the originally reported values are stored in two extra fields, MC:Z: (chromosome) and MP:Z: (position). I implemented this because I need more control over which reads are considered "mapped". It might be useful for others, too.
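For readers following the thread, here is a minimal sketch of the behavior described above, not the code in this PR. It assumes the mismatch count comes from the `NM:i:` tag and that alignment length is the sum of the M, =, X, I and D CIGAR operations; the thread does not state either detail. The names `alignment_length`, `apply_mismatch_filter` and `build_parser` are hypothetical.

```python
import argparse
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def alignment_length(cigar):
    """Sum the lengths of M, =, X, I and D operations (assumed definition)."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "M=XID")


def apply_mismatch_filter(sam_line, mismatch_ratio=1.0):
    """Return the SAM record, rewritten as unmapped if it has too many mismatches.

    With the default ratio of 1 the record is returned untouched (opt-in).
    """
    line = sam_line.rstrip("\n")
    if mismatch_ratio >= 1:
        return line
    fields = line.split("\t")
    flag = int(fields[1])
    if flag & 0x4 or fields[5] == "*":
        return line  # already unmapped, nothing to filter
    # Assumption: mismatches are taken from the NM:i: (edit distance) tag.
    nm = next((int(t[5:]) for t in fields[11:] if t.startswith("NM:i:")), None)
    length = alignment_length(fields[5])
    if nm is None or length == 0 or nm / length <= mismatch_ratio:
        return line
    # Mark as unmapped (0x4) and QC-failed (0x200).
    fields[1] = str(flag | 0x4 | 0x200)
    # Keep the original chromosome and position in the extra tags.
    fields += ["MC:Z:" + fields[2], "MP:Z:" + fields[3]]
    fields[2], fields[3] = "*", "0"
    return "\t".join(fields)


def build_parser():
    """Hypothetical CLI wiring for the opt-in argument requested in review."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--mismatch-ratio", type=float, default=1.0)
    return parser
```

For example, a read with `NM:i:3` over a `10M` alignment has a ratio of 0.3, so it is rewritten with `--mismatch-ratio 0.2` and left alone with the default of 1.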