Update dynamic genotype conf threshold calculation #60

Closed
iqbal-lab opened this issue Apr 3, 2019 · 4 comments

Comments

@iqbal-lab
Collaborator

iqbal-lab commented Apr 3, 2019

Looking at a very high-coverage nanopore sample, we have found a good example of a very high-confidence call that agrees with the phenotype but falls below the dynamically calculated threshold.

INFO:mykrobe.cmds.amr:Confidence cutoff (using percent cutoff 90%): 12033

and the variant had confidence 11957.

The relevant bit of the JSON is

"rpoB_H445X-CAC761139GAC":{  
   "variant":"ref-H445X?var_name=CAC761139GAC&num_alts=3&ref=NC_000962.3&enum=0&gene=rpoB&mut=H445X",
   "genotype":[  
      1,
      1
   ],
   "genotype_likelihoods":[  
      -12045.524099907143,
      -88.976184884562
   ],
   "info":{  
      "coverage":{  
         "reference":{  
            "percent_coverage":5.0,
            "median_depth":0,
            "min_non_zero_depth":1,
            "kmer_count":1,
            "klen":21
         },
         "alternate":{  
            "percent_coverage":100.0,
            "median_depth":63,
            "min_non_zero_depth":53,
            "kmer_count":1314,
            "klen":21
         }
      },
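
The reported confidence can be checked against the likelihoods above. This is a minimal sketch that assumes the genotype confidence is the gap between the best and second-best log-likelihood; the issue does not state that definition, but it reproduces the quoted value of 11957. The function name `genotype_confidence` is hypothetical and not taken from the mykrobe code.

```python
# Log-likelihoods copied from the JSON fragment above.
genotype_likelihoods = [-12045.524099907143, -88.976184884562]

# Cutoff copied from the INFO log line (percent cutoff 90%).
cutoff = 12033


def genotype_confidence(likelihoods):
    """Assumed definition: best log-likelihood minus the second best."""
    best, second = sorted(likelihoods, reverse=True)[:2]
    return best - second


conf = genotype_confidence(genotype_likelihoods)

# The call misses the cutoff by a small margin relative to its size.
print(f"confidence={conf:.1f} cutoff={cutoff} below_cutoff={conf < cutoff}")
```

Under that assumption the call falls short of the cutoff by under 100 units on a confidence of roughly 12000.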

So what is going on? At this high depth, the whole genotype confidence distribution is shifted far to the right, and the default threshold, which is set to keep 90% of samples, is too strict. At this depth, almost everything is going to be good.

What's the fix? Well, I guess the bloody percentile threshold we choose (i.e. the 90%) should go up with depth.

@mbhall88
Member

mbhall88 commented Apr 4, 2019

If the percentile threshold goes up with depth wouldn't that mean that this variant would be further away from the confidence cut-off?

@iqbal-lab
Collaborator Author

No, the number going up is the area under the curve. We're saying accept any genotype confidence in the top x% of the area under the curve. Currently we accept any genotype confidence in the top 90%, and I'm saying that for high coverage we could accept anything in the top 99% or something.
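
To make the direction of the change concrete, here is a rough sketch of how keeping a larger fraction of the distribution lowers the cutoff. Everything in it is an assumption for illustration: the function names, the linear rule between 90% and 99%, the depth bounds, and the simulated confidence distribution are all made up and are not how mykrobe computes its threshold.

```python
import random


def cutoff_for_keep_fraction(simulated_confidences, keep_fraction):
    """Return the confidence above which `keep_fraction` of simulated calls fall."""
    ordered = sorted(simulated_confidences)
    index = int((1.0 - keep_fraction) * len(ordered))
    index = min(max(index, 0), len(ordered) - 1)
    return ordered[index]


def keep_fraction_for_depth(depth, low_depth=20, high_depth=60):
    """Hypothetical rule: keep 90% at low depth, rising linearly to 99%."""
    if depth <= low_depth:
        return 0.90
    if depth >= high_depth:
        return 0.99
    return 0.90 + 0.09 * (depth - low_depth) / (high_depth - low_depth)


random.seed(0)
# Made-up stand-in for the simulated genotype confidence distribution.
simulated = [random.gauss(12500, 400) for _ in range(10_000)]

strict = cutoff_for_keep_fraction(simulated, 0.90)
# 63 is the alternate-allele median depth from the JSON in the first comment.
relaxed = cutoff_for_keep_fraction(simulated, keep_fraction_for_depth(63))

print(f"keep 90%: cutoff={strict:.0f}  keep 99%: cutoff={relaxed:.0f}")
```

Keeping the top 99% instead of the top 90% moves the cutoff down the left tail, so a call sitting just under the 90% cutoff is more likely to be accepted, not less.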

@mbhall88
Member

This is likely related to #121

@mbhall88
Member

Closing in favour of #121
