Is there a mistake in script MixtureEstimationMethods.py? #81

xyw1 · 2020-11-30T10:49:08Z

Hi authors,
I was interested in how ipdSummary gives out the frac value, so I read the source code and I was confused about this function:

Why estimateSingleFraction is optimized to estimate mu1 ( which I guess is the mean IPD value if methylated bases)?

The text was updated successfully, but these errors were encountered:

obanerjee · 2020-12-01T18:11:37Z

Hello, Thanks for your interest and for closely looking at ipdSummary.py. I would like to try to help answer your question. The objective function in the optimization is mixModelFn, defined on line 81 of MixtureEstimationMethods.py: # Return value of mixture model log likelihood function def mixModelFn(self, p, a0, a1): tmp = (1 - p) * a0 + p * a1 return -np.log(tmp[np.nonzero(tmp)]).sum() # return -np.ma.log( tmp ).sum() The input arguments are: p = methylated fraction a0 = value of Gaussian pdf for un-methylated IPDs: N( mu0, sd0 ) a1 = value of Gaussian pdf for methylated IPDs: N( mu1, sd1 ) When fminbound is used to do single-variable optimization for a function like mixModelFn, the optimization is done only over the first input argument – in this case, p. The remaining arguments (a0 and a1) are passed to fminbound as fixed values. Now, here is estimateSingleFraction, defined in line 96 of MixtureEstimationMethods.py: The input arguments are: mu1 = mean of log IPDs for methylated case data = observed log IPDs mu0 = mean of log IPDs for un-methylated case L = number of observations in the input data optProp = if set to True, returns optimal mixing proportion. 1. First, the observed log IPDs and the fixed values of mu1 and mu0 are used to calculate the values of the Gaussian pdf N( mu0, sd0 ) and N( mu1, sd1 ) 2. Now, single-variable optimization is done using the objective function defined in mixModelFn. The values of a0 and a1 are fixed, and the variable is the mixing proportion. # Return optimum argument (mixing proportion) of mixture model log likelihood function. def estimateSingleFraction(self, mu1, data, mu0, L, optProp=True): # NOTE: ignoring the warnings here is sloppy, should be looked at later. with np.errstate(all="ignore"): a0 = self.replaceScipyNormPdf(data, mu0) a1 = self.replaceScipyNormPdf(data, mu1) # if f'(0) < 0 (equ. a1/a0 < L), then f'(1) < 0 as well and # solution p-hat <= 0 if np.divide(a1, a0).sum() <= L: res = 0.0 # if f'(1) > 0 (equ. a0/a1 < L), then f'(0) > 0 as well and # solution p-hat >= 1 elif np.divide(a0, a1).sum() <= L: res = 1.0 else: # unconstrained minimization of convex, single-variable function res = fminbound(self.mixModelFn, 0.01, 0.99, args=(a0, a1), xtol=1e-02) if optProp: # return the optimal proportion return res else: # return the corresponding log likelihood function value return self.mixModelFn(res, a0, a1) Does this help at all? Please feel free to write back if not. Thanks again for your interest in this, From: xuyw <[email protected]> Sent: Monday, November 30, 2020 2:49 AM To: PacificBiosciences/kineticsTools <[email protected]> Cc: Subscribed <[email protected]> Subject: [PacificBiosciences/kineticsTools] Is there a mistake in script MixtureEstimationMethods.py? (#81) [EXTERNAL MESSAGE] Be mindful when clicking links or attachments Hi authors, I was interested in how ipdSummary gives out the frac value, so I read the source code and I was confused about this function: [Image removed by sender. image]<https://user-images.githubusercontent.com/28685819/100600251-02070180-333c-11eb-81d2-43c4f378619b.png> Why estimateSingleFraction is optimized to estimate mu1 ( which I guess is the mean IPD value if methylated bases)? — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub<#81>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AANKTE7I4BD62YGDL6J6HN3SSN2DJANCNFSM4UHMSQSA>.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Is there a mistake in script MixtureEstimationMethods.py? #81

Is there a mistake in script MixtureEstimationMethods.py? #81

xyw1 commented Nov 30, 2020

obanerjee commented Dec 1, 2020 via email

Is there a mistake in script MixtureEstimationMethods.py? #81

Is there a mistake in script MixtureEstimationMethods.py? #81

Comments

xyw1 commented Nov 30, 2020

obanerjee commented Dec 1, 2020 via email