Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved treatment of doubling / halving time when growth rate estimate spans 0 #94

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

jrcpulliam
Copy link
Member

  • adds test to ensure doubling / halving time confidence bounds don't include negative values
  • identifies and fixes an issue with expect_equal_to_reference() that derives from a bug in testthat
  • issues an error when r estimates for groups have different signs
  • updates tests that relied on simulated data for which groups had different signs

Closes #28.

… of tests()) and regenerates reference files; passes all tests on second run of tests(); not committing new reference files
…_value() with update=FALSE; note that there is a bug in expect_equal_to_reference() such that update is reset from FALSE to TRUE (see testthat issue 683)
…s; need input from @thibautjombart on what expected behavior should be
…sired behaviour as a result of the change to doubling / halving interval CIs; committing new reference rds files
…ups that have growth rates with different signs
@jrcpulliam
Copy link
Member Author

Note that I'm concerned that this PR and #90 both introduce new error messages that might break backward compatibility.

@jrcpulliam
Copy link
Member Author

Also, @thibautjombart: what do you think about squashing commits (eg the ones where I'm just adding spaces?)

@zkamvar
Copy link
Member

zkamvar commented Jan 7, 2019

Note that I'm concerned that this PR and #90 both introduce new error messages that might break backward compatibility.

Can you explain what you mean by breaking backward compatibility? What breaks with #90 and this PR? I'm aware of the breaking change for character data and I might have a less break-y way of addressing that.

what do you think about squashing commits (eg the ones where I'm just adding spaces?

I don't know if there is a way to squash specific commits, either we take them all or squish them into one commit to rule them all.

@zkamvar
Copy link
Member

zkamvar commented Jan 8, 2019

@jrcpulliam, Thank you for the contribution. Sorry it's taken me a bit to get to this.

I have a couple of questions/comments on this:

  • In Error when confidence intervals for r go below 0 #28, @finlaycampbell states that the upper bound of doubling time should be Inf, however the code shows that negative values of both the doubling and halving time are being converted to Inf like so:
     |x ----0------------ y| ... Inf  // Before conversion
                         |y  ... Inf| // After conversion
    
    Is this the correct interpretation (i.e. that it's impossible to have a negative doubling/halving time so the upper limit becomes the lower limit)?
  • The tests currently rely on cached R objects to compare known values. I would like to move away from these for several reasons, including some of the issues you ran into.
  • What is the rationale for throwing an error when the groups have differing rates?

@thibautjombart
Copy link
Contributor

thibautjombart commented Jan 8, 2019

For the growth rate, in this situation the CI is an exclusion interval:

r <- seq(-3,3,le = 100)
plot(r, log(2)/r, type = "l")

image
(sorry reprex not working with emacs)

So we may want to output something along the lines of [-Inf ; lower_bound] U [upper_bound ; +inf]. Well, that's an option, happy to go with something else. @jrcpulliam what do you think?

I think part of the confusing stuff is we try to be clever in the output and report positive values as 'doubling times' and negative ones as 'halving times'. We could be more consistent and just report 'doubling/halving times', and explain that negative values are halving times?

Re the error: I think it comes from the above, e.g. one group may have r>0 and therefore a 'doubling time', while another would have r<0 and halving times reported.

@zkamvar
Copy link
Member

zkamvar commented Jan 9, 2019

I think part of the confusing stuff is we try to be clever in the output and report positive values as 'doubling times' and negative ones as 'halving times'. We could be more consistent and just report 'doubling/halving times', and explain that negative values are halving times?

I agree that the confusion is due to the way we handle the output. Currently, we try to detect whether r is negative or positive and then use that to determine if we want to use log(2) / r or log(0.5) / r to calculate the estimated time.

Re the error: I think it comes from the above, e.g. one group may have r>0 and therefore a 'doubling time', while another would have r<0 and halving times reported.

This is in part because we based the conditional only on the first r value, which can be changed. The situation that would cause this condition is not that infeasible and we should be able to address it.


All in all, I think that there is still room for us to give a sensible interpretation of the doubling/halving times. The way I see it, we calculate these values to make the lives of the users easier so that they have an easily interpret-able number they can report. We could do the following:

  1. Calculate doubling/halving time using log(2)/r
  2. Place the positive values and confidence intervals into $doubling
  3. Place the negative values and confidence intervals into $halving and multiply them by -1
  4. Find which r CIs cross zero and update the bounds relative to the sign of r
  5. When printing, add a note about the confidence interval of R crossing zero and the effects on the time estimates

@thibautjombart
Copy link
Contributor

Yup yup yup. I agree with all the above! :)

@zkamvar
Copy link
Member

zkamvar commented Jan 10, 2019

I'll have a go at this for the next next version of incidence. My priority right now is to get a working version on CRAN ASAP for #95 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error when confidence intervals for r go below 0
3 participants