Ideas to simplify the generative model for variant data #252

Open
SamuelBrand1 opened this issue Sep 24, 2024 · 7 comments
Labels
enhancement New feature or request experiment some sort of experiment to better understand a part of the model

Comments

@SamuelBrand1
Contributor

Why this issue?

In f2f discussion, we talked about the difficulty of having a generative model for both weekly hospitalisations and bi-weekly variant frequencies simultaneously. There are a few possible reasons why this is hard to achieve; for example, the variant frequencies come from a region and might not represent the local dynamics of a state.

On the other hand, the intrinsic dynamics of the model require new variants as the primary mechanism whereby a new wave is triggered; that is, the mechanistic assumption is that new waves are primarily driven by immunity loss.

Simpler generative models for variants

We need a model for the arrival of a new variant because it's intrinsic to the dynamics, but we can simplify what this model is generating.

The simplest model I can think of that gives us what we need for the dynamics, and connects to the data, has these features:

  • Each variant $v = 1, 2, 3, 4, \dots$ arrives sequentially in the model at time $T_v$, with inter-arrival times $T_{v+1} - T_v \sim \text{Exp}(\lambda_{nv})$, where $\lambda_{nv}$ is a baseline rate of new variant arrival.
  • At the $T_v$ point we add a new variant:
    1. Generate $\chi_{i,v}$ for $i < v$ for the cross-immunity of the new variant with the older variants (I think we have some prior for this already?).
    2. Set an initial number of infected people with the new variant (I think we have some prior for this already?).
  • We need to define what an arrival of a new variant looks like in the data. This could be something like the first time the variant frequency hits some value (e.g. 5%) with some growth rate (maybe?). Then we can assign a time to each variant invasion in the data (for each region): $T_{\text{obs},v}$.
  • The residual between the data invasion time and the model invasion time could be just Normal:
$$T_{\text{obs},v} - T_v \sim \mathcal{N}(0,~ \sigma_{nv}).$$
  • We can probably assume that no variant is just missed in the data; therefore, when looking backwards for parameter inference, we set the number of variants to the number observed in the data. Projecting forwards, I think we'll mostly want to condition on the variant outcomes (e.g. as defined scenarios).
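
The pieces in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the model code: the function names are hypothetical, $\sigma_{nv}$ is read as a standard deviation, the first arrival is measured from time 0, and the "with some growth rate" part of the invasion definition is ignored. The cross-immunity and initial-infection draws are left out, since priors for those may already exist.

```python
import math
import random


def sample_arrival_times(n_variants, rate_nv, rng):
    """Draw T_1 < T_2 < ... with Exp(rate_nv) inter-arrival times."""
    times, t = [], 0.0  # assumption: the clock starts at time 0
    for _ in range(n_variants):
        t += rng.expovariate(rate_nv)
        times.append(t)
    return times


def observed_invasion_time(days, freqs, threshold=0.05):
    """First day the variant frequency reaches the threshold, else None."""
    for day, freq in zip(days, freqs):
        if freq >= threshold:
            return day
    return None


def residual_log_likelihood(t_obs, t_model, sigma_nv):
    """Sum of Normal(0, sigma_nv) log-densities of T_obs,v - T_v."""
    total = 0.0
    for obs, mod in zip(t_obs, t_model):
        z = (obs - mod) / sigma_nv  # sigma_nv treated as a standard deviation
        total += -0.5 * z * z - math.log(sigma_nv) - 0.5 * math.log(2 * math.pi)
    return total


# Example: three variants arriving on average every 120 days (made-up rate).
rng = random.Random(0)
model_times = sample_arrival_times(n_variants=3, rate_nv=1 / 120, rng=rng)
```

For inference, the number of variants passed to `sample_arrival_times` would be fixed to the number seen in the data, as suggested in the last bullet.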
@SamuelBrand1 SamuelBrand1 added the enhancement New feature or request label Sep 24, 2024
@SamuelBrand1
Contributor Author

If we go with this kind of model, e.g. fit on the introduction times rather than the whole variant time series, then I would suggest not including $\sigma_{nv}$ in the inference. We can get an estimate of that for each state by looking at the difference between the regional variant introduction time and when we see a growth in hosps (if any growth occurs).
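
A minimal sketch of that plug-in estimate, assuming the residual is zero-mean Normal so that the root-mean-square gap is the natural estimator. The function name and both input lists are hypothetical, and how "a growth in hosps" is detected is not specified here.

```python
import math


def estimate_sigma_nv(variant_intro_times, hosp_growth_times):
    """Root-mean-square gap between paired times (zero-mean Normal assumed).

    Variants with no observed hospitalisation growth (None) are skipped.
    """
    gaps = [
        hosp - intro
        for intro, hosp in zip(variant_intro_times, hosp_growth_times)
        if hosp is not None
    ]
    if not gaps:
        raise ValueError("no variant has an observed growth in hospitalisations")
    return math.sqrt(sum(g * g for g in gaps) / len(gaps))
```

The result would then be passed to the model as a fixed value per state rather than sampled.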

@arik-shurygin arik-shurygin added the experiment some sort of experiment to better understand a part of the model label Oct 17, 2024
@SamuelBrand1
Contributor Author

Adding options: We could consider the Dirichlet-multinomial, which is like the "noisy" version of the multinomial.
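
For reference, a sketch of the Dirichlet-multinomial log-probability for a vector of variant counts, written with the standard library only. The function name is hypothetical, and how the concentration vector `alpha` would be tied to the modelled variant frequencies is an assumption left open here.

```python
import math


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of variant counts under a Dirichlet-multinomial."""
    n = sum(counts)
    a = sum(alpha)
    logp = math.lgamma(a) + math.lgamma(n + 1) - math.lgamma(n + a)
    for x, al in zip(counts, alpha):
        logp += math.lgamma(x + al) - math.lgamma(al) - math.lgamma(x + 1)
    return logp
```

One possible choice is `alpha = [c * p for p in model_freqs]` for some concentration `c`: smaller `c` gives more overdispersion, and large `c` approaches the plain multinomial.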

@SamuelBrand1
Contributor Author

Fitting on the early growth rate (rather than introduction time)

Another alternative discussed f2f: instead of i) fitting on the full variant frequency time series or ii) fitting on just the introduction time (see above), we could connect the early exponential growth in new variant freq to the growth in hospitalisations following from the new variant.
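
One way to extract an early growth rate from the frequency data is a least-squares slope on the logit scale. Under a simple two-variant assumption, where the new variant grows exponentially at one rate and the background at another, the logit of the new variant's frequency is linear in time with slope equal to the difference in growth rates. The sketch below assumes that setup; the function name is hypothetical, and the link from this slope to hospitalisation growth is not shown.

```python
import math


def early_logit_growth_rate(days, freqs):
    """Least-squares slope of logit(frequency) against time.

    Frequencies must lie strictly between 0 and 1.
    """
    logits = [math.log(f / (1.0 - f)) for f in freqs]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logits) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logits))
    return sxy / sxx
```

Restricting `days` and `freqs` to the first few observations after the variant appears keeps the fit on the early phase.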

@seabbs

seabbs commented Oct 23, 2024

I really like all of these ideas (though noting I don't have a totally clear view of what is in currently).

A few comments:

> we could connect the early exponential growth in new variant freq to growth in the hospitalisation following from the new variant.

Fitting to the early exponential growth seems like a strong option. The caveat here is that the early growth is often higher than the long-run growth due to biased sampling etc.

> On the other hand, the intrinsic dynamics of the model require new variants as the primary mechanism whereby a new wave is triggered; that is, the mechanistic assumption is that new waves are primarily driven by immunity loss.

It sounds like possible model fit times are coming from model-data mismatch. Could it also be the case that the current model is too rigid (especially if it is making hard assumptions about the link between variants and waves)? Perhaps adding a little more flexibility here could help without major model changes (if this isn't the case, it will just slow things down further, as the model will be more complex).

@SamuelBrand1
Contributor Author

> It sounds like possible model fit times are coming from model-data mismatch. Could it also be the case that the current model is too rigid (especially if it is making hard assumptions about the link between variants and waves)?

Yes, I think this is a problem which has been revealed in posterior predictive plotting (sorry I don't have a link as I've only seen them f2f).

@SamuelBrand1
Contributor Author

An extra consideration is that, whilst the hospitalisation data is specific to the modelling scale (that is, state level), the variant freq data is actually pooled across CDC regions. I think this underlines that a more flexible generative approach would be a good idea here.

@seabbs

seabbs commented Oct 23, 2024

#itssorandom
