Ideas to simplify the generative model for variant data #252

SamuelBrand1 · 2024-09-24T23:07:16Z

Why this issue?

In f2f discussion, we talked about the difficulty of having a generative model for both weekly hospitalisations and bi-weekly variant frequencies simultaneously. There are a few possible reasons why this is hard to achieve. For example, the variant frequencies are measured for a region and might not represent the local dynamics of a state.

On the other hand, the intrinsic dynamics of the model require new variants as the primary mechanism whereby a new wave is triggered; that is, the mechanistic assumption is that new waves are primarily driven by immunity loss.

Simpler generative models for variants

We need a model for the arrival of a new variant because it's intrinsic to the dynamics, but we can simplify what this model is generating.

The simplest model I can think of that gives us what we need for the dynamics, and connects to the data, has these features:

- Each variant $v = 1, 2, 3, 4,...$ arrives sequentially in the model at time $T_v$, with inter-arrival times $T_{v+1} - T_v \sim \text{Exp}(\lambda_{nv})$, where $\lambda_{nv}$ is a baseline rate of new variant arrival.
- At the $T_v$ point we add a new variant:
  1. Generate $\chi_{i,v}$ for $i < v$ for the cross-immunity of the new variant with the older variants (I think we have some prior for this already?).
  2. Set an initial number of infected people with the new variant (I think we have some prior for this already?).
- We need to define what an arrival of a new variant looks like in the data. This could be something like the first time the variant frequency hits some value (e.g. 5%) with some growth rate (maybe?). Then we can assign a time to each variant invasion in the data (for each region): $T_{\text{obs},v}$.
- The residual between the data invasion time and the model invasion time could be just Normal:

$$T_{\text{obs},v} - T_v \sim \mathcal{N}(0,~ \sigma_{nv}).$$
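To make the proposal above concrete, here is a minimal sketch of the arrival process and the Normal residual likelihood. It is only an illustration, not what is in the model currently. The function names, the treatment of $\sigma_{nv}$ as a standard deviation, the choice to start the first gap at `t0`, and the numbers (mean gap of 180 days, 14-day noise) are all assumptions made up for the example.

```python
import math
import random


def simulate_arrival_times(rate_nv, n_variants, rng, t0=0.0):
    """Draw T_1 < T_2 < ... with Exp(rate_nv) gaps between arrivals.

    Assumption: the first gap is measured from t0.
    """
    times, t = [], t0
    for _ in range(n_variants):
        t += rng.expovariate(rate_nv)  # gap T_{v+1} - T_v ~ Exp(rate_nv)
        times.append(t)
    return times


def normal_logpdf(x, mean, sd):
    """Log-density of Normal(mean, sd) at x, with sd a standard deviation."""
    return -0.5 * math.log(2 * math.pi * sd**2) - (x - mean) ** 2 / (2 * sd**2)


def invasion_time_loglik(t_obs, t_model, sigma_nv):
    """Sum of Normal log-densities of the residuals T_obs,v - T_v."""
    return sum(normal_logpdf(o - m, 0.0, sigma_nv) for o, m in zip(t_obs, t_model))


# Hypothetical values: time in days, one new variant every ~180 days on average.
rng = random.Random(1)
t_model = simulate_arrival_times(rate_nv=1 / 180, n_variants=4, rng=rng)
t_obs = [t + rng.gauss(0.0, 14.0) for t in t_model]  # synthetic "data" invasion times
loglik = invasion_time_loglik(t_obs, t_model, sigma_nv=14.0)
```

Fitting on introduction times then only needs one scalar per variant from the data, rather than the whole frequency time series.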

We can probably assume that no variant is simply missed in the data; therefore, when looking backwards for parameter inference, we set the number of variants to the number observed in the data. Projecting forwards, I think we'll mostly want to condition on the variant outcomes (e.g. as defined scenarios).
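One possible way to turn the "first time the frequency hits some value" definition into a $T_{\text{obs},v}$ is sketched below. The function name is hypothetical, the linear interpolation between bi-weekly points is an assumption, and the optional growth-rate condition mentioned above is left out. The frequencies are made-up numbers for illustration.

```python
def first_crossing_time(times, freqs, threshold=0.05):
    """Time at which freqs first reaches threshold, or None if it never does.

    Assumption: frequency changes linearly between consecutive observations.
    """
    if freqs[0] >= threshold:
        return times[0]
    for i in range(1, len(freqs)):
        if freqs[i - 1] < threshold <= freqs[i]:
            # Fraction of the interval needed to reach the threshold.
            frac = (threshold - freqs[i - 1]) / (freqs[i] - freqs[i - 1])
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None


# Hypothetical bi-weekly regional frequencies for one variant (time in days).
times = [0, 14, 28, 42, 56]
freqs = [0.0, 0.01, 0.03, 0.09, 0.30]
t_obs_v = first_crossing_time(times, freqs, threshold=0.05)
```

A variant that never reaches the threshold returns `None`, which fits with setting the number of variants to the number observed in the data.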

SamuelBrand1 · 2024-09-24T23:21:42Z

If we go with this kind of model, i.e. fitting on the introduction times rather than the whole variant time series, then I would suggest not including $\sigma_{nv}$ in the inference. We can get an estimate of it for each state by looking at the difference between the regional variant introduction time and the time when we see growth in hospitalisations (if any growth occurs).
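A rough sketch of that plug-in estimate, assuming the residuals are zero-mean Normal so that the maximum-likelihood estimate of $\sigma_{nv}$ is the root mean square of the differences. The function name and all the times below are hypothetical. Variants with no detectable hospitalisation growth are marked `None` and skipped.

```python
import math


def estimate_sigma_nv(regional_intro_times, hosp_growth_times):
    """RMS difference between hospitalisation growth onset and regional introduction.

    Entries of hosp_growth_times that are None (no growth seen) are ignored.
    """
    diffs = [
        h - r
        for r, h in zip(regional_intro_times, hosp_growth_times)
        if h is not None
    ]
    if not diffs:
        raise ValueError("no variants with an observed hospitalisation growth onset")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Made-up times (days) for one state; the fourth variant caused no visible growth.
regional_intro = [100.0, 290.0, 455.0, 610.0]
hosp_growth = [110.0, 280.0, 470.0, None]
sigma_nv_hat = estimate_sigma_nv(regional_intro, hosp_growth)
```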

SamuelBrand1 · 2024-10-22T21:55:20Z

Adding options: we could consider the Dirichlet-multinomial, which is like a "noisy" (overdispersed) version of the multinomial.
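To illustrate what "noisy" means here, the sketch below samples variant counts from a plain multinomial and from a Dirichlet-multinomial with the same mean frequencies, then compares the variance of the first variant's count. The parameterisation (mean frequencies times a concentration) and all numbers are assumptions for the example, not a proposal for specific priors.

```python
import random
import statistics


def sample_dirichlet(alpha, rng):
    """Draw probabilities from Dirichlet(alpha) via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_multinomial(n, probs, rng):
    """Draw counts for n sequenced samples with category probabilities probs."""
    counts = [0] * len(probs)
    for k in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[k] += 1
    return counts


def sample_dirichlet_multinomial(n, mean_freqs, concentration, rng):
    """Multinomial counts whose probabilities are themselves Dirichlet distributed."""
    alpha = [concentration * f for f in mean_freqs]
    return sample_multinomial(n, sample_dirichlet(alpha, rng), rng)


rng = random.Random(42)
mean_freqs = [0.7, 0.2, 0.1]  # hypothetical variant frequencies
n_samples, n_draws = 200, 2000

mn_first = [sample_multinomial(n_samples, mean_freqs, rng)[0] for _ in range(n_draws)]
dm_first = [
    sample_dirichlet_multinomial(n_samples, mean_freqs, 20.0, rng)[0]
    for _ in range(n_draws)
]
var_mn = statistics.pvariance(mn_first)
var_dm = statistics.pvariance(dm_first)  # larger: extra variation beyond multinomial
```

A smaller concentration gives more overdispersion; as the concentration grows, the Dirichlet-multinomial approaches the multinomial.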

SamuelBrand1 · 2024-10-22T22:03:27Z

Fitting on the early growth rate (rather than introduction time)

Another alternative discussed f2f: instead of (i) fitting on the full variant frequency time series or (ii) fitting on just the introduction time (see above), we could connect the early exponential growth in new variant freq to growth in the hospitalisation following from the new variant.
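One way to extract an early growth rate from the frequency data is sketched below: a least-squares slope of the log-odds of the variant frequency, using only points below a cutoff. Using log-odds rather than log frequency is an assumption (the two are close at low frequency), and the function name, the 20% cutoff, and the synthetic data are all hypothetical.

```python
import math


def early_growth_rate(times, freqs, max_freq=0.2):
    """Least-squares slope of logit(freq) against time over the early phase.

    Only points with 0 < freq <= max_freq are used.
    """
    pts = [(t, math.log(f / (1 - f))) for t, f in zip(times, freqs) if 0 < f <= max_freq]
    if len(pts) < 2:
        raise ValueError("need at least two early-phase points")
    t_bar = sum(t for t, _ in pts) / len(pts)
    y_bar = sum(y for _, y in pts) / len(pts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in pts)
    sxx = sum((t - t_bar) ** 2 for t, _ in pts)
    return sxy / sxx


# Synthetic bi-weekly frequencies from logistic growth at 0.08 per day, starting at 1%.
true_rate = 0.08
logit0 = math.log(0.01 / 0.99)
times = [0, 14, 28, 42, 56]
freqs = [1 / (1 + math.exp(-(logit0 + true_rate * t))) for t in times]
rate_hat = early_growth_rate(times, freqs)
```

The estimated rate could then be compared with the growth rate of hospitalisations in the weeks after the variant appears.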

seabbs · 2024-10-23T10:17:08Z

I really like all of these ideas (though noting I don't have a totally clear view of what is in currently).

A few comments:

> we could connect the early exponential growth in new variant freq to growth in the hospitalisation following from the new variant.

Fitting to the early exponential growth seems like a strong option. The caveat here is that the early growth is often higher than the long-run growth due to biased sampling etc.

> On the other hand, the intrinsic dynamics of the model require new variants as the primary mechanism whereby a new wave is triggered; that is, the mechanistic assumption is that new waves are primarily driven by immunity loss.

It sounds like possible model fit times are coming from model-data mismatch. Could it also be the case that the current model is too rigid (especially if it is making hard assumptions about the link between variants and waves)? Perhaps adding a little more flexibility here could help without major model changes (if this isn't the case, it will just further slow things down as the model will be more complex).

SamuelBrand1 · 2024-10-23T10:46:55Z

> It sounds like possible model fit times are coming from model-data mismatch. Could it also be the case that the current model is too rigid (especially if it is making hard assumptions about the link between variants and waves)

Yes, I think this is a problem which has been revealed in posterior predictive plotting (sorry I don't have a link as I've only seen them f2f).

SamuelBrand1 · 2024-10-23T10:48:51Z

An extra consideration is that whilst the hospitalisation data is specific to the modelling scale (that is, state level), the variant freq data is actually pooled across the CDC region. I think this underlines that a more flexible generative approach would be a good idea here.

seabbs · 2024-10-23T11:09:45Z

#itssorandom

SamuelBrand1 added the "enhancement" label (New feature or request) on Sep 24, 2024

arik-shurygin added the "experiment" label (some sort of experiment to better understand a part of the model) on Oct 17, 2024
