Dcm ngm primer #53

Merged
merged 29 commits into from
Jan 21, 2025
Changes from 27 commits
Commits
803adb1
rough split of sections for primer
dinacmistry Dec 11, 2024
4426ee5
NGM for R0 vs simulation tool
dinacmistry Dec 13, 2024
0f5b172
conditions and limitations
dinacmistry Dec 31, 2024
1e4773e
dfe example
dinacmistry Dec 31, 2024
09364b4
example and formalism added
dinacmistry Jan 2, 2025
92d486a
connecting the primer example with the ngm model use in this repo
dinacmistry Jan 2, 2025
ef458ff
frequency dependent definition is better for differing population sizes
dinacmistry Jan 2, 2025
2f0b583
removing different population sizes content, still need to adjust for…
dinacmistry Jan 2, 2025
3bc5762
changing example formulation to map to repo model with different beta…
dinacmistry Jan 3, 2025
52464eb
trying to fix math rendering issues
dinacmistry Jan 6, 2025
466b795
proper latex for matrix
dinacmistry Jan 14, 2025
00aa3a2
Add how vaccination alters the proportion of susceptibles
dinacmistry Jan 15, 2025
867edab
Editing language around matrices
dinacmistry Jan 15, 2025
4a9ce78
pre-commit fix to github commits
dinacmistry Jan 15, 2025
7d740eb
not keeping everything, trying different ways to render matrices
dinacmistry Jan 15, 2025
cc5bf66
fixing mathbf calls
dinacmistry Jan 17, 2025
81105e1
writing matrices and vectors with math block for now, change back to …
dinacmistry Jan 17, 2025
abf1512
comment on large population requirement
dinacmistry Jan 17, 2025
927d931
made conditions and limitations into a list, added additional remark …
dinacmistry Jan 21, 2025
27136e5
updating branching process language
dinacmistry Jan 21, 2025
6f962f7
parentheses instead of square brackets, clearer language about infect…
dinacmistry Jan 21, 2025
b8646e9
R not K
dinacmistry Jan 21, 2025
8fff294
made linearization step in example more explicit
dinacmistry Jan 21, 2025
d2c029c
trying matrices with a math block?
dinacmistry Jan 21, 2025
f189c91
note about identity matrix
dinacmistry Jan 21, 2025
a4ba549
using math blocks for matrices
dinacmistry Jan 21, 2025
2b12aa2
note about multiple infectious states possible for NGM
dinacmistry Jan 21, 2025
199cde1
link to large domain NGM reference
dinacmistry Jan 21, 2025
4cd2eb0
spelling
dinacmistry Jan 21, 2025
99 changes: 99 additions & 0 deletions docs/primer.md
@@ -0,0 +1,99 @@
# A Primer on Next Generation Matrix Models

A Next Generation Matrix model is a way to model the expected number of infections generated by a typical infected individual in different groups or categories of the population in consecutive generations. The Next Generation Matrix (hereafter referred to as the NGM) encodes this information. NGM models are an effective way to model average dynamics in a heterogeneous population during the early growth phase and in the limit of the disease-free equilibrium.

An NGM model is related to the branching process concept of an offspring distribution generated by an individual. In this context, with multiple types of individuals, the NGM represents the expected value of the (conditional) offspring distributions from each group to each group. That is, it provides the average number of infections a typical individual in one group will cause in another.

Some classic works on NGMs are:

Diekmann, O., Heesterbeek, J.A.P. & Metz, J.A.J. On the definition and the computation of the basic reproduction ratio $R_0$ in models for infectious diseases in heterogeneous populations. J. Math. Biol. 28, 365–382 (1990). https://doi.org/10.1007/BF00178324

Diekmann O, Heesterbeek JA, Roberts MG. The construction of next-generation matrices for compartmental epidemic models. J R Soc Interface. 2010 Jun 6;7(47):873-85. https://doi.org/10.1098/rsif.2009.0386. Epub 2009 Nov 5. PMID: 19892718; PMCID: PMC2871801.

van den Driessche P, Watmough J. Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission. Math Biosci. 2002 Nov-Dec;180:29-48. doi: https://doi.org/10.1016/s0025-5564(02)00108-6. PMID: 12387915.

This primer is meant to supplement these works and articulate how the NGM can be used in a transmission model in addition to being an analytical tool.

## Use of NGM as a model
Most commonly, NGMs are used in infectious disease modeling as an analytical tool to estimate the potential for growth of a disease in a population. NGMs are particularly useful for this when a population can be split into a finite number of discrete categories with different epidemiologically relevant traits. In that case, we can define the NGM and use it to calculate the basic reproduction number $R_0$, a quantity that can provide insight about the early growth of a disease in a population and interventions that may be effective at controlling its growth. $R_0$ can be computed as the spectral radius of the NGM.

As a result, most modelers familiar with NGMs have experience with using them as an analytical tool rather than as a simulation tool. However, NGMs can also be used to approximately model the ODEs for the subsystem of infected states.
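
As a minimal sketch of the analytical use, the spectral radius of an NGM can be computed numerically with NumPy. The matrix entries and the helper name `spectral_radius` are illustrative assumptions, not values or functions taken from this repository.

```python
import numpy as np

# Hypothetical 2-group NGM (illustrative values only).
# Entry [i, j] is the mean number of infections in group i
# caused by one infected individual in group j.
R = np.array([[2.0, 0.5],
              [0.5, 1.0]])


def spectral_radius(ngm):
    """Return the largest absolute value among the eigenvalues of ngm."""
    return float(np.max(np.abs(np.linalg.eigvals(ngm))))


# R0 is the spectral radius of the NGM.
r0 = spectral_radius(R)
print(f"R0 = {r0:.4f}")
```

For this made-up matrix the dominant eigenvalue is $(3 + \sqrt{2})/2$, so the disease would be expected to grow early on.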

## Interpretation of matrix elements
Imagine we have an NGM, $\mathbf{R} = \left(R_{ij}\right)$. The elements $R_{ij}$ of this matrix can be interpreted as the average number of infections in group $i$ caused by an infected individual in group $j$ between consecutive generations in a fully susceptible population. In general, the matrix $\mathbf{R}$ is not symmetric: some groups may be more susceptible to infection or more transmissive than others, which results in an asymmetric $\mathbf{R}$.
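
To make this interpretation concrete, the sketch below multiplies a hypothetical asymmetric NGM by a vector of infections per group to get the expected infections in the next generation. All numbers are illustrative assumptions.

```python
import numpy as np

# Hypothetical asymmetric NGM: group 0 is both more susceptible (larger row)
# and more transmissive (larger column) than group 1.
R = np.array([[1.5, 0.6],
              [0.3, 0.4]])

# Generation 0: a single infected individual in group 1, none in group 0.
x0 = np.array([0.0, 1.0])

# Expected infections per group in generation 1. Because x0 is a unit vector,
# this is simply column 1 of R.
x1 = R @ x0

# Expected infections per group in generation 2.
x2 = R @ x1

print(x1, x2)
```

Reading column $j$ of $\mathbf{R}$ gives the expected infections caused by one individual in group $j$; reading row $i$ gives the infections acquired by group $i$ from each source group.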

## Formal definition
For a system of differential equations describing infectious disease dynamics, we can identify the infected subsystem that describes the production of new infections and other changes in state of infected individuals. After linearizing around the disease-free equilibrium (DFE), we can decompose the infected subsystem into two parts representing rates of transmission and transition. It is common to see the transmission component referred to as $\mathbf{T}$, the transmission matrix, and the transition component referred to as $\mathbf{\Sigma}$, the transition matrix. The Next Generation Matrix with Large domain is then defined as $\mathbf{R_L} = -\mathbf{T}\mathbf{\Sigma}^{-1}$.

The NGM $\mathbf{R}$ is the restriction of $\mathbf{R_L}$ to the subset of states-at-infection. An auxiliary matrix $\mathbf{E}$ can be defined whose columns are unit vectors, one for each non-zero row of the matrix $\mathbf{T}$. The NGM can then be computed as $\mathbf{R} = -\mathbf{E}'\mathbf{T}\mathbf{\Sigma}^{-1}\mathbf{E}$, where $\mathbf{E}'$ is the transpose of $\mathbf{E}$. It can be shown that the spectral radius of $\mathbf{R_L}$ is equal to that of $\mathbf{R}$ and that this spectral radius is $R_0$.
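
The restriction step is easiest to see when some infected states are not states-at-infection. The following sketch assumes a single-group SEIR model, which is not the model used elsewhere in this primer, with frequency-dependent transmission and made-up parameter values. The variable `E` is the auxiliary matrix, not the exposed state.

```python
import numpy as np

# Assumed single-group SEIR parameters (illustrative values only).
beta, sigma, gamma = 0.6, 0.25, 0.2

# Infected states ordered as (exposed, infectious), linearized at the DFE.
T = np.array([[0.0, beta],   # new infections enter "exposed", produced by "infectious"
              [0.0, 0.0]])   # no new infections enter "infectious" directly
Sigma = np.array([[-sigma, 0.0],     # exposed individuals leave at rate sigma
                  [sigma, -gamma]])  # they become infectious, then leave at rate gamma

# NGM with large domain: R_L = -T Sigma^{-1}.
R_L = -T @ np.linalg.inv(Sigma)

# Auxiliary matrix: one unit-vector column per non-zero row of T (here only row 0).
nonzero_rows = np.flatnonzero(np.any(T != 0, axis=1))
E = np.eye(T.shape[0])[:, nonzero_rows]

# NGM restricted to the states-at-infection: R = E' R_L E.
R = E.T @ R_L @ E


def spectral_radius(m):
    """Return the largest absolute value among the eigenvalues of m."""
    return float(np.max(np.abs(np.linalg.eigvals(m))))


print(R_L, R, spectral_radius(R_L), spectral_radius(R))
```

Here $\mathbf{R_L}$ is $2 \times 2$ while $\mathbf{R}$ is $1 \times 1$, and both have spectral radius $\beta / \gamma$.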

In most cases, more intuitive approaches can be used to define the NGM. However, the formal definition of $\mathbf{R}$ has the advantages of being more rigorous and of helping modelers identify relevant information for estimating growth dynamics.

## Conditions and limitations

Some conditions and limitations apply for NGM models to be a valid tool for estimating $R_0$ or as a simulation tool.

* __Discrete states__: The model population must be able to be divided into discrete compartments or states that are epidemiologically relevant. These strata may reflect heterogeneities in susceptibility, such as age, or health state, such as infectious and symptomatic vs. infectious and asymptomatic.
* __Disease-free equilibrium__: The NGM is defined by identifying transmission and transition dynamics of an infectious disease model near the disease-free equilibrium (DFE) and linearizing the system around that point. A disease-free equilibrium is a point of the epidemiological system where the population is free of disease, i.e., at a DFE the infectious population is zero. There can be multiple DFEs for a system; the NGM is defined at the point where the population is fully susceptible. For example, in the classic SIR model, there exists a DFE with the conditions ($S \approx N$, $I \approx 0$, $R = 0$), which leads us to the condition $$R_0 = \frac{\beta}{\gamma} > 1$$ for growth of disease in the population when we linearize the system around that point. Another DFE exists at the point where ($S = 0$, $I = 0$, $R = N$); however, this DFE is not epidemiologically relevant to disease dynamics since disease cannot grow at this point.
* __Depletion of susceptibles__: NGM models describe infectious disease dynamics as a demographic process in the sense that each consecutive generation produces new offspring infections. This can be a good approximation for dynamics early on and in the limit of a large, otherwise fully susceptible population, such that stochastic effects are negligible. However, unlike ODE models, an NGM model does not account for the fixed size of a population and cannot model the depletion of susceptibles over time.
* __Other conditions__: Entries of the NGM must be non-negative to guarantee that $R_0$ will be a single unique, positive real-valued eigenvalue of $\mathbf{R}$. In Diekmann et al. (2010), the authors note additional requirements: `For completeness we remark that in the decomposition T + Σ it is essential only that T is a non-negative matrix and that Σ is a positive off-diagonal matrix with spectral bound s(Σ)< 0`.
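
The matrix conditions in the last item can be checked numerically. The sketch below is one possible check, under the assumption that "positive off-diagonal" means the off-diagonal entries of $\mathbf{\Sigma}$ are non-negative. The function name `check_ngm_decomposition` and the example matrices are hypothetical.

```python
import numpy as np


def check_ngm_decomposition(T, Sigma, tol=1e-12):
    """Check the quoted conditions for a decomposition T + Sigma.

    Assumption: "positive off-diagonal" is read as non-negative
    off-diagonal entries of Sigma.
    """
    T = np.asarray(T, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    off_diag = Sigma - np.diag(np.diag(Sigma))
    # Spectral bound s(Sigma): the largest real part among the eigenvalues.
    spectral_bound = float(np.max(np.linalg.eigvals(Sigma).real))
    return {
        "T_nonnegative": bool(np.all(T >= -tol)),
        "Sigma_offdiag_nonnegative": bool(np.all(off_diag >= -tol)),
        "Sigma_spectral_bound_negative": bool(spectral_bound < 0),
    }


# Illustrative two-group example with a common recovery rate of 0.2.
T_example = np.array([[0.4, 0.1],
                      [0.4, 0.8]])
Sigma_example = -0.2 * np.eye(2)
result = check_ngm_decomposition(T_example, Sigma_example)
print(result)
```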

## A motivating example
The following is an example borrowed from Keeling & Rohani (2008, pp 57-63). Here, we work through a modified version in depth, with additional insights from Diekmann et al. (2010), to arrive at the NGM model of the system.

Consider the scenario of a disease spreading in a population with two categories of individuals. These two groups are differentiated by their risk for acquiring infection; there is a high-risk (H) and a low-risk (L) group. The disease progression can be described using an SIR compartmental model. An NGM is an effective way of approximating the early disease dynamics for heterogeneous systems like this. For the purposes of this example, we are considering a model with only one infectious state, but an NGM can be written for models with multiple infectious states, such as asymptomatic and infectious as well as symptomatic and infectious.

We denote the number of individuals in the high-risk group as $N_H$, and the number of individuals in the low-risk group as $N_L$. $X_H$ is the number of people in group $H$ who are in state $X$, and the total number of people in state $X$ is $X_H + X_L$. States in this model are $S$ for susceptible, $I$ for infected and infectious, and $R$ for recovered. Thus, we have $S_i + I_i + R_i = N_i$ for all subpopulations $i$ and $N = \sum_i N_i$ for a total fixed population size.

We also assume that average mixing holds for all individuals between the groups and within, i.e. no individual in either group has different contact rates than others in their group. Individuals in the two risk groups can interact with each other in some way such that an infectious individual would generate some number of new infections in the two groups. More specifically, an average infected individual in group $j$ generates $\beta_{ij}$ infections per unit time in group $i$ in a fully susceptible population.

Unlike the example in Keeling & Rohani, here we model the counts of the population in each state rather than the proportion. We also model the effective rate of transmission between groups as the product of two factors: a rate of transmission from group $j$ to group $i$, $\beta_{ij}$, and a rate of interaction based on the number of people in the population available for contact with infectious individuals, i.e., $\frac{S_i}{N}$. This follows from the frequency-dependent assumption, where the effective contact structure that generates transmission is independent of population size (the interested reader can refer to Keeling & Rohani, 2008, pp 17-18 for more details).

At any given time, there is some fraction of the population that is susceptible in group $i$ and can be infected through interaction with an infected individual in group $j$. Then the average number of infections generated in group $i$ by an infected individual in group $j$ is $\frac{\beta_{ij}S_i}{N}$ per unit time. Assuming no collision of transmission events, $I_j$ infected individuals produce $\frac{\beta_{ij}S_i I_j}{N}$ infections per unit time.

Individuals in each risk group also recover from infection at some rate $\gamma_i$. Here we assume that individuals in both groups recover at the same rate, $\gamma_H = \gamma_L = \gamma$; however, the following can be generalized to scenarios where average recovery rates of the two groups are different.

Now we can write the infected subsystem of differential equations as

$\frac{d I_H}{dt} = \frac{\beta_{HH}S_H I_H}{N} + \frac{\beta_{HL}S_H I_L}{N} - \gamma I_H$

$\frac{d I_L}{dt} = \frac{\beta_{LH}S_L I_H}{N} + \frac{\beta_{LL}S_L I_L}{N} - \gamma I_L$

or more concisely as

$\frac{d I_i}{dt} = \sum_{j} \frac{\beta_{ij}S_i I_j}{N} - \gamma I_i$
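
The concise form maps directly onto a vectorized right-hand side. The sketch below is a hypothetical helper (`infected_rhs` is not a function from this repository) with made-up parameter values, with groups ordered as (H, L).

```python
import numpy as np


def infected_rhs(I, S, beta, gamma, N):
    """Return dI_i/dt = sum_j beta[i, j] * S[i] * I[j] / N - gamma * I[i]."""
    I = np.asarray(I, dtype=float)
    S = np.asarray(S, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # (beta @ I)[i] is sum_j beta[i, j] * I[j]; multiplying by S[i] / N
    # gives the rate of new infections in group i.
    return S * (beta @ I) / N - gamma * I


# Illustrative values: N_H = 200, N_L = 800, one initial infection in group H.
beta = np.array([[2.0, 0.5],
                 [0.5, 1.0]])
gamma = 0.2
N = 1000.0
S = np.array([199.0, 800.0])
I = np.array([1.0, 0.0])

dI_dt = infected_rhs(I, S, beta, gamma, N)
print(dI_dt)
```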

Linearizing the system at the DFE where $S_i \approx N_i$, we can write

$\frac{d I_i}{dt} = \sum_{j} \frac{\beta_{ij}N_i I_j}{N} - \gamma I_i$


From here we can decompose the system into transmission and transition components, $\mathbf{T}$ and $\mathbf{\Sigma}$, respectively.

Let
```math
\mathbf{x} = \begin{pmatrix}I_H\\I_L\end{pmatrix}
```

$$\mathbf{T} = \left(T_{ij}\right)$$
with $T_{ij} = \frac{\beta_{ij}N_i}{N}$
and
$$\mathbf{\Sigma} = -\gamma \mathbb{I}_2$$

where $\mathbb{I}_2$ is the identity matrix with dimension 2. Then we can write the infected subsystem as $\frac{d\mathbf{x}}{dt} = (\mathbf{T} + \mathbf{\Sigma})\mathbf{x}$. The NGM can be defined as $\mathbf{R} = -\mathbf{E}'\mathbf{T}\mathbf{\Sigma}^{-1}\mathbf{E}$.

For this system, the auxiliary matrix is
```math
\mathbf{E} = \begin{pmatrix}1 & 0\\0 & 1\end{pmatrix}
```
with unit vector
```math
\begin{pmatrix}1\\0\end{pmatrix}
```
for the state $I_H$ and unit vector
```math
\begin{pmatrix}0\\1\end{pmatrix}
```
for the state $I_L$ in the transmission matrix $\mathbf{T}$.
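
The construction can be checked numerically. The sketch below builds $\mathbf{T}$, $\mathbf{\Sigma}$ and $\mathbf{E}$ for made-up group sizes and rates (assumptions for illustration only) and compares the result with $\mathbf{T}/\gamma$, which is what $-\mathbf{T}\mathbf{\Sigma}^{-1}$ reduces to when $\mathbf{\Sigma} = -\gamma \mathbb{I}_2$.

```python
import numpy as np

# Illustrative values, groups ordered as (H, L).
N_groups = np.array([200.0, 800.0])  # (N_H, N_L)
N = N_groups.sum()
beta = np.array([[2.0, 0.5],
                 [0.5, 1.0]])
gamma = 0.2

# Transmission matrix: T_ij = beta_ij * N_i / N (row i scaled by N_i / N).
T = beta * N_groups[:, None] / N

# Transition matrix: both groups recover at rate gamma.
Sigma = -gamma * np.eye(2)

# Both infected states are states-at-infection, so E is the identity.
E = np.eye(2)

# NGM from the formal definition.
R = -E.T @ T @ np.linalg.inv(Sigma) @ E

# With Sigma = -gamma * I, the definition reduces to T / gamma.
R_closed = beta * N_groups[:, None] / (gamma * N)

print(R)
```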

Then the NGM can be defined as $\mathbf{R}$ with elements $R_{ij} = \frac{\beta_{ij}N_i}{\gamma N}$. This is the formulation used for the input NGM in the widget, noting the implicit assumption that the user has provided entries to the input NGM that factor in population sizes. Vaccination alters the proportion of susceptible individuals that may become infected in each group, thus the rows of the input NGM are multiplied by the remaining proportion susceptible after vaccination.
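
The row scaling for vaccination can be sketched as follows. The function name `scale_by_susceptible` and the numbers are hypothetical and do not describe the widget's actual implementation.

```python
import numpy as np


def scale_by_susceptible(ngm, prop_susceptible):
    """Multiply row i of the NGM by the proportion of group i still susceptible."""
    ngm = np.asarray(ngm, dtype=float)
    p = np.asarray(prop_susceptible, dtype=float)
    # p[:, None] broadcasts along columns, so each row i is scaled by p[i].
    return ngm * p[:, None]


# Illustrative input NGM that already factors in population sizes.
R_input = np.array([[2.0, 0.5],
                    [2.0, 4.0]])

# Assumed proportions remaining susceptible after vaccination, per group.
prop_susceptible = np.array([0.5, 0.75])

R_vax = scale_by_susceptible(R_input, prop_susceptible)
print(R_vax)
```

Rows are scaled rather than columns because row $i$ holds the infections acquired by group $i$, which is the quantity reduced when fewer individuals in that group are susceptible.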