rEpiabm enables users familiar with R to use Epiabm. Epiabm is a simulation tool that models the progress of an epidemic across a specified region of interest within a specific timeframe. It has been developed in Python (PyEpiabm) for small-scale implementations and C++ (cEpiabm) for fast, large-scale simulations. PyEpiabm has a modular design, with many options to configure specific requirements.
To model an epidemic, contact events occur within the population spatial structure (see Figure 1), and a compartmental model describes the progression of the disease within-host (see Figure 2). Both components are highly configurable, which allows a wide range of simulation scenarios to be studied.
[Figure 1: Population spatial structure]
[Figure 2: Infection progression]
The basic flow of a simulation is described below. More detailed, complex examples are illustrated in this Jupyter notebook, which uses EpiGeoPop to extract the data, and in this Jupyter notebook, which uses a toy population to experiment with. We give instructions to run a basic simulation for both a toy population and a population extracted by EpiGeoPop, using 'Andorra' as an example of the region of interest. The Wiki details the optional parameters available to the user, as well as parameters whose values are listed but which we do not recommend changing.
Before running a simulation, rEpiabm needs to be installed with all dependencies mentioned in the DESCRIPTION file. Also, the input folder structure used by the R program file needs to be set up.
Instructions:
- Clone the GitHub rEpiabm repository
- Create a GitHub Personal Access Token (fine-grained)
- Configure RStudio with your token
- Install required R packages
install.packages("devtools")
devtools::install_github("SABS-R3-Epidemiology/rEpiabm")
- You now have two different simulation options:
5.1 An EpiGeoPop-based simulation
This uses real data to create the population spatial structure: copy the example Andorra folder structure within the data folder and name it with your region of interest. Include the .json file, as you will need to edit this later for your simulation.
OR
5.2 A toy simulation
Users can specify population parameter values (usually small quantities) to create the population spatial structure: copy the example toy folder structure within the data folder and name it with your region of interest. Include the .json file, as you will need to edit this later for your simulation.
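The folder copy described above can also be done from R. The following is a minimal sketch, not part of rEpiabm: the helper name copy_example_folder is hypothetical, and it assumes the example folders sit directly under a data directory whose path you pass in.

```r
# Hypothetical helper (not part of rEpiabm): copy an example folder,
# e.g. "Andorra" or "toy", to a new folder named after your region.
copy_example_folder <- function(data_dir, example, region) {
  source_dir <- file.path(data_dir, example)
  target_dir <- file.path(data_dir, region)
  if (!dir.exists(source_dir)) {
    stop("Example folder not found: ", source_dir)
  }
  if (dir.exists(target_dir)) {
    stop("Target folder already exists: ", target_dir)
  }
  dir.create(target_dir)
  # Copy every top-level entry (files and sub-folders, including the .json file)
  contents <- list.files(source_dir, full.names = TRUE,
                         all.files = TRUE, no.. = TRUE)
  file.copy(contents, target_dir, recursive = TRUE)
  invisible(target_dir)
}

# Example call (assumes a "data" folder in the working directory):
# copy_example_folder("data", "Andorra", "Luxembourg")
```

The helper refuses to overwrite an existing region folder, so an earlier configuration is not lost by accident.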
You are now ready to generate or configure the population for your simulation.
As shown in Figure 1, the region of interest is broken into a spatial structure:
- Cells - largest areas, based on a fixed width
- Microcells - each cell is split into microcells, smaller areas that contain the individuals
- Households - all individuals are assigned to one household and do not move households during the simulation.
- Places - spaces where individuals might meet individuals from other households, for example a workplace or a public park.
Important
Follow the instructions in 2.1 Using EpiGeoPop to extract a population with EpiGeoPop, or in 2.2 Using User-defined values to define a toy population.
2.1 Using EpiGeoPop
The structure is created using EpiGeoPop. The user states a region of interest, Oxford or UK for example, and the tool extracts information from Natural Earth and JRC, providing a csv file as output. This file contains one line per microcell for each cell, with the number of households, places and individuals to be used in the simulation (the number of individuals is taken from census data).
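To make the "one line per microcell" layout concrete, here is a small sketch in R. The column names and values are illustrative assumptions, not necessarily those written by EpiGeoPop; check your extracted file for the actual header.

```r
# Hypothetical microcell table: one row per microcell within each cell.
# Column names and numbers are made up for illustration.
microcells <- data.frame(
  cell        = c(1L, 1L, 2L, 2L),
  microcell   = c(1L, 2L, 1L, 2L),
  households  = c(5L, 4L, 6L, 3L),
  places      = c(1L, 0L, 2L, 1L),
  individuals = c(12L, 9L, 15L, 7L)
)

# Totals per cell, summing over the microcells it contains
per_cell <- aggregate(cbind(households, places, individuals) ~ cell,
                      data = microcells, FUN = sum)

# Size of the whole simulated population
total_population <- sum(microcells$individuals)
```

Summaries like per_cell are a quick way to confirm that an extracted file looks plausible before running a simulation on it.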
Instructions:
- Go to the EpiGeoPop repository and follow the instructions to extract a csv file of your required region.
Note
The .json file, which you amend to include the name of your country, also needs the proportion of households with 1 individual, 2 individuals, 3 individuals... up to 10 individuals. This information is usually found using census data (or equivalent) for your region. Amend the .json file as described in this Jupyter notebook.
- Copy the extracted file to the new folder data/<your_country>/inputs
Caution
At the time of writing, the tool did not extract the data successfully. Please follow the instructions in this Jupyter notebook.
In summary, the spatial structure for a region is generated using EpiGeoPop. This tool exports into a csv file the number of households, places, and individuals for each microcell. It also produces a population density map in outputs/countries/<your_country>.pdf; the example of Andorra is shown in Figure 3.
2.2 Using User-defined values
The user defines population values to generate a toy population spatial structure for the simulation. No region needs to be specified, but the population can be given a name to distinguish different simulation runs. This option is commonly used to experiment with different simulation configurations on small populations.
Instructions:
- Open simulation.R and amend the following parameters (these are the default values):
population_size = as.integer(100),
cell_number = as.integer(2),
microcell_number = as.integer(2),
household_number = as.integer(5),
place_number = as.integer(2),
population_seed = as.integer(42)
- Save simulation.R
Note
Please see more detailed instructions in this Jupyter notebook.
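Because as.integer() silently truncates non-whole numbers (as.integer(2.7) gives 2), it can help to check the toy values before running. The sketch below is an assumption-laden illustration: check_toy_parameters is a hypothetical helper, not part of rEpiabm, and it only checks that each value is a single integer and that every value other than the seed is positive.

```r
# Default toy population values from the instructions above
toy_parameters <- list(
  population_size  = as.integer(100),
  cell_number      = as.integer(2),
  microcell_number = as.integer(2),
  household_number = as.integer(5),
  place_number     = as.integer(2),
  population_seed  = as.integer(42)
)

# Hypothetical helper (not part of rEpiabm): basic sanity checks
check_toy_parameters <- function(params) {
  for (name in names(params)) {
    value <- params[[name]]
    if (!is.integer(value) || length(value) != 1L || is.na(value)) {
      stop(name, " must be a single integer, e.g. as.integer(100)")
    }
    # The seed may be any integer; all other values must be positive
    if (name != "population_seed" && value <= 0L) {
      stop(name, " must be positive")
    }
  }
  invisible(TRUE)
}

check_toy_parameters(toy_parameters)
```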
Once the data for your region has either been extracted or configured using user-defined values, the simulation can be configured. Again, follow 3.1 Using EpiGeoPop population to configure the other simulation parameters for your region OR follow 3.2 Using a toy population.
3.1 Using EpiGeoPop population
The following parameters are essential and need to be stated by the user to run a simulation:
- Path to the csv file from EpiGeoPop
- Number of infected individuals (Imild: see Figure 2.): enter the number of infected individuals at the start of the simulation.
- Proportion of households with 1 individual, 2 individuals, 3 individuals... up to 10 individuals. This information is usually found using census data (or equivalent) for your country.
- Duration of the simulation (in days)
- Select any output options required
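The household size proportions listed above can be derived from raw census counts. Here is a minimal sketch in R, assuming you have a count of households for each size from 1 to 10; the counts shown are made-up illustrative numbers and size_distribution is a hypothetical helper, not part of rEpiabm.

```r
# Hypothetical helper: turn household counts (sizes 1 to 10) into proportions
size_distribution <- function(counts) {
  stopifnot(is.numeric(counts), length(counts) == 10L,
            all(counts >= 0), sum(counts) > 0)
  counts / sum(counts)
}

# Made-up census counts for households of size 1, 2, ..., 10
census_counts <- c(300, 320, 160, 140, 50, 20, 6, 2, 1, 1)

household_size_distribution <- size_distribution(census_counts)

# The proportions should sum to 1 (up to floating-point rounding)
stopifnot(isTRUE(all.equal(sum(household_size_distribution), 1)))
```

The resulting ten values are what would go into the household size distribution array in the .json file.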
Instructions:
- Open Andorra_parameters.json (copied from Andorra in Step 1 above) and save with <your_country>'s name (keep first letter capitalised).
- Amend the parameter array household_size_distribution to have your country's distribution of households
Warning
Make sure these match the values used to extract your population data in Step 2.
- Open simulation_epigeopop.R and amend:
  - input_dir: the absolute path to your csv file exported from EpiGeoPop
  - initial_infected: enter the number of infected individuals at the start of the simulation.
  - duration: enter the time for the simulation to run (in days)
  - Andorra in final line: change to <your_country>.
More detailed instructions are available in this Jupyter notebook, and further optional parameters are described in the Wiki.
3.2 Using a toy population
The following parameters are essential and need to be stated by the user to run a simulation:
- Number of infected individuals (Imild: see Figure 2.): enter the number of infected individuals at the start of the simulation.
- Duration of the simulation (in days)
Instructions:
- Open simulation_toy.R and amend:
  - initial_infected: enter the number of infected individuals at the start of the simulation.
  - duration: enter the time for the simulation to run (in days)
More detailed instructions are available in this Jupyter notebook, and further optional parameters are described in the Wiki.
Common adjustments:
- Initial infections: by default, infected individuals are distributed across all cells at the start; you may want to place them in a single cell.
- Maximum infection radius: this sets the maximum distance over which the infection can spread from cell to cell
- Age distribution used, if required
- Outputs to evaluate simulation (see Step 5: Evaluate Results)
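To illustrate what a maximum infection radius means, the sketch below works out which pairs of cells lie within a given distance of each other. It is a conceptual illustration only: the cell coordinates, the units, and the use of straight-line distance between cell centres are assumptions, not a description of how Epiabm computes this internally.

```r
# Hypothetical cell centres (arbitrary units)
cells <- data.frame(
  id = 1:4,
  x  = c(0, 1, 3, 0),
  y  = c(0, 0, 0, 4)
)

max_radius <- 2  # hypothetical maximum infection radius

# Pairwise straight-line distances between cell centres
distances <- as.matrix(dist(cells[, c("x", "y")]))

# TRUE where infection could spread between two different cells
reachable <- distances <= max_radius
diag(reachable) <- FALSE  # a cell is not its own neighbour here

# Pairs of cell ids within the radius (each pair listed once)
pairs <- which(reachable & upper.tri(reachable), arr.ind = TRUE)
```

With these values only cells 1 and 2, and cells 2 and 3, are close enough; cell 4 is isolated, so a smaller radius restricts between-cell spread.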
Once configured, the simulation takes the generated population and performs the following ‘sweeps’:
Initialisation sweeps:
- InitialHouseholdSweep - Assign individuals to households
- InitialisePlaceSweep - Assign individuals to places
- InitialInfectedSweep - Assign which individuals are initially infected
Note
There are optional modules, such as recording demographics, which are described in the Wiki.
Simulation sweeps:
Each individual’s location and infection status are updated each day:
- UpdatePlaceSweep - Account for movement of individuals by refreshing their ‘place’ assignments
- Check each infected individual to see if they infect others:
  - HouseholdSweep - At a household
  - PlaceSweep - At a place
  - SpatialSweep - Between cells
- QueueSweep - Any successful infections will update the newly infected person’s status from S (Susceptible) to E (Exposed).
- HostProgressionSweep - Each individual’s infection progress is updated using the compartmental model
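The daily order of sweeps above can be sketched as a toy loop in R. This is an illustrative simplification, not the Epiabm implementation: it models only household transmission (places and spatial spread are omitted), uses a reduced S, E, I, R set of statuses, and all probabilities are made-up values.

```r
set.seed(42)

n <- 100                                   # toy population size
household <- rep(seq_len(25), each = 4)    # 25 households of 4 individuals
status <- rep("S", n)
status[sample(n, 5)] <- "I"                # 5 initially infected individuals

days <- 30
p_household <- 0.10   # made-up daily infection probability per housemate
p_progress  <- 0.30   # made-up daily probability of moving E -> I
p_recover   <- 0.10   # made-up daily probability of moving I -> R

levels_used <- c("S", "E", "I", "R")
history <- matrix(0L, nrow = days, ncol = 4,
                  dimnames = list(NULL, levels_used))

for (day in seq_len(days)) {
  # Household sweep: infectious individuals may infect susceptible housemates
  queue <- integer(0)
  for (i in which(status == "I")) {
    mates <- which(household == household[i] & status == "S")
    hit <- mates[runif(length(mates)) < p_household]
    queue <- union(queue, hit)
  }

  # Queue sweep: successful infections move from S to E
  status[queue] <- "E"

  # Host progression sweep: choose who moves on, then update statuses
  progressing <- which(status == "E" & runif(n) < p_progress)
  recovering  <- which(status == "I" & runif(n) < p_recover)
  status[progressing] <- "I"
  status[recovering]  <- "R"

  # Record the number of individuals in each status for this day
  history[day, ] <- as.integer(table(factor(status, levels = levels_used)))
}
```

Collecting infections in a queue before applying them means every infectious individual is evaluated against the same start-of-day population, which is the role the queue step plays in the list above.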
An overview of the program workflow is illustrated in Figure 4.
Figure 4. Overview of simulation workflow: These steps are required to run a simulation.
Instructions:
- After saving the configured file, either simulation_epigeopop.R or simulation_toy.R, run this code!
A simulation produces one csv output file by default, found in the directory data/<your_country>/simulation_outputs. This file contains the number of individuals for each infection status (S, E, Imild, etc) for each day.
It also produces three common plots:
- a SImildRD plot, which shows the overall progression of each status for the duration of the simulation.
- a Reproduction number, Rt, plot, which shows the trajectory of the recorded case reproduction number for the duration of the simulation.
- a Serial-Interval distribution plot, an approximation of the effective serial-interval distribution of the model population.
Further optional files are available; details are described in the Wiki. These data files can be used to produce plots for further analysis.
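As a starting point for further analysis, the default csv output can be read back into R. The sketch below writes a small made-up file first so that it runs on its own; the file name and the column names (time, S, E, I_mild, R) are assumptions, so adjust them to match the header of your actual output file.

```r
# Made-up example data standing in for a simulation output file
example <- data.frame(
  time   = 0:4,
  S      = c(95, 90, 80, 72, 70),
  E      = c(0, 4, 8, 6, 3),
  I_mild = c(5, 6, 11, 18, 15),
  R      = c(0, 0, 1, 4, 12)
)
csv_path <- file.path(tempdir(), "example_output.csv")
write.csv(example, csv_path, row.names = FALSE)

# Read the output and summarise it
results <- read.csv(csv_path)

# Day on which the number of mildly infected individuals peaks
peak_day <- results$time[which.max(results$I_mild)]

# Total individuals per day across all statuses (should stay constant)
status_columns <- setdiff(names(results), "time")
daily_total <- rowSums(results[, status_columns])
```

From here, base R functions such as matplot() can be used to draw the status curves against time.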