From 2fca52352ee63c1b8243de68f3dac915849bda32 Mon Sep 17 00:00:00 2001
From: Anita Applegarth
Date: Mon, 20 Jan 2025 15:20:11 +0000
Subject: [PATCH] Updated links and formatted pictures

---
 README.md                    |  25 +++--
 walk_through/epigeopop.ipynb | 201 +++++++++++++++++++++++++++++++++++
 2 files changed, 213 insertions(+), 13 deletions(-)
 create mode 100644 walk_through/epigeopop.ipynb

diff --git a/README.md b/README.md
index 6003b72..3fcce6f 100644
--- a/README.md
+++ b/README.md
@@ -1,26 +1,25 @@
+ADD BADGES
 # rEpiabm
 rEpiabm enables users familiar with R to use Epiabm (ADD LINK). Epiabm is a simulation tool that models the progress of an epidemic across a specified region of interest within a specific timeframe. It has been developed in python for small-scale implementations and C++ for fast, large-scale simulations. PyEpiabm design is modular, with many options to configure specific requirements.
 ## Summary of Epiabm functionality
 ### Basic Architecture
-To model an epidemic, contact events are represented by the population spatial structure. The transmission of the disease and its progression within host is represented by a compartment model. These two architectures are highly configurable; this allows us to study a wide range of simulation scenarios.
+To model an epidemic, contact events are represented by the population spatial structure (see Figure 1). The transmission of the disease and its progression within the host are represented by a compartment model (see Figure 2). These two architectures are highly configurable; this allows us to study a wide range of simulation scenarios.
-

Figure 1: Population Spatial Structure

- Population spatial structure -

The environment is modelled using EpiGeoPop, which takes a region of interest, creates layers of sub-regions of different types and populates these with individuals.

+ Population spatial structure +

Figure 1. Population Spatial Structure: The environment is modelled using EpiGeoPop, which takes a region of interest, creates layers of sub-regions of different types and populates these with individuals.

-

Figure 2: Infection Progression

- Infection progression -

The infection progression is represented using a compartment model which tracks the daily progress of the disease within an individual.

+ Infection progression +

Figure 2. Infection Progression: The infection progression is represented using a compartment model which tracks the daily progress of the disease within an individual.

 ## Running a simulation
-The overview below describes the user-input needed to run a basic simulation, using default values for parameters for other options. There is a comprehensive jupyter notebook showing a detailed, more complex example here (ADD LINK).
+The overview below describes how the simulation works and the user input required to run a basic simulation. A more detailed, complex example is illustrated in this Jupyter notebook (ADD LINK). [The Wiki](https://github.com/SABS-R3-Epidemiology/epiabm/wiki/Overview-of-the-Ferguson-Model) also details the optional parameters available to the user, as well as parameters whose values are listed but which are not recommended to be changed.
 ### Step 1: Use EpiGeoPop to generate the population spatial structure
 As shown in Figure 1, the region of interest is broken into a spatial structure:
@@ -29,9 +28,9 @@ As shown in Figure 1, the region of interest is broken into a spatial structure:
 * *Households* - quantity per microcell is based on a probabilistic distribution. All individuals are assigned to one household and do not move households during the simulation.
 * *Places* - quantity per microcell is based on a probabilistic distribution. These are spaces where individuals might meet other individuals from different households, a workplace or a public park for example.
-This structure is created using [EpiGeoPop](https://github.com/SABS-R3-Epidemiology/EpiGeoPop). The user states a region of interest, Oxford or UK for example, and the tool creates a csv file as an output. This file contains one line per microcell for each cell, with the number of households, places and individuals to be used in the simulation. The quantity of individuals are extracted for the region using Census data.
+This structure is created using [EpiGeoPop](https://github.com/SABS-R3-Epidemiology/EpiGeoPop). 
The user states a region of interest, such as Oxford or the UK, and the tool extracts information from [Natural Earth](https://www.naturalearthdata.com/) and [JRC](https://data.jrc.ec.europa.eu/csv), providing a csv file as output. This file contains one line per microcell for each cell, with the number of households, places and individuals to be used in the simulation (the number of individuals is extracted from Census data).
-In summary, at the end of this step, we have a spatial structure with details on the number of cells and microcells. Within each microcell, we know the number of households, places, and the number of individuals. This is exported as a csv file.
+In summary, the spatial structure for a region is generated using EpiGeoPop. This tool exports the number of households, places, and individuals for each microcell into a csv file.
 ### Step 2: Configure the simulation
 The following parameters are essential and need to be stated by the user to run a simulation:
@@ -42,7 +41,7 @@ The following parameters are essential and need to be stated by the user to run
 * Time for the simulation to run (in days)
 * Select any output options required
-There are many further optional parameters which are described in detail here (ADD link to wiki)
+There are many further optional parameters, which are described in [the Wiki](https://github.com/SABS-R3-Epidemiology/epiabm/wiki/Overview-of-the-Ferguson-Model).
 **Common adjustments:**
 * At the start, infected individuals are distributed across all cells by default, you may want to put them in one cell.
@@ -58,7 +57,7 @@ Once configured, the simulation takes the generated population and performs the
 * InitialisePlaceSweep - Assign individuals to places
 * InitialInfectedSweep - Assign which individuals are initially infected
-There are optional modules such as recording demographics, which are described in detail here (ADD link to wiki). 
+There are optional modules, such as recording demographics, which are described in [the Wiki](https://github.com/SABS-R3-Epidemiology/epiabm/wiki/Overview-of-the-Ferguson-Model).
 **Simulation sweeps:**
 Individual’s location and infection status is updated each day:
@@ -75,5 +74,5 @@ A simulation produces one output file by default:
 Infection status (S, E, Imild, etc) for each day by cell
-Further optional files are available, details described here (ADD link to wiki) or see jupyter notebook with a detailed illustration here (ADD LINK). These data files can be used to produce plots for further analysis.
+Further optional files are available; they are described in [the Wiki](https://github.com/SABS-R3-Epidemiology/epiabm/wiki/Overview-of-the-Ferguson-Model), or see the Jupyter notebook with a detailed illustration here . These data files can be used to produce plots for further analysis.
diff --git a/walk_through/epigeopop.ipynb b/walk_through/epigeopop.ipynb
new file mode 100644
index 0000000..7754bbc
--- /dev/null
+++ b/walk_through/epigeopop.ipynb
@@ -0,0 +1,201 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Clone the GitHub repository EpiGeoPop - I did it using VSCode"
+   ]
+  },
+  {
+   "attachments": {
+    "image.png": {
+     "image/png": ""
+    }
+   },
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If you move the directory, the structure should be as below:\n",
+    "\n",
+    "![image.png](attachment:image.png)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Create a Python 3.11 environment using the command below (it will not run on Python 3.12):\n",
+    "On a Mac:\n",
+    "/usr/local/opt/python@3.11/bin/python3.11 -m venv .venv"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Activate the environment:\n",
+    "source .venv/bin/activate"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Install the dependencies:\n",
+    "pip install -r requirements.txt"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "It may crash with an error ending:\n",
+    " :78: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead\n",
+    " WARNING:root:Failed to get options via gdal-config: [Errno 2] No such file or directory: 'gdal-config'\n",
+    " CRITICAL:root:A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.\n",
+    " [end of output]\n",
+    "\n",
+    " note: This error originates from a subprocess, and is likely not a problem with pip.\n",
+    "error: subprocess-exited-with-error\n",
+    "× Getting requirements to build wheel did not run successfully.\n",
+    "│ exit code: 1\n",
+    "╰─> See above for output.\n",
+    "note: This error originates from a subprocess, and is likely not a problem with pip.\n",
+    "\n",
+    "\n",
+    "This is because Fiona, which is a Python package for reading and writing spatial data, requires GDAL (Geospatial Data Abstraction Library) to be installed on your system first.\n",
+    "\n",
+    "On a Mac, you can install it as follows:\n",
+    "brew install gdal\n",
+    "export GDAL_CONFIG=/usr/local/bin/gdal-config\n",
+    "export GDAL_VERSION=$(gdal-config --version)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To select a region of interest, go to configs/countries\n",
+    "Copy one of the parameter json files\n",
+    "Change line 16 to the country of your choice, for example:\n",
+    " \"country\": \"Andorra\",\n",
+    "\n",
+    "Save file"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Next, the file \"prep.sh\" downloads data from two different websites. 
However, the websites have been updated, so the following changes need to be made:\n",
+    "comment out the first 'curl' line\n",
+    "\n",
+    "curl -O https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_MT_GLOBE_R2019A/GHS_POP_E2015_GLOBE_R2019A_4326_30ss/V1-0/GHS_POP_E2015_GLOBE_R2019A_4326_30ss_V1_0.zip\n",
+    "and replace with \n",
+    "\n",
+    "curl -O https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_GLOBE_R2023A/GHS_POP_E2025_GLOBE_R2023A_4326_30ss/V1-0/GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.zip\n",
+    "\n",
+    "Then copy the filename from the end of the path, GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.zip, \n",
+    "and replace the filename after the 'unzip' command:\n",
+    "\n",
+    "unzip GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.zip\n",
+    "\n",
+    "Finally, comment out the remaining lines of code. We will need to download the files from the website directly.\n",
+    "\n",
+    "Save the file and run it:\n",
+    "bash prep.sh"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Go here https://www.naturalearthdata.com/downloads/10m-cultural-vectors/\n",
+    "and click on the link:\n",
+    "Download without boundary lakes\n",
+    "https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries_lakes.zip\n",
+    "and save in the folder (which was created by prep.sh):\n",
+    "(data/raw)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "These datafiles will be used by Snakefile to create the csv file. 
However, the Snakefile needs to be amended as follows:\n",
+    "Open the Snakefile, amend row 8 to be the country of your choice (replace Luxembourg)\n",
+    "\n",
+    " \"data/processed/countries/Luxembourg_microcells.csv\",\n",
+    "\n",
+    " comment out row 9:\n",
+    " \"data/processed/countries/Luxembourg_pop_dist.json\",\n",
+    " \n",
+    " comment out row 19:\n",
+    " \"outputs/dag.pdf\"\n",
+    "\n",
+    " comment out the first rule:\n",
+    " rule render_dag:\n",
+    " input:\n",
+    " \"Snakefile\"\n",
+    " output:\n",
+    " \"outputs/dag.pdf\"\n",
+    " shell:\n",
+    " \"snakemake --dag | dot -Tpdf > outputs/dag.pdf\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Replace the 'tif' file references in rows 31, 40 and 49 with the following:\n",
+    "\n",
+    "GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.tif\n",
+    "\n",
+    "as this is the new file downloaded using prep.sh"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Finally, scroll to the bottom of the Snakefile and comment out the following:\n",
+    "\n",
+    "rule make_pop_dist:\n",
+    " input:\n",
+    " \"data/raw/WPP2022_PopulationByAge5GroupSex_Medium.csv\",\n",
+    " \"configs/{region}/{place}_parameters.json\"\n",
+    " output:\n",
+    " \"data/processed/{region}/{place}_pop_dist.json\"\n",
+    " script:\n",
+    " \"scripts/get_pop_dist.py\"\n",
+    "\n",
+    "as this is not needed."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Run the tool using:\n",
+    "snakemake --cores 1"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The program will show an error when it tries to make output.dag - please ignore.\n",
+    "\n",
+    "However, it should produce an output file in the data/processed folder:\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
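
Note on the walk-through (not part of the patch): the prep.sh edit and the Snakefile edit both reuse the file name at the end of the new GHSL download URL. The sketch below shows that bookkeeping in Python. The helper name `ghsl_filenames` is hypothetical, the URL is the one quoted in the notebook, and the sketch assumes, as the notebook states, that the archive yields a .tif file with the same stem.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# URL quoted in the notebook for the updated GHSL population raster.
GHSL_URL = (
    "https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/"
    "GHS_POP_GLOBE_R2023A/GHS_POP_E2025_GLOBE_R2023A_4326_30ss/V1-0/"
    "GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.zip"
)


def ghsl_filenames(url: str) -> tuple[str, str]:
    """Return (zip_name, tif_name) derived from a GHSL download URL.

    Hypothetical helper: it assumes the .tif inside the archive shares
    the archive's stem, which is what the notebook describes.
    """
    # The archive name is the last component of the URL path.
    zip_name = PurePosixPath(urlparse(url).path).name
    # Swap the .zip suffix for .tif to get the raster name.
    tif_name = PurePosixPath(zip_name).with_suffix(".tif").name
    return zip_name, tif_name


zip_name, tif_name = ghsl_filenames(GHSL_URL)
print(f"prep.sh:   unzip {zip_name}")
print(f"Snakefile: {tif_name}")
```

Deriving both names from the one URL keeps the `unzip` line in prep.sh and the three 'tif' references in the Snakefile in step if the GHSL release changes again.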