Commit d3d0aba (1 parent: f5d0194). Showing 2 changed files with 15 additions and 1 deletion.
@@ -1 +1,15 @@
# RL-Epidemic-Benchmark

PyTorch implementation of the paper:

- **Title**: *Planning Multiple Epidemic Interventions with Reinforcement Learning*
- **ArXiv**: [coming soon]
<!-- - **Authors**: -->
<!-- - **Conference**: -->
<!-- - More details: -->

### Abstract

Combating an epidemic entails finding a plan that describes when and how to apply different interventions, such as mask-wearing mandates, vaccinations, and school or workplace closures. An optimal plan will curb an epidemic with minimal loss of life, disease burden, and economic cost. Finding an optimal plan is an intractable computational problem in realistic settings. Policy-makers, however, would greatly benefit from tools that can efficiently search for plans that minimize disease and economic costs, especially when considering multiple possible interventions over a continuous and complex action space given a continuous and equally complex state space. We formulate this problem as a Markov decision process. Our formulation is unique in its ability to represent multiple continuous interventions over any disease model defined by ordinary differential equations. We illustrate how to effectively apply state-of-the-art actor-critic reinforcement learning algorithms (PPO and SAC) to search for plans that minimize overall costs. We empirically evaluate the learning performance of these algorithms and compare their performance to hand-crafted baselines that mimic plans constructed by policy-makers. Our method outperforms these baselines. Our work confirms the viability of a computational approach to support policy-makers.

*Figure: The agent-environment interaction in the Markov decision process formalizing the epidemic planning problem.*
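
To make the formulation in the abstract concrete, here is a minimal sketch of an epidemic planning problem written as a Markov decision process. It is not the code from this repository: the class `SIREpidemicEnv`, the helper `rollout`, the SIR disease model, the two interventions (distancing and vaccination), and every parameter value and cost weight are assumptions chosen for illustration. The state is the compartment vector of an ODE model, the action is a vector of continuous intervention intensities in [0, 1], and the reward is the negative of disease cost plus economic cost.

```python
from dataclasses import dataclass


@dataclass
class SIREpidemicEnv:
    """Toy SIR model with two continuous interventions (all values are assumed)."""

    beta: float = 0.3            # baseline transmission rate per day
    gamma: float = 0.1           # recovery rate per day
    max_vacc_rate: float = 0.01  # max fraction of susceptibles vaccinated per day
    distancing_cost: float = 0.05   # economic cost weight of full distancing per day
    vaccination_cost: float = 0.01  # economic cost weight of full vaccination per day
    dt: float = 0.1              # Euler step size in days
    horizon: int = 100           # number of daily decision steps per episode

    def reset(self):
        # State is the population fraction in each compartment (S, I, R).
        self.s, self.i, self.r = 0.99, 0.01, 0.0
        self.t = 0
        return (self.s, self.i, self.r)

    def step(self, action):
        # Action is (distancing, vaccination); each intensity is clipped to [0, 1].
        distancing, vaccination = (min(max(a, 0.0), 1.0) for a in action)
        new_infections = 0.0
        # Integrate the ODEs over one day with explicit Euler steps.
        for _ in range(round(1.0 / self.dt)):
            infect = (1.0 - distancing) * self.beta * self.s * self.i
            vacc = vaccination * self.max_vacc_rate * self.s
            recover = self.gamma * self.i
            self.s += self.dt * (-infect - vacc)
            self.i += self.dt * (infect - recover)
            self.r += self.dt * (recover + vacc)
            new_infections += self.dt * infect
        self.t += 1
        # Reward is the negative of disease burden plus intervention cost.
        cost = (new_infections
                + self.distancing_cost * distancing
                + self.vaccination_cost * vaccination)
        done = self.t >= self.horizon
        return (self.s, self.i, self.r), -cost, done


def rollout(env, policy):
    """Run one episode with a state -> action policy and return the total reward."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total
```

A hand-crafted baseline in this sketch is just a function of the state, for example `rollout(SIREpidemicEnv(), lambda state: (0.5, 0.5))`. A learned PPO or SAC policy would take the place of that function.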