diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..3ca7dfc --- /dev/null +++ b/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2022 Consortium of Infectious Disease Modeling Hubs + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/README.md b/README.md new file mode 100644 index 0000000..f0c8d28 --- /dev/null +++ b/README.md @@ -0,0 +1,27 @@ +**Below is a template of the README.md file for your hub. Italics in brackets are placeholders for information about your hub. ** + + +# *[Insert title of hub]* +*[Describe overall purpose of hub.]* Anyone interested in using these data for additional research or publications is requested to contact *[insert email]* for information regarding attribution of the source forecasts. + +## *[Insert title of forecasts]* +*[Describe forecasts ]* + +**Dates:** The Challenge Period will begin *[insert start date]* and will run until *[insert start date]*. 
Participants are currently asked to submit *[insert description of forecasts]* by *[insert timing]* (herein referred to as the Forecast Due Date). In the event that timelines of data availability change, *[insert name of hub]* may change the day of the week that forecasts are due. In this case, participants will be notified at least one week in advance. *[insert temporal period]* submissions (including file names) will be specified in terms of the reference date, which is the Saturday following the Forecast Due Date. The reference date is the last day of the epidemiological week (EW) (Sunday to Saturday) containing the Forecast Due Date. + +**Prediction Targets:** +Participating teams are asked to provide *[insert geographical requirements]* predictions for targets *[insert name of target(s)]*. + +Teams will submit *[insert description of forecasts]* for the epidemiological week (EW) ending on the reference date as well as *[insert horizons]*. Teams may, but are not required to, submit forecasts for all *[insert temporal period]* horizons or for all locations. The evaluation data for forecasts will be the *[insert temporal period]* aggregate of *[insert description of evaluation data]*. We will use the specification of EWs defined by the [CDC](https://wwwn.cdc.gov/nndss/document/MMWR_Week_overview.pdf), which run Sunday through Saturday. The target end date for a prediction is the Saturday that ends an EW of interest, and can be calculated using the expression: +**target end date = reference date + horizon * (*[insert # days in temporal period]* days)**. + +There are standard software packages to convert from dates to epidemiological weeks and vice versa (e.g. [MMWRweek](https://cran.r-project.org/web/packages/MMWRweek/) and [lubridate](https://lubridate.tidyverse.org/reference/week.html) for R and [pymmwr](https://pypi.org/project/pymmwr/) and [epiweeks](https://pypi.org/project/epiweeks/) for Python). 
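As a rough illustration of the date arithmetic above, the following Python sketch derives a reference date and a target end date using only the standard library. It assumes a weekly temporal period (7 days); the function names `reference_date_for` and `target_end_date` are hypothetical and are not part of hubverse tooling or of the packages listed above.

```python
from datetime import date, timedelta


def reference_date_for(due_date: date) -> date:
    """Return the Saturday that ends the Sunday-to-Saturday EW containing due_date."""
    # date.weekday() numbers Monday as 0, so Saturday is 5 and Sunday is 6.
    days_until_saturday = (5 - due_date.weekday()) % 7
    return due_date + timedelta(days=days_until_saturday)


def target_end_date(reference_date: date, horizon: int, days_per_period: int = 7) -> date:
    """target end date = reference date + horizon * (days in temporal period)."""
    # days_per_period=7 is an assumption (weekly forecasts); adjust for other periods.
    return reference_date + timedelta(days=horizon * days_per_period)


if __name__ == "__main__":
    due = date(2024, 1, 3)  # a Wednesday, chosen only for illustration
    ref = reference_date_for(due)
    print(ref, target_end_date(ref, horizon=2))
```

For a hub whose temporal period is not one week, `days_per_period` would need to change to match the value used in the expression above.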
+ + +If you have questions about this target, please reach out to *[insert name]* (*[insert email]*). + +## Acknowledgments +This repository follows the guidelines and standards outlined by the [hubverse](https://hubverse.io/en/latest/), which provides a set of data formats and open-source tools for modeling hubs. + + +**As an example, here is the link to [Flusight-Forecast_Hub README](https://github.com/cdcepi/FluSight-forecast-hub/blob/master/README.md).** diff --git a/hub-config/README.md b/hub-config/README.md new file mode 100644 index 0000000..bdaef89 --- /dev/null +++ b/hub-config/README.md @@ -0,0 +1,3 @@ +# Hub configuration files + +This folder should contain configuration files for the Hub, following the recommended [Hub configuration files in our documentation](https://hubverse.io/en/latest/user-guide/hub-config.html). diff --git a/hub-config/admin.json b/hub-config/admin.json new file mode 100644 index 0000000..81d334f --- /dev/null +++ b/hub-config/admin.json @@ -0,0 +1,16 @@ +{ + "schema_version": "https://raw.githubusercontent.com/hubverse-org/schemas/main/v5.0.0/admin-schema.json", + "name": "Template Forecast Hub", + "maintainer": "Consortium of Infectious Disease Modeling Hubs", + "contact": { + "name": "Joe Bloggs", + "email": "j.bloggs@cidmh.com" + }, + "repository": { + "host": "github", + "owner": "hubverse-org", + "name": "hubTemplate" + }, + "file_format": ["csv", "parquet"], + "timezone": "US/Eastern" +} diff --git a/hub-config/model-metadata-schema.json b/hub-config/model-metadata-schema.json new file mode 100644 index 0000000..f5ccb3b --- /dev/null +++ b/hub-config/model-metadata-schema.json @@ -0,0 +1,130 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "title": "Schema for Modeling Hub model metadata", + "description": "This is the schema for model metadata files. Please refer to https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/wiki/Metadata for more information.", + "type": 
"object", + "properties": { + "team_name": { + "description": "The name of the team submitting the model", + "type": "string" + }, + "team_abbr": { + "description": "Abbreviated name of the team submitting the model", + "type": "string", + "pattern": "^[a-zA-Z0-9_+]+$", + "maxLength": 16 + }, + "model_name": { + "description": "The name of the model", + "type": "string" + }, + "model_abbr": { + "description": "Abbreviated name of the model", + "type": "string", + "pattern": "^[a-zA-Z0-9_+]+$", + "maxLength": 16 + }, + "model_version": { + "description": "Identifier of the version of the model", + "type": "string" + }, + "model_contributors": { + "type": "array", + "items": { + "type": "object", + "properties": { + "name": { + "type": "string" + }, + "affiliation": { + "type": "string" + }, + "orcid": { + "type": "string", + "pattern": "^\\d{4}\\-\\d{4}\\-\\d{4}\\-[\\dX]{4}$" + }, + "email": { + "type": "string", + "format": "email" + }, + "twitter": { + "type": "string" + }, + "additionalProperties": false + } + } + }, + "website_url": { + "description": "Public facing website for the model", + "type": "string", + "format": "uri" + }, + "repo_url": { + "description": "Repository containing code for the model", + "type": "string", + "format": "uri" + }, + "license": { + "description": "License for use of model output data", + "type": "string", + "enum": [ + "CC0-1.0", + "CC-BY-4.0", + "CC-BY_SA-4.0", + "PPDL", + "ODC-by", + "ODbL", + "OGL-3.0" + ] + }, + "citation": { + "description": "One or more citations for this model", + "type": "string", + "examples": [ + "Gibson GC , Reich NG , Sheldon D. Real-time mechanistic bayesian forecasts of Covid-19 mortality. medRxiv. 2020. https://doi.org/10.1101/2020.12.22.20248736" + ] + }, + "team_funding": { + "description": "Any information about funding source for the team or members of the team.", + "type": "string", + "examples": [ + "National Institutes of General Medical Sciences (R01GM123456). 
The content is solely the responsibility of the authors and does not necessarily represent the official views of NIGMS." + ] + }, + "model_details": { + "description": "Structured information about the model", + "type": "object", + "properties": { + "data_inputs": { + "description": "List or description of data inputs used by the model", + "type": "string" + }, + "methods": { + "description": "A brief (200 char.) description of the methods used by this model", + "type": "string", + "maxLength": 200 + }, + "methods_long": { + "description": "A full description of the methods used by this model.", + "type": "string" + } + }, + "additionalProperties": false, + "required": ["data_inputs", "methods"] + }, + "ensemble_of_hub_models": { + "description": "Indicator for whether this model is an ensemble of other Hub models", + "type": "boolean" + } + }, + "additionalProperties": true, + "required": [ + "team_name", + "team_abbr", + "model_name", + "model_abbr", + "model_contributors", + "license", + "model_details" + ] +} diff --git a/hub-config/tasks.json b/hub-config/tasks.json new file mode 100644 index 0000000..19b7d96 --- /dev/null +++ b/hub-config/tasks.json @@ -0,0 +1,27 @@ +{ + "schema_version": "https://raw.githubusercontent.com/hubverse-org/schemas/main/v5.0.0/tasks-schema.json", + "rounds": [ + { + "round_id_from_variable": true, + "round_id": "task_id_variable_that_defines_a_round", + "model_tasks": [ + { + "task_ids": { + "task_id_variable_that_defines_a_round": { + "required": [], + "optional": [] + } + }, + "output_type": {}, + "target_metadata":[] + } + ], + "submissions_due": { + "start": "1900-01-01", + "end": "2900-12-31" + } + } + ], + "output_type_id_datatype": "auto", + "derived_task_ids": null +} diff --git a/model-metadata/README.md b/model-metadata/README.md new file mode 100644 index 0000000..acb2c25 --- /dev/null +++ b/model-metadata/README.md @@ -0,0 +1,148 @@ +# Model metadata + + +**Below is a template of the README.md file for the 
model-metadata folder of your hub. Italics in brackets are placeholders for information about your hub.** + + +This folder contains metadata files for the models submitting to the *[hub name]*. The specification for these files has been adapted to be consistent with [model metadata guidelines in the hubverse documentation](https://hubverse.io/en/latest/user-guide/model-metadata.html). + +Each model is required to have metadata in +[yaml format](https://docs.ansible.com/ansible/latest/reference_appendices/YAMLSyntax.html). + +These instructions provide detail about the [data +format](#Data-format) as well as [validation](#Data-validation) that +you can do prior to a pull request with a metadata file. + +# Data format + +## Required variables + +This section describes each of the variables (keys) in the yaml document. +Please list the variables in the order given below. + +### team_name +The name of your team that is less than 50 characters. + +### team_abbr +An abbreviated name for your team that is less than 16 characters. + +### model_name +The name of your model that is less than 50 characters. + +### model_abbr +An abbreviated name for your model that is less than 16 alphanumeric characters. + +### model_contributors + +A list of all individuals involved in the forecasting effort. +A name, affiliation, and email address are required for each contributor. Individuals may also include an optional ORCID identifier. +All email addresses provided will be added to an email distribution list for model contributors. 
+ +The syntax of this field should be +``` +model_contributors: [ + { + "name": "Modeler Name 1", + "affiliation": "Institution Name 1", + "email": "modeler1@example.com", + "orcid": "1234-1234-1234-1234" + }, + { + "name": "Modeler Name 2", + "affiliation": "Institution Name 2", + "email": "modeler2@example.com", + "orcid": "1234-1234-1234-1234" + } +] +``` + +### license + +One of the [accepted licenses](https://github.com/cdcepi/FluSight-forecast-hub/blob/673e983fee54f3a21448071ac46a9f78d27dd164/hub-config/model-metadata-schema.json#L69-L75). + +We encourage teams to submit as a "cc-by-4.0" to allow the broadest possible uses +including private vaccine production (which would be excluded by the "cc-by-nc-4.0" license). + +### designated_model + +A team-specified boolean indicator (`true` or `false`) for whether the model should be considered eligible for inclusion in a Hub ensemble and public visualization. A team may specify up to two models as a designated_model for inclusion. Models which have a designated_model value of 'False' will still be included in internal forecasting hub evaluations. + +### data_inputs + +List or description of the data sources used to inform the model. Particularly those used beyond the target data of confirmed influenza hospital admissions. + +### methods + +A brief description of your forecasting methodology that is less than 200 +characters. + +### methods_long + +A full description of the methods used by this model. Among other details, this should include whether spatial correlation is considered and how the model accounts for uncertainty. If the model is modified, this field can also be used to provide the date of the modification and a description of the change. + +### ensemble_of_models + +A boolean value (`true` or `false`) that indicates whether a model is an ensemble of any separate component models. 
+ +### ensemble_of_hub_models + +A boolean value (`true` or `false`) that indicates whether a model is an ensemble specifically of other models submitted to the FluSight forecasting hub. + +## Optional + +### model_version +An identifier of the version of the model. + +### website_url + +A URL to a website that has additional data about your model. +We encourage teams to submit the most user-friendly version of your +model, e.g. a dashboard, or similar, that displays your model forecasts. + +### repo_url + +A GitHub (or similar) repository URL containing code for the model. + +### citation + +One or more citations to manuscripts or preprints with additional model details. For example, "Gibson GC, Reich NG, Sheldon D. Real-time mechanistic bayesian forecasts of Covid-19 mortality. medRxiv. 2020. https://doi.org/10.1101/2020.12.22.20248736". + +### team_funding + +Any information about funding source(s) for the team or members of the team that would be natural to include in any resulting FluSight publications. For example, "National Institutes of General Medical Sciences (R01GM123456). The content is solely the responsibility of the authors and does not necessarily represent the official views of NIGMS." + +# Data validation + +Optionally, you may validate a model metadata file locally before submitting it to the hub in a pull request. Note that this is not required, since the validations will also run on the pull request. To run the validations locally, follow these steps: + +1. Create a fork of the `*[insert hub name]*` repository and then clone the fork to your computer. +2. Create a draft of the model metadata file for your model and place it in the `model-metadata` folder of this clone. +3. Install the hubValidations package for R by running the following command from within an R session: +``` r +remotes::install_github("hubverse-org/hubValidations") +``` +4. 
Validate your draft metadata file by running the following command in an R session: +``` r +hubValidations::validate_model_metadata( + hub_path="", + file_path="") +``` + +For example, if your working directory is the root of the hub repository, you can use a command similar to the following: +``` r +hubValidations::validate_model_metadata(hub_path=".", file_path="UMass-trends_ensemble.yml") +``` + +If all is well, you should see output similar to the following: +``` +✔ model-metadata-schema.json: File exists at path hub-config/model-metadata-schema.json. +✔ UMass-trends_ensemble.yml: File exists at path model-metadata/UMass-trends_ensemble.yml. +✔ UMass-trends_ensemble.yml: Metadata file extension is "yml" or "yaml". +✔ UMass-trends_ensemble.yml: Metadata file directory name matches "model-metadata". +✔ UMass-trends_ensemble.yml: Metadata file contents are consistent with schema specifications. +✔ UMass-trends_ensemble.yml: Metadata file name matches the `model_id` specified within the metadata file. +``` + +If there are any errors, you will see a message describing the problem. + +**As an example, here is a link to the [Flusight-Forecast_Hub model-metadata README](https://github.com/cdcepi/FluSight-forecast-hub/blob/master/model-metadata/README.md).** diff --git a/model-output/README.md b/model-output/README.md new file mode 100644 index 0000000..44cae81 --- /dev/null +++ b/model-output/README.md @@ -0,0 +1,246 @@ +# Model outputs folder + +**Below is a template of the README.md file for the model-output folder of your hub. Italics in brackets are placeholders for information about your hub.** + +This folder contains a set of subdirectories, one for each model, each containing the submitted model output files for that model. The structure of these directories and their contents follows [the model output guidelines in our documentation](https://hubverse.io/en/latest/user-guide/model-output.html). Documentation specific to submissions for this hub is provided below. 
+ +# Data submission instructions + +All forecasts should be submitted directly to the [model-output/](./) +folder. Data in this directory should be added to the repository through +a pull request so that automatic data validation checks are run. + +These instructions provide detail about the [data +format](#Data-formatting) as well as [validation](#Forecast-validation) that +you can do prior to this pull request. In addition, we describe +[metadata](https://github.com/hubverse-org/hubTemplate/blob/master/model-metadata/README.md) +that each model should provide in the model-metadata folder. + +*Table of Contents* + +- [What is a forecast](#What-is-a-forecast) +- [Target data](#Target-data) +- [Data formatting](#Data-formatting) +- [Forecast file format](#Forecast-file-format) +- [Forecast data validation](#Forecast-validation) +- [Weekly ensemble build](#Weekly-ensemble-build) +- [Policy on late submissions](#policy-on-late-or-updated-submissions) + +## What is a forecast + +Models are asked to make specific quantitative forecasts about data that +will be observed in the future. These forecasts are interpreted as +"unconditional" predictions about the future. That is, they are not +predictions only for a limited set of possible future scenarios in which +a certain set of conditions (e.g. vaccination uptake is strong, or new +social-distancing mandates are put in place) hold about the future -- +rather, they should characterize uncertainty across all reasonable +future scenarios. In practice, all forecasting models make some +assumptions about how current trends in data may change and impact the +forecasted outcome; some teams select a "most likely" scenario or +combine predictions across multiple scenarios that may occur. Forecasts +submitted to this repository will be evaluated against observed data. 
+ +We note that other modeling efforts, such as the [Influenza Scenario +Modeling Hub](https://fluscenariomodelinghub.org/), have been +launched to collect and aggregate model outputs from "scenario +projection" models. These models create longer-term projections under a +specific set of assumptions about how the main drivers of the pandemic +(such as non-pharmaceutical intervention compliance, or vaccination +uptake) may change over time. + +## Target Data + +*[insert description of target data]* + + +## Data formatting + +The automatic checks in place for forecast files submitted to this +repository validate both the filename and file contents to ensure the +file can be used in the visualization and ensemble forecasting. + +### Subdirectory + +Each model that submits forecasts for this project will have a unique subdirectory within the [model-output/](./) directory in this GitHub repository where forecasts will be submitted. Each subdirectory must be named + + team-model + +where + +- `team` is the team name and +- `model` is the name of your model. + +Both `team` and `model` should be fewer than 15 characters long and not include +hyphens or other special characters, with the exception of "\_". + +The combination of `team` and `model` should be distinct from that of any other model in the project. + + +### Metadata + +The metadata file will be saved within the model-metadata directory in the Hub's GitHub repository, and should have the following naming convention: + + + team-model.yml + +Details on the content and formatting of metadata files are provided in the [model-metadata README](https://github.com/hubverse-org/hubTemplate/blob/master/model-metadata/README.md). + + + + +### Forecasts + +Each forecast file should have the following +format + + YYYY-MM-DD-team-model.csv + +where + +- `YYYY` is the 4 digit year, +- `MM` is the 2 digit month, +- `DD` is the 2 digit day, +- `team` is the team name, and +- `model` is the name of your model. 
+ +The date YYYY-MM-DD is the [`reference_date`](#reference_date). This should be the Saturday following the submission date. + +The `team` and `model` in this file must match the `team` and `model` in +the directory this file is in. Both `team` and `model` should be less +than 15 characters, alpha-numeric and underscores only, with no spaces +or hyphens. + +## Forecast file format + +The file must be a comma-separated value (csv) file with the following +columns (in any order): + +- `reference_date` +- `target` +- `horizon` +- `target_end_date` +- `location` +- `output_type` +- `output_type_id` +- `value` + +No additional columns are allowed. + +The value in each row of the file is a quantile for a particular combination of location, date, and horizon. + +### `reference_date` + +Values in the `reference_date` column must be a date in the ISO format + + YYYY-MM-DD + +This is the date from which all forecasts should be considered. This date is the Saturday following the submission Due Date, corresponding to the last day of the epiweek when submissions are made. The `reference_date` should be the same as the date in the filename but is included here to facilitate validation and analysis. + +### `target` + +Values in the `target` column must be a character (string) and be one of +the following specific targets: + +- *`[insert target] `* +- *`[insert target] `* + + +### `horizon` +Values in the `horizon` column indicate the number of *[insert temporal period]* between the `reference_date` and the `target_end_date`. This should be a number between *[insert range of horizons]* , where for example a `horizon` of 0 indicates that the prediction is a nowcast for the *[insert temporal period]* of submission and a `horizon` of 1 indicates that the prediction is a forecast for the *[insert temporal period]* after submission. 
+ +### `target_end_date` + +Values in the `target_end_date` column must be a date in the format + + YYYY-MM-DD + +This is the last date of the forecast target's *[insert temporal period]*. This will be the date of the Saturday at the end of the forecasted *[insert temporal period]*. Within each row of the submission file, the `target_end_date` should be equal to the `reference_date` + `horizon` * (*[# days in temporal period]* days). + + +### `location` + +Values in the `location` column must be one of the "locations" in +this *[insert name of location information file]* (*[insert url for location information file]*), which +includes *[describe what the location information file includes]* + +### `output_type` + +**Modify depending on which type of output_type you are collecting.** + +Values in the `output_type` column are either + +- "quantile" or +- "pmf". + +This value indicates whether that row corresponds to a quantile forecast for *[insert target]* or the probability mass function (pmf) of a categorical forecast for *[insert target]*. + +### `output_type_id` +Values in the `output_type_id` column specify identifying information for the output type. + +#### quantile output +**Modify depending on which type of output_type you are collecting.** + +When the predictions are quantiles, values in the `output_type_id` column are a quantile probability level in the format + + 0.### + + This value indicates the quantile probability level for the +`value` in this row. 
+ +Teams must provide the following *[insert # quantiles]* quantiles: + +*[insert quantiles]* + +R: *[insert R code for defining quantiles]* +Python: *[insert Python code for defining quantiles]* + +#### pmf output +**Modify depending on which type of output_type you are collecting.** + + *[Describe pmf output]* + + +### `value` +**Modify depending on which type of output_type you are collecting.** + +Values in the `value` column are non-negative numbers indicating the "quantile" or "pmf" prediction for this row. For a "quantile" prediction, `value` is the inverse of the cumulative distribution function (CDF) evaluated at the quantile probability level for the target and location associated with that row. For example, the 0.025 and 0.975 quantiles for a given target and location bound the central 95% prediction interval. + +### Example tables + + *[Insert example tables]* + +## Forecast validation + +To ensure proper data formatting, pull requests for new data in +`model-output/` will be automatically validated. Optionally, you may also run these validations locally. + +### Pull request forecast validation + +When a pull request is submitted, the data are validated through [GitHub +Actions](https://docs.github.com/en/actions), which runs the tests +present in [the hubValidations +package](https://github.com/hubverse-org/hubValidations). The +intent of these tests is to validate the requirements above. Please +[let us know](*[Insert url to your hub's issues]*) if you are facing issues while running the tests. + +### Local forecast validation + +Optionally, you may validate a forecast file locally before submitting it to the hub in a pull request. Note that this is not required, since the validations will also run on the pull request. 
To run the validations locally, follow these steps: + + *[Add description for local forecast validation]* + + +## Weekly ensemble build + +Every *[day and time]*, we will generate a *[hub name]* ensemble *[Insert target]* using valid forecast submissions in the current week by the *[day and time]* deadline. Some or all participant forecasts may be combined into an ensemble forecast to be published in real-time along with the participant forecasts. In addition, some or all forecasts may be displayed alongside the output of a baseline model for comparison. + + +## Policy on late or updated submissions + +In order to ensure that forecasting is done in real-time, all forecasts are required to be submitted to this repository by *[day and time]* each week. We do not accept late forecasts. + +## Evaluation criteria +Forecasts will be evaluated using a variety of metrics, including *[describe how they will be evaluated]* + +**As an example, here is a link to the [Flusight-Forecast_Hub model-output README](https://github.com/cdcepi/FluSight-forecast-hub/blob/master/model-output/README.md).**
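To make the file-naming and column rules described in this README concrete, here is a minimal Python sketch that checks a forecast file against a few of them. It is an illustration only and not a substitute for the hubValidations checks: the function `check_forecast` and the example file name are hypothetical, a weekly temporal period (7 days) is assumed, and only the file name pattern, the column set, and the `reference_date` / `target_end_date` relationship are checked.

```python
import csv
import io
import re
from datetime import date, timedelta

# Columns listed under "Forecast file format"; no additional columns are allowed.
REQUIRED_COLUMNS = {
    "reference_date", "target", "horizon", "target_end_date",
    "location", "output_type", "output_type_id", "value",
}

# YYYY-MM-DD-team-model.csv, where team and model use letters, digits and
# underscores only and are fewer than 15 characters each.
FILENAME_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2})-([A-Za-z0-9_]{1,14})-([A-Za-z0-9_]{1,14})\.csv$"
)


def check_forecast(filename, csv_text, days_per_period=7):
    """Return a list of problems found; an empty list means none were found."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return [f"{filename}: name does not match YYYY-MM-DD-team-model.csv"]
    file_date = date.fromisoformat(match.group(1))

    reader = csv.DictReader(io.StringIO(csv_text))
    columns = set(reader.fieldnames or [])
    if columns != REQUIRED_COLUMNS:
        differing = sorted(columns ^ REQUIRED_COLUMNS)
        return [f"{filename}: unexpected or missing columns: {differing}"]

    problems = []
    # Data rows start on line 2 of the file (line 1 is the header).
    for line_number, row in enumerate(reader, start=2):
        reference = date.fromisoformat(row["reference_date"])
        if reference != file_date:
            problems.append(f"line {line_number}: reference_date differs from file name")
        # Assumes the period length in days; 7 corresponds to weekly forecasts.
        expected_end = reference + timedelta(days=int(row["horizon"]) * days_per_period)
        if date.fromisoformat(row["target_end_date"]) != expected_end:
            problems.append(f"line {line_number}: target_end_date should be {expected_end}")
    return problems
```

Calling `check_forecast("2024-01-06-exampleteam-examplemodel.csv", text)` with the file contents as `text` returns a list of messages, which is empty when none of these particular checks fail.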