
DSCI-621 Implementing the functionality to publish to HDX #3

Draft · wants to merge 12 commits into base: main
58 changes: 58 additions & 0 deletions .github/workflows/run-python-script.yaml
@@ -0,0 +1,58 @@
# This workflow will install Python dependencies and run the script

name: Run script

on:
  workflow_dispatch: # add run button in GitHub
  schedule:
    - cron: "0 0 * * *"

jobs:
  run:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3
      - name: Set up Python 3.x
        uses: actions/setup-python@v4
        with:
          python-version: "3.x"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
      - name: Run script
        env:
          HDX_SITE: ${{ secrets.HDX_SITE }}
          HDX_KEY: ${{ secrets.HDX_BOT_SCRAPERS_API_TOKEN }}
          PREPREFIX: ${{ secrets.PREPREFIX }}
          USER_AGENT: ${{ secrets.USER_AGENT }}
          BLOB_URL: ${{ secrets.BLOB_URL }}
          STORAGE_ACCOUNT: ${{ secrets.STORAGE_ACCOUNT }}
          CONTAINER: ${{ secrets.CONTAINER }}
          BLOB: ${{ secrets.BLOB }}
          KEY: ${{ secrets.KEY }}
          AZURE_DB_PW_PROD: ${{ secrets.DB_PW }}
        run: |
          python run.py
      - name: Commit updated data bundle
        if: success()
        uses: stefanzweifel/git-auto-commit-action@v4
        with:
          file_pattern: "dataset_dates.txt"
          commit_message: automatic - Data bundle updated
          push_options: "--force"
          skip_dirty_check: false
      - name: Send mail
        if: failure()
        uses: dawidd6/action-send-mail@v3
        with:
          server_address: ${{ secrets.EMAIL_SERVER }}
          server_port: ${{ secrets.EMAIL_PORT }}
          username: ${{ secrets.EMAIL_USERNAME }}
          password: ${{ secrets.EMAIL_PASSWORD }}
          subject: "FAILED: ${{ github.repository }} run job"
          body: GitHub Actions run job for ${{ github.repository }} failed!
          to: ${{ secrets.EMAIL_LIST }}
          from: ${{ secrets.EMAIL_FROM }}
          content_type: text/html
16 changes: 16 additions & 0 deletions config/hdx_dataset_static.yml
@@ -0,0 +1,16 @@
license_id: "cc-by-igo"
methodology: "
Raster \n
\n- Daily FloodScan SFED (flood fraction) band obtained directly from AER.\n
\n- Daily historical SFED band smoothed with 11 day centered rolling mean (+/-5 days).\n
\n- Average day-of-year (DOY) raster calculated using last 10 years of smoothed data (2013-2023) to create SFED Baseline band\n
\n- SFED & SFED Baseline band merged per day.\n
\n- Last 90 days of merged SFED + SFED Baseline bands zipped\n
\n
Tabular \n
\n- Daily zonal stats (mean) for all historical SFED data 1998-01-01 - 2023-12-31 for each Admin 2 covered by FloodScan extent\n
\n- Return periods calculated from zonal mean data per admin 2 using the Log Pearson Type III distribution."
dataset_source: "AER FloodScan: Atmospheric and Environmental Research (AER)."
package_creator: "isabelle-tot"
private: False
caveats: "Modified and Simplified by UN OCHA"
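The methodology above derives return periods from the zonal mean SFED data using a Log Pearson Type III fit. As a rough illustration of that step (this is not code from the PR; the function name and sample values are hypothetical), one could fit scipy's `pearson3` distribution to log-transformed values and invert the exceedance probability:

```python
import numpy as np
from scipy import stats


def lp3_return_periods(values: np.ndarray) -> np.ndarray:
    """Fit a Log Pearson Type III distribution to positive values and
    return the estimated return period (in years, if values are annual
    maxima) for each observation."""
    log_vals = np.log10(values)
    # Pearson Type III fitted on log10-transformed data == Log Pearson III
    skew, loc, scale = stats.pearson3.fit(log_vals)
    # Exceedance probability of each observation under the fitted distribution
    exceed_p = stats.pearson3.sf(log_vals, skew, loc=loc, scale=scale)
    return 1.0 / exceed_p


# Hypothetical annual-maximum SFED flood fractions for one admin 2 unit
rp = lp3_return_periods(np.array([0.12, 0.30, 0.25, 0.40, 0.18]))
```

Larger flood fractions map to smaller exceedance probabilities and therefore longer return periods; the note at the end of this PR's description suggests the tabular resource reports an empirical rather than fitted estimate, so treat this purely as a sketch of the distributional approach named in the methodology.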
32 changes: 32 additions & 0 deletions config/project_configuration.yaml
@@ -0,0 +1,32 @@
# Collector specific configuration
#TODO add details below to github secrets
url: "test"
account: "account"
container: "container"
key: "key"
baseline_filename: "floodscan/daily/v5/raw/baseline_v2025-01-01_v05r01.nc4"

dataset_names:
  HDX-FLOODSCAN: "floodscan"

allowed_tags:
- "climate hazards"
- "climate-weather"
- "hydrology"
- "natural disasters"
- "flooding"

# Metadata
title: "FloodScan"
update_frequency: "daily"
maintainer_id: "2f9fd160-2a16-49c0-89d6-0bc3230599bf"
organization_id: "53acf7a0-29d5-453d-82ab-20daa6645128"
90days_filename: "aer_floodscan_300s_SFED_90d.zip"
stats_filename: "hdx_floodscan_zonal_stats.xlsx"
description_90days_file: "Daily zip file containing previous 90 days of raster data with both SFED and SFED baseline bands."
description_stats_file: "Daily Excel file containing date, admin metadata, raw SFED mean values (per admin 1 and 2), the approximate calculated return period and the baseline value calculated from the past 10 years of data."
notes: "FloodScan uses satellite data to map and monitor floods daily, helping compare current flood conditions with historical averages. This dataset contains two resources:
\n\n
The first (hdx_floodscan_zonal_stats.xlsx) is a daily tabular dataset providing average FloodScan Standard Flood Extent Depiction (SFED) flood fraction (0-100%) per admin 1 and 2 level. Historical baseline values (SFED_BASELINE) are calculated per day-of-year from the last 10 years of historical data (non-inclusive of current year) after applying an 11 day smoothing mean window. Return Period (RP) is calculated empirically based on all historical data up to the current year (non-inclusive).
\n\n
The second resource (aer_floodscan_300s_SFED_90d.zip) is a zipped file containing AER FloodScan estimated daily flood fraction (0-100%) gridded data at approximately 10 km resolution (300 arcseconds equivalent to approximately 0.083 degrees) for the last 90 days. Each file represents the estimates for a single day and includes 2 bands: SFED and SFED_BASELINE. The baseline band provides users an easy way to compare current values with historical averages. The baseline is calculated per day-of-year from the last 10 years of historical data (non-inclusive of current year) after applying an 11 day temporal smoothing mean window."
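The baseline described in these notes (an 11-day centered smoothing mean, then a day-of-year average over the last 10 years, excluding the current year) can be sketched in pandas. This is a hedged illustration, not the PR's implementation; the `date` and `sfed` column names and the helper name are assumptions:

```python
import numpy as np
import pandas as pd


def sfed_baseline(df: pd.DataFrame) -> pd.Series:
    """Day-of-year SFED baseline: smooth the daily series with an 11-day
    centered rolling mean, then average each day-of-year over the last 10
    full years (excluding the most recent, partial year)."""
    df = df.sort_values("date").copy()
    df["smoothed"] = df["sfed"].rolling(window=11, center=True, min_periods=1).mean()
    last_year = df["date"].dt.year.max()
    hist = df[df["date"].dt.year.between(last_year - 10, last_year - 1)]
    return hist.groupby(hist["date"].dt.dayofyear)["smoothed"].mean()


# Synthetic daily flood-fraction series standing in for real SFED data
dates = pd.date_range("2013-01-01", "2024-03-31", freq="D")
demo = pd.DataFrame(
    {"date": dates, "sfed": np.random.default_rng(0).random(len(dates))}
)
baseline = sfed_baseline(demo)  # one mean value per day-of-year (1-366)
```

The resulting series can be joined back onto current observations by day-of-year to produce the SFED_BASELINE comparison column described above.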
4 changes: 2 additions & 2 deletions exploration/02_wrangle_floodscan.py
@@ -38,12 +38,12 @@
 import xarray as xr
 
 from src.datasources import floodscan
-from src.utils import gen_utils
+from src.utils import date_utils
 
 # %%
 # just a quick utility function to get recent floodscan date that will
 # will be in blob. Only doing a couple days just as proof of concept.
-end_date = gen_utils.date_to_run()
+end_date = date_utils.date_to_run()
 start_date = end_date - datetime.timedelta(days=3)
 da_current = floodscan.load_floodscan_cogs(
     start_date=start_date, end_date=end_date