Scrape provincial hospitalization data from nicd.ac.za #112

name: Scrape provincial hospitalization data from nicd.ac.za
# Unsure exactly when this data is released; it appears to be similar to the daily COVID stats.
# This scraper runs twice a day (see the cron schedule below).
# Cron times are adjusted to GMT+0.
# This runner takes some time to set up.
# Test this locally using the "act" tool: https://github.com/nektos/act
# You need to download the "full" version of the image, as R is not installed in the medium image.
# docker pull catthehacker/ubuntu:full-20.04
# act -j datcov_scrape
# Check the ~/.actrc file for the mappings between ubuntu-20.04 and the full or medium images.
# Regular schedule: daily at 08:23 and 19:23 (GMT+0); also runs when the script changes, or when it is triggered manually.
on:
  workflow_dispatch:
  schedule:
    - cron: '23 8,19 * * *'
  push:
    branches:
      - master
    paths:
      - scripts/daily_nicd_datcov.R
jobs:
  datcov_scrape:
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@master
      - name: OS dependencies
        run: |
          # remove mongo - otherwise the config below fails
          sudo apt-get remove mongodb-org-server -y
          # refresh the package lists; the release-file validity and date checks are disabled so the update does not fail on them
          sudo apt-get -o Acquire::Check-Valid-Until=false -o Acquire::Check-Date=false update || echo "Done"
          # for CRAN packages as apt binaries (r-cran-*), we need this repo added
          sudo add-apt-repository ppa:c2d4u.team/c2d4u4.0+ -y || echo "Done"
          # some of the R packages below have issues installing automatically, and require these libraries explicitly
          sudo apt-get install libcurl4-openssl-dev libmagick++-dev libpoppler-cpp-dev -y --fix-missing
          # now install the R packages from the r-cran binary repo
          sudo apt-get install r-cran-httr r-cran-xml2 r-cran-xml r-cran-data.table r-cran-git2r r-cran-magrittr r-cran-rcpp r-cran-magick r-cran-qpdf r-cran-pdftools r-cran-rappdirs r-cran-lubridate r-cran-zoo r-cran-reshape2 -y --fix-missing
      - name: Scrape nicd.ac.za data
        run: Rscript --vanilla 'scripts/daily_nicd_datcov.R'
      - name: Push changes
        run: |
          git status --porcelain
          git rev-list --left-right --count origin/master...master
          # only push if there have been any changes
          # if [[ "$(git rev-list --left-right --count origin/master...master)" != "0\t0" ]]; then
          echo "We have new commits!"
          # pull first, as there might have been other commits since this job started, especially as more and more actions are added to the repo
          git config pull.rebase false
          git pull origin ${{ github.ref }}
          # let's hope there were no merge conflicts
          git push origin ${{ github.ref }}
          # fi
        if: ${{ always() }}
        env:
          REPO_KEY: ${{secrets.GITHUB_TOKEN}}
          username: krokkie
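# Sketch only, not tested in this workflow: one possible way to re-enable the commented-out
# "only push if there are new commits" check in the "Push changes" step.
# Assumption: the scraper script commits its changes locally, so new work shows up as
# commits on master that are not yet on origin/master.
# Note that "\t" inside double quotes is not a tab in bash, so the commented-out string
# comparison against "0\t0" would always report a difference. Counting the commits that
# are ahead of origin avoids the string comparison entirely.
# The step name and the "ahead" variable below are illustrative, not part of the original.
#      - name: Push changes (only when ahead of origin)
#        run: |
#          # number of local commits on master that origin/master does not have
#          ahead=$(git rev-list --count origin/master..master)
#          if [ "$ahead" -gt 0 ]; then
#            git config pull.rebase false
#            git pull origin ${{ github.ref }}
#            git push origin ${{ github.ref }}
#          fi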