Skip to content

Commit

Permalink
feat(ci): Mirror NCBI datasets results to Hetzner object storage to r…
Browse files Browse the repository at this point in the history
…educe NIH load and get independence from external service (#3709)
  • Loading branch information
corneliusroemer authored Feb 18, 2025
1 parent 635199a commit 5a041d8
Showing 1 changed file with 42 additions and 0 deletions.
42 changes: 42 additions & 0 deletions .github/workflows/datasets-mirror.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
name: Mirror NCBI Datasets to Object Storage
on:
workflow_dispatch:
schedule:
- cron: '0 0 * * *' # Runs daily at midnight UTC
jobs:
download-and-upload:
runs-on: ubuntu-latest
strategy:
matrix:
taxon_id:
- 10244 # mpox
- 3048448 # West nile virus
- 3052460 # Orthoebolavirus sudanense
- 3052462 # Orthoebolavirus zairense
- 3052518 # CCHFV
fail-fast: false
steps:
- name: Checkout repository
uses: actions/checkout@v4
- uses: mamba-org/setup-micromamba@v1
with:
environment-name: datasets
create-args: ncbi-datasets-cli s3cmd
- name: Download NCBI Dataset
shell: bash -l {0}
run: |
datasets download virus genome taxon ${{ matrix.taxon_id }} --no-progressbar --filename ${{ matrix.taxon_id }}.zip
- name: Create S3cmd config
run: |
cat <<EOF > ~/.s3cfg
[default]
access_key = ${{ secrets.HETZNER_S3_ACCESS_KEY }}
secret_key = ${{ secrets.HETZNER_S3_SECRET_KEY }}
host_base = hel1.your-objectstorage.com
host_bucket = %(bucket)s.hel1.your-objectstorage.com
verbosity = DEBUG
EOF
- name: Upload to Object Storage
shell: bash -l {0}
run: |
s3cmd put ${{ matrix.taxon_id }}.zip s3://loculus-public/mirror/${{ matrix.taxon_id }}.zip

0 comments on commit 5a041d8

Please sign in to comment.