KJPMolgenLab/CoverCHILD_FHIR_ETL

CoverCHILD data integration FHIR ETL

last updated: 2023-11-30

Purpose

This repository contains code for the CoverCHILD data integration project. Patient data is queried from a FHIR server on-site and transformed into flat tables corresponding to FHIR resources for further anonymisation, processing, and analysis (e.g., the monitoring dashboard use case).

Dependencies

This script requires R (≥ 4.1.0) and the following R packages with their respective dependencies:

Missing R packages are installed automatically from the R package repository (CRAN). If that is not wanted or not possible, install the packages manually before running the script. Note that the script installs missing packages, but does not yet check whether the versions of already installed packages match the requirements.

Folder structure

├── code/
│   ├── fhir_etl.R                   # main R script
│   └── functions.R                  # R helper functions
├── config/
│   ├── EXAMPLE_fhir_cfg.yml         # general configuration TEMPLATE
│   ├── EXAMPLE_fhir_search_cfg.yml  # FHIR search configuration TEMPLATE
│   ├── fhir_cfg.yml                 # general configuration, generated by running 
│   │                                # 'create_fresh_config.sh' or by copying and renaming 
│   │                                # 'EXAMPLE_fhir_cfg.yml' manually
│   └── fhir_search_cfg.yml          # FHIR search configuration, generated by running 
│                                    # 'create_fresh_config.sh' or by copying & renaming 
│                                    # 'EXAMPLE_fhir_search_cfg.yml' manually
├── logs/                            # log files (timings, http errors)
├── output/                          # final output of the script
├── tmp/                             # temporary files
│
├── CoverCHILD_FHIR_ETL.Rproj        # RStudio project file for running the script interactively
├── create_fresh_config.sh           # creates/resets configuration files by copying from TEMPLATES
└── run_fhir_etl.sh                  # runs the script (fhir_etl.R) while logging output

Steps for running the script

1) Configuration

  • Run 'create_fresh_config.sh' to create the two necessary configuration files from the templates in the 'config/' directory, or copy the templates manually and rename them to 'fhir_cfg.yml' and 'fhir_search_cfg.yml', as shown in the folder structure.
  • Configure 'config/fhir_cfg.yml': server settings and general behaviour of the script.
  • Configure 'config/fhir_search_cfg.yml': FHIR search parameters and resource element selection. This file only needs to be modified in special cases:
    • Make sure that all elements of the filter statements are present on the FHIR server, and comment out unsupported filter elements, for example if patients' addresses are censored.
    • If the FHIR server supports a custom 'ServiceType' SearchParameter for Encounter resources, uncomment the respective statement to enable leaner queries.

For further information and instructions, see the documentation within the configuration files.
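
The manual copy-and-rename step can be sketched as a small shell function. This is only an illustration of the idea, not the contents of 'create_fresh_config.sh': the function name `copy_templates` is hypothetical, and it assumes that every template in 'config/' follows the 'EXAMPLE_*.yml' naming shown in the folder structure.

```shell
#!/bin/sh
# Hypothetical helper, not the repository's create_fresh_config.sh.
# Copies every EXAMPLE_*.yml template in the given directory to the same
# file name without the EXAMPLE_ prefix, overwriting existing copies.
copy_templates() {
    config_dir="${1:-config}"
    for template in "$config_dir"/EXAMPLE_*.yml; do
        # If no template matches, the glob stays literal; skip it.
        [ -e "$template" ] || continue
        target="$config_dir/$(basename "$template" | sed 's/^EXAMPLE_//')"
        cp "$template" "$target"
        echo "created $target"
    done
}
```

Calling `copy_templates config` from the repository root would then produce 'config/fhir_cfg.yml' and 'config/fhir_search_cfg.yml'. Because existing files are overwritten, back up any edited configuration first.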

2) Executing the script

  • in an interactive R session, by opening the 'CoverCHILD_FHIR_ETL.Rproj' R project and running the 'code/fhir_etl.R' script

or

  • from the command line, by running 'run_fhir_etl.sh'. In this case, all output is logged to the folder specified for log files in 'config/fhir_cfg.yml'.
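
For readers who want to see what "running the script while logging output" can look like, here is a minimal sketch. It is not the actual 'run_fhir_etl.sh': the function name `run_with_log`, the log file naming, and passing the log folder as an argument (instead of reading it from 'config/fhir_cfg.yml') are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper, not the repository's run_fhir_etl.sh.
# Usage: run_with_log LOG_DIR COMMAND [ARGS...]
# Runs COMMAND and writes its stdout and stderr to a timestamped log file.
run_with_log() {
    log_dir="$1"
    shift
    mkdir -p "$log_dir"
    log_file="$log_dir/run_$(date +%Y%m%d_%H%M%S).log"   # assumed naming scheme
    status=0
    "$@" > "$log_file" 2>&1 || status=$?
    echo "exit status $status, log written to $log_file"
    return "$status"
}
```

From the repository root, `run_with_log logs Rscript code/fhir_etl.R` would run the main script non-interactively and keep its complete console output in 'logs/'.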

Default filter criteria

This script queries all Patient, Condition, Procedure, and Observation resources belonging to Encounters which meet the following criteria:

  • the admission date is between 2016-01-01 and 2022-03-31
  • the encounter involves contact with the pediatrics or child and adolescent psychiatry departments
  • the patient is under 18 years of age at admission
  • the patient has a German address

Filter criteria can be inspected and modified in 'config/fhir_search_cfg.yml'.

Check for success / Troubleshooting

If the script ran successfully:

  • the last entry of the corresponding 'FHIR_timings_*.csv' in the log directory is 'Run FHIR ETL.'
  • the output directory contains one file per resource, i.e., Patient, Encounter, Condition, Procedure, and Observation (unless 'save_output' was set to null in 'config/fhir_cfg.yml')
  • no HTTP error file was generated in the log directory
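
These three checks can be automated with a short shell function. The sketch below is not part of the repository and rests on assumptions: `check_run` is a hypothetical name, output file names are assumed to contain the resource name, and an HTTP error file is assumed to have "http" somewhere in its name. Adjust the patterns to the files your run actually produces.

```shell
#!/bin/sh
# Hypothetical post-run check, not part of the repository.
# Usage: check_run [LOG_DIR] [OUTPUT_DIR]; returns 0 if all checks pass.
check_run() {
    log_dir="${1:-logs}"
    out_dir="${2:-output}"
    failed=0

    # 1) The newest timings log must end with the 'Run FHIR ETL.' entry.
    timings=$(ls -t "$log_dir"/FHIR_timings_*.csv 2>/dev/null | head -n 1)
    if [ -z "$timings" ]; then
        echo "no FHIR_timings_*.csv found in $log_dir"
        failed=1
    elif ! tail -n 1 "$timings" | grep -q 'Run FHIR ETL'; then
        echo "last entry of $timings is not 'Run FHIR ETL.'"
        failed=1
    fi

    # 2) One output file per resource (assumes the name contains the resource).
    for resource in Patient Encounter Condition Procedure Observation; do
        if ! ls "$out_dir" 2>/dev/null | grep -q "$resource"; then
            echo "no output file for $resource in $out_dir"
            failed=1
        fi
    done

    # 3) No HTTP error file (assumes such a file has 'http' in its name).
    if ls "$log_dir" 2>/dev/null | grep -qi 'http'; then
        echo "HTTP error file present in $log_dir"
        failed=1
    fi

    return "$failed"
}
```

Running `check_run logs output` from the repository root prints one line per failed check and returns a non-zero status if any check fails. If 'save_output' is set to null, the output check is expected to fail.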

FAQ

  • This section will be filled in during the test phase.
  • Error 'could not find function "..."': an installed package version is outdated.

Contact

We are very interested in your experience running the script and welcome any feedback, including comments, troubleshooting reports, questions, and suggestions for improvement.
Feedback on performance is especially useful to us, i.e.,

  • the generated 'FHIR_timings_*.csv' log files, which do not contain sensitive data
  • optimal/feasible batch size configuration on the used hardware

Please feel free to message us on GitHub, open an issue, or send an email to [email protected].


Published under CC-BY-4.0.
