This repository contains code for the CoverCHILD data integration project. Patient data is queried from a FHIR server on-site and transformed into flat tables corresponding to FHIR resources for further anonymisation, processing, and analysis (e.g., the monitoring dashboard use case).
This script requires R (≥ 4.1.0) and the following R packages, including their respective dependencies:
- config (≥ 0.3.2)
- fhircrackr (≥ 2.1.1)
- tictoc (≥ 1.2)
- tidyverse (≥ 2.0.0)
Missing R packages are installed automatically from the R package repository (CRAN). If that is not wanted or not possible, install the packages manually before running the script. Please note that the script only installs missing packages; it does not yet check whether already installed packages meet the minimum versions listed above.
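Because installed package versions are not checked automatically, it can help to verify them by hand before the first run. The following is a minimal sketch and not part of the repository: the function names 'version_ge', 'check_r_pkg' and 'check_all_r_pkgs' are hypothetical, and the sketch assumes that 'Rscript' is on the PATH and that 'sort' supports the '-V' option (as in GNU coreutils). The minimum versions are the ones listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper functions, not part of the repository.

# version_ge HAVE MIN: succeeds if version HAVE is at least MIN.
# Assumes 'sort -V' (version sort) is available, e.g. from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_r_pkg PKG MIN: reports whether R package PKG is installed
# in at least version MIN; returns non-zero if it is missing or outdated.
check_r_pkg() {
  local pkg="$1" min="$2" have
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH" >&2
    return 2
  fi
  # packageVersion() is part of base R; output stays empty if PKG is not installed.
  have="$(Rscript -e "cat(as.character(utils::packageVersion('$pkg')))" 2>/dev/null)"
  if [ -z "$have" ]; then
    echo "MISSING   $pkg (need >= $min)"
    return 1
  elif version_ge "$have" "$min"; then
    echo "OK        $pkg $have"
  else
    echo "OUTDATED  $pkg $have (need >= $min)"
    return 1
  fi
}

# check_all_r_pkgs: checks every package from the requirements list.
check_all_r_pkgs() {
  local status=0
  check_r_pkg config 0.3.2 || status=1
  check_r_pkg fhircrackr 2.1.1 || status=1
  check_r_pkg tictoc 1.2 || status=1
  check_r_pkg tidyverse 2.0.0 || status=1
  return "$status"
}
```

After sourcing the file, calling 'check_all_r_pkgs' prints one line per package and returns a non-zero status if any package is missing or outdated. The comparison is done in the shell only to keep the sketch free of additional R code; unusual version strings (for example with hyphens) may not sort as R itself would compare them.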
├── code/
│   ├── fhir_etl.R                   # main R script
│   └── functions.R                  # R helper functions
├── config/
│   ├── EXAMPLE_fhir_cfg.yml         # general configuration TEMPLATE
│   ├── EXAMPLE_fhir_search_cfg.yml  # FHIR search configuration TEMPLATE
│   ├── fhir_cfg.yml                 # general configuration, generated by running
│   │                                # 'create_fresh_config.sh' or by copying and renaming
│   │                                # 'EXAMPLE_fhir_cfg.yml' manually
│   └── fhir_search_cfg.yml          # FHIR search configuration, generated by running
│                                    # 'create_fresh_config.sh' or by copying and renaming
│                                    # 'EXAMPLE_fhir_search_cfg.yml' manually
├── logs/                            # log files (timings, http errors)
├── output/                          # final output of the script
├── tmp/                             # temporary files
│
├── CoverCHILD_FHIR_ETL.Rproj        # RStudio project file for running the script interactively
├── create_fresh_config.sh           # creates/resets configuration files by copying from TEMPLATES
└── run_fhir_etl.sh                  # runs the script (fhir_etl.R) while logging output
- Run 'create_fresh_config.sh' to create the two required configuration files from the templates in the 'config/' directory, or copy and rename the templates manually to 'fhir_cfg.yml' and 'fhir_search_cfg.yml', as shown in the folder structure.
- Configure 'config/fhir_cfg.yml': server settings and general behaviour of the script.
- Configure 'config/fhir_search_cfg.yml': FHIR search parameters and resource element selection. This file only needs to be modified in special cases:
  - Make sure that all elements of the filter statements are present on the FHIR server, and comment out any filter elements that are not supported, for example if patients' addresses are censored.
  - If the FHIR server supports a custom 'ServiceType' SearchParameter for Encounter resources, uncomment the respective statement to enable leaner queries.
For further information and instructions, see the documentation within the configuration files.
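The manual alternative to 'create_fresh_config.sh' amounts to copying each template to its working name. The sketch below illustrates that step under stated assumptions: the function 'copy_config_templates' is hypothetical, the real 'create_fresh_config.sh' may behave differently, and unlike a reset this sketch deliberately leaves existing configuration files untouched.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the manual copy step; 'create_fresh_config.sh' itself may differ.

# copy_config_templates [CONFIG_DIR]: copies each EXAMPLE_* template to its
# working name, skipping files that already exist so local edits are kept.
copy_config_templates() {
  local dir="${1:-config}" name
  for name in fhir_cfg.yml fhir_search_cfg.yml; do
    if [ ! -f "$dir/EXAMPLE_$name" ]; then
      echo "template missing: $dir/EXAMPLE_$name" >&2
      return 1
    fi
    if [ -e "$dir/$name" ]; then
      echo "kept existing $dir/$name"
    else
      cp "$dir/EXAMPLE_$name" "$dir/$name"
      echo "created $dir/$name"
    fi
  done
}
```

Called without arguments from the repository root, 'copy_config_templates' works on the 'config/' directory. To start over from the templates, delete the two working files first or use 'create_fresh_config.sh'.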
- Run the script in an interactive R session by opening the 'CoverCHILD_FHIR_ETL.Rproj' R project and running the 'code/fhir_etl.R' script, or
- run 'run_fhir_etl.sh'. In this case, all output is logged to the folder specified for log files in 'config/fhir_cfg.yml'.
This script queries all Patient, Condition, Procedure, and Observation resources belonging to Encounters which:
- have an admission date between 2016-01-01 and 2022-03-31
- involve contact with the pediatrics or child and adolescent psychiatry departments
- belong to patients who were under 18 years of age at admission
- belong to patients with a German address
Filter criteria can be inspected and modified in 'config/fhir_search_cfg.yml'.
If the script ran successfully,
- the last entry of the corresponding 'FHIR_timings_*.csv' in the log directory is 'Run FHIR ETL.'
- the output directory contains one file per resource, i.e., Patient, Encounter, Condition, Procedure, Observation (if 'save_output' was not set to null in 'config/fhir_cfg.yml')
- no HTTP error file was generated in the log directory
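The three success criteria can also be checked from the command line. The following is a rough sketch under several assumptions that are not confirmed by the repository: the function 'check_fhir_run' is hypothetical, the log and output directories default to 'logs' and 'output' (both may be configured differently in 'config/fhir_cfg.yml'), HTTP error files are assumed to contain 'error' in their file name, and the output directory is assumed to hold only files from the current run.

```shell
#!/usr/bin/env bash
# Hypothetical post-run check; directory names and the error-file
# name pattern are assumptions, not taken from the repository.

# check_fhir_run [LOG_DIR] [OUT_DIR] [MIN_FILES]
# Pass MIN_FILES=0 if 'save_output' was set to null.
check_fhir_run() {
  local log_dir="${1:-logs}" out_dir="${2:-output}" min_files="${3:-5}"
  local status=0 latest n_out

  # 1) The newest timings log should end with the 'Run FHIR ETL' entry.
  latest="$(ls -t "$log_dir"/FHIR_timings_*.csv 2>/dev/null | head -n 1)"
  if [ -n "$latest" ] &&
     grep -v '^[[:space:]]*$' "$latest" | tail -n 1 | grep -q 'Run FHIR ETL'; then
    echo "OK    final timing entry found in $latest"
  else
    echo "FAIL  no timings log ending with 'Run FHIR ETL'"
    status=1
  fi

  # 2) One output file per resource is expected (five resources).
  n_out=$(( $(find "$out_dir" -maxdepth 1 -type f 2>/dev/null | wc -l) ))
  if [ "$n_out" -ge "$min_files" ]; then
    echo "OK    $n_out output file(s) in $out_dir"
  else
    echo "FAIL  only $n_out output file(s) in $out_dir, expected $min_files"
    status=1
  fi

  # 3) No HTTP error file should exist (name pattern is an assumption).
  if find "$log_dir" -maxdepth 1 -type f -iname '*error*' 2>/dev/null | grep -q .; then
    echo "FAIL  error file(s) present in $log_dir"
    status=1
  else
    echo "OK    no error files in $log_dir"
  fi

  return "$status"
}
```

Running 'check_fhir_run' from the repository root prints one line per criterion and returns a non-zero status if any check fails. Error files left over from earlier runs will also trigger the third check, so clear or archive old logs first if needed.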
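- This section will be filled in over the course of the test phase.
- Error 'could not find function "..."': an installed package version is outdated. Update the package to at least the minimum version listed above.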
We're very interested in your experience running the script and would be happy to receive any feedback, including comments, troubleshooting reports, questions, and suggested improvements.
Feedback on performance is especially useful to us, i.e.,
- the generated 'FHIR_timings_*.csv' log files, which do not contain sensitive data
- the optimal/feasible batch size configuration on the hardware used
Please feel free to message us on GitHub, open issues, or send an email to [email protected].
Published under CC-BY-4.0.