docs: 📝 expand on inclusions and exclusions #133

Merged
merged 33 commits into from
Dec 19, 2024
Commits
90c7492
Fleshed out and updated include_gld_purchases() flow documentation
Sep 18, 2024
0504532
Added description of podiatrist services function flow
Sep 18, 2024
7777f36
Reformated some GLD text, added HbA1c and started on pregnancy dates
Sep 18, 2024
6382624
Reworded include_hba1c section
Sep 19, 2024
4fd5903
Added lpr-joins, started on describing lpr processing
Sep 19, 2024
29fea86
Finished LPR/diagnosis part of function flow
Sep 19, 2024
f9d7661
fixed a new things to describe LPR3 processing
Sep 19, 2024
9a05d81
specified that only primary diagnoses go into type classification
Sep 19, 2024
f03a4da
Update vignettes/function-flow.Rmd
Aastedet Sep 19, 2024
bc889d4
switched the order of inclusion sections and mentioned that some of t…
Sep 19, 2024
8d60bd0
Merge branch 'update-function-flow' of https://github.com/steno-aarhu…
Sep 19, 2024
9a74ea0
Merge branch 'main' into update-function-flow
Aastedet Sep 20, 2024
7525b60
fixed spec to speciale variable name
Sep 20, 2024
25db86d
Merge branch 'update-function-flow' of https://github.com/steno-aarhu…
Sep 20, 2024
092824e
Removed "name" or "vnr" variables from GLD function flow.
Sep 20, 2024
20f5886
Updates join_lpr function description to filter to necessary diagnoses.
Sep 20, 2024
61b5d27
Removed section on weightloss drugs, since we're no longer including …
Sep 20, 2024
7b9738d
Update vignettes/function-flow.Rmd
Aastedet Sep 20, 2024
3a95d4f
Added description of exclude_potential_pcos()
Sep 20, 2024
f663844
Merge branch 'update-function-flow' of https://github.com/steno-aarhu…
Sep 20, 2024
fe257a6
Renamed some variables.
Sep 20, 2024
4bba18e
Added censoring/exclusion function description
Sep 20, 2024
7cca920
Added correct diagnoses to filter to in lpr_join() functions.
Sep 27, 2024
b40412c
changed specialty values to align with the PR with a refactored creat…
Sep 27, 2024
35118e8
Joining inclusions and definition. Looking to add type classification.
Dec 16, 2024
cbee21f
Removed helper function for dropping first event as it seemed a bit e…
Dec 17, 2024
3820459
Added function flow description of get_diabetes_type() and its helper…
Dec 17, 2024
d294071
docs: :pencil2: small edits from review
lwjohnst86 Dec 18, 2024
159ab16
Merge branch 'main' of https://github.com/steno-aarhus/osdc into upda…
lwjohnst86 Dec 18, 2024
b94e36d
Merge branch 'update-function-flow' of https://github.com/steno-aarhu…
lwjohnst86 Dec 18, 2024
da55507
Update vignettes/function-flow.Rmd
Aastedet Dec 19, 2024
8cc3cee
Merge branch 'main' of https://github.com/steno-aarhus/osdc into upda…
lwjohnst86 Dec 19, 2024
822da79
Merge branch 'update-function-flow' of https://github.com/steno-aarhu…
lwjohnst86 Dec 19, 2024
Reformated some GLD text, added HbA1c and started on pregnancy dates
Anders Aasted Isaksen authored and Anders Aasted Isaksen committed Sep 18, 2024

commit 7777f367bc53d521282777e0bec41e4e8c6a0817
62 changes: 41 additions & 21 deletions vignettes/function-flow.Rmd
@@ -95,9 +95,16 @@ outputs.](images/function-flow-population.png)
 #### HbA1c tests above 48 mmol/mol

 The function `include_hba1c()` uses `lab_forsker` as the input data to
-extract all events of tests above 48 mmol/mol.
+extract the dates of all elevated HbA1c test results: $\geq$ 48 mmol/mol
+(or $\geq$ 6.5% in DCCT units). To support DCCT units, the function
+converts the value of these to IFCC units internally before including
+all rows with `value` $\geq$ 48 and deduplicating multiple elevated
+results on the same day within each individual.

-<!-- TODO: Add details on how this filtering should be done -->
+`include_hba1c()` passes a 3-column data frame containing the identifier
+variable (`pnr`) and the dates of all elevated HbA1c test results. This
+is passed to the `exclude_pregnancy()` function for censoring of
+elevated results due to potential gestational diabetes (see below).

 #### Hospital diagnosis of diabetes

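The HbA1c step added in the hunk above (convert DCCT values to IFCC, keep rows with `value` of at least 48, de-duplicate per individual and day) could be sketched in base R roughly as follows. This is an illustration under stated assumptions, not the package's implementation: the `unit` and `date` columns, the function names, and the rounding of converted values are all hypothetical. Only `pnr` and `value` come from the vignette text. The conversion uses the standard IFCC-NGSP master equation; rounding is assumed here so that 6.5% maps to 48 mmol/mol rather than 47.5.

```r
# Hypothetical sketch: `unit`, `date`, and both function names are
# assumptions for illustration; `pnr` and `value` follow the vignette.
dcct_to_ifcc <- function(dcct_percent) {
  # IFCC (mmol/mol) = 10.929 * (DCCT % - 2.15), rounded to whole units.
  # The rounding is an assumption so that 6.5% corresponds to 48 mmol/mol.
  round(10.929 * (dcct_percent - 2.15))
}

include_hba1c_sketch <- function(lab) {
  # Convert DCCT results to IFCC units before applying the threshold.
  is_dcct <- lab$unit == "dcct"
  lab$value[is_dcct] <- dcct_to_ifcc(lab$value[is_dcct])
  # Keep elevated results only, then one row per individual per day.
  elevated <- lab[lab$value >= 48, c("pnr", "date"), drop = FALSE]
  elevated <- elevated[!duplicated(elevated), , drop = FALSE]
  rownames(elevated) <- NULL
  elevated
}

# Made-up example data.
lab <- data.frame(
  pnr = c("A", "A", "A", "B", "B"),
  date = as.Date(c("2020-01-01", "2020-01-01", "2020-03-01",
                   "2020-02-01", "2020-02-02")),
  value = c(50, 55, 40, 6.5, 6.4),
  unit = c("ifcc", "ifcc", "ifcc", "dcct", "dcct")
)
result <- include_hba1c_sketch(lab)
```

In this example, the two elevated results for individual A on the same day collapse into one row, and only the 6.5% result for individual B passes the threshold after conversion.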
@@ -125,26 +132,40 @@ These dates are extracted by filtering values beginning with "54" in the
 instead, if that is the data available to the user). In addition,
 services provided to a child of the individual (`barnmak` != 0) are
 excluded using the `barnmak` variable. An internal helper function
-`get_unique_honuge_dates()` is applied to generate a date variable
-(`regdate`) based on the year-week (wwyy-formatted) variable (`honuge`)
-in the raw data, and de-duplicates multiple services registered on the
-same date. Ultimately, `include_podiatrist_services()` outputs only the
-identifier variable (`pnr`) and date of the service (`regdate`) to the
-`get_diagnosis_date()` function for the final step of the inclusion
-process.
+`get_unique_honuge_dates()` is applied to generate a proper date
+variable based on the year-week (wwyy-formatted) variable (`honuge`)
+found in the raw data, and de-duplicates multiple services registered on
+the same date.
+
+`include_podiatrist_services()` outputs a 3-column data frame containing
+the identifier variable (`pnr`) and the date of the two earliest records
+of diabetes-specific podiatrist services for each individual. This is
+passed to the `get_diagnosis_date()` function for the final step of the
+inclusion process.

 #### GLD purchases

 The function `include_gld_purchases()` uses `lmdb` to extract the dates
 of all GLD purchases.

 These dates are extracted by filtering values beginning with "A10" in
-the `atc` variable of the `lmdb` register. In addition to the identifier
-variable (`pnr`) and date (`eksd`), additional information needed for
-censoring or for classification of diabetes type are also extracted: the
-type of drug (`atc`), the amount purchased (`volume` and `apk`), the
-indication code (`indo`), and its brand name or vnr-number (`name` or
-`vnr`). These events are then passed to a chain of exclusion functions:
+the `atc` variable of the `lmdb` register. Since the diagnosis code data
+on pregnancies (see below) is insufficient to perform censoring prior to
+1997, `include_gld_purchases()` only extracts dates from 1997 onward by
+default (if Medical Birth Register data is available to use for
+censoring, the extraction window can be extended).
+
+This function outputs a `data.frame` with the following variables needed
+later in the classification part of the function flow:
+
+- identifier variable (`pnr`)
+- date (`eksd`)
+- type of drug (`atc`)
+- amount purchased (`volume` and `apk`)
+- indication code (`indo`)
+- brand name or vnr-number (`name` or `vnr`)
+
+These events are then passed to a chain of exclusion functions:
 `exclude_wld_purchases()`, `exclude_potential_pcos()`,
 `exclude_pregnancy()` described in the sections below.

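The GLD purchase extraction described in the hunk above (ATC codes beginning with "A10", dates from 1997 onward by default, a fixed set of output variables) might look roughly like this in base R. It is a sketch under assumptions: the function name and the `from` argument are hypothetical, and `name` is used in place of `vnr` purely for illustration. The column names themselves are those listed in the vignette text.

```r
# Hypothetical sketch: the function name and the `from` argument are
# assumptions; column names follow the list in the vignette text.
include_gld_purchases_sketch <- function(lmdb, from = as.Date("1997-01-01")) {
  # GLD purchases are identified by ATC codes beginning with "A10".
  is_gld <- startsWith(lmdb$atc, "A10")
  # By default, keep only purchases from 1997 onward.
  in_window <- lmdb$eksd >= from
  keep <- c("pnr", "eksd", "atc", "volume", "apk", "indo", "name")
  out <- lmdb[is_gld & in_window, keep, drop = FALSE]
  rownames(out) <- NULL
  out
}

# Made-up example data; `indo` and `name` hold placeholders only.
lmdb <- data.frame(
  pnr = c("A", "A", "B", "C"),
  eksd = as.Date(c("1996-05-01", "1998-05-01", "2001-01-01", "2005-06-01")),
  atc = c("A10BA02", "A10BA02", "C10AA01", "A10AE04"),
  volume = c(100, 100, 30, 15),
  apk = c(1, 2, 1, 1),
  indo = NA_character_,
  name = "placeholder"
)
gld_default <- include_gld_purchases_sketch(lmdb)
gld_extended <- include_gld_purchases_sketch(lmdb, from = as.Date("1995-01-01"))
```

With the default window, the 1996 purchase and the non-A10 purchase are dropped. Passing an earlier `from` date mimics extending the extraction window when other pregnancy data is available for censoring.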
@@ -158,11 +179,6 @@ inputs to two sets of functions:
 `get_insulin_is_two_thirds_of_gld_doses()` helper functions for the
 classification of diabetes type.

-Since the diagnosis code data on pregnancies is insufficient to perform
-censoring prior to 1997, `include_gld_purchases()` only extracts dates
-from 1997 onward by default (if Medical Birth Register data is available
-to use for censoring, the extraction window can be extended).
-
 ### Exclusion events

 #### HbA1c tests and GLD purchases during pregnancy
@@ -173,7 +189,11 @@ pregnancy, as these may be due to gestational diabetes, rather than type
 1 or type 2 diabetes.

 Internally, this relies on the function `get_pregnancy_dates()` that
-contains the following three helper functions:
+uses diagnoses registered in the National Patient Register to extract
+the dates of all pregnancy ending (live births or miscarriages). These
+are identified by filtering values beginning with "DO0[0-6]", "DO8[0-4]"
+or "DZ3[37]" in the `c_diag` variable in the LPR2 data (`diagnosekode`
+in LPR3 data).

 <!-- TODO: Add details on how this filtering should be done -->
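
The diagnosis-code filter described in the hunk above can be expressed as a single anchored regular expression. The sketch below uses base R only. The variable names and the example codes are made up for illustration, and the actual package may implement the filter differently.

```r
# Hypothetical sketch: variable names and example codes are illustrative.
# The three prefixes are those quoted in the vignette text, anchored at
# the start of the diagnosis code.
pregnancy_end_pattern <- "^(DO0[0-6]|DO8[0-4]|DZ3[37])"

# Stand-in for `c_diag` (LPR2) or `diagnosekode` (LPR3) values.
diagnosis_codes <- c("DO041", "DO800", "DZ371", "DE119", "DO07", "DZ34")

is_pregnancy_end <- grepl(pregnancy_end_pattern, diagnosis_codes)
pregnancy_end_codes <- diagnosis_codes[is_pregnancy_end]
```

The `^` anchor matters here: without it, a code that merely contains one of the prefixes somewhere after the first character would also match.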