-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor visit and cost models, plus other small changes #111
base: main
Are you sure you want to change the base?
Conversation
@@ -1,33 +1,15 @@ | |||
/* emergency visits */ | |||
/* collapse er claim lines with no days between them into one visit */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i chose to remove this bit of logic as it complicates the modeling, and i'm not sure we should implement this as a blanket assumption (i.e., it's plausible that someone has separate visits to the ER or urgent care 2 days in a row)
FROM {{ ref( 'stg_synthea__encounters') }} | ||
WHERE | ||
encounter_class = 'inpatient' | ||
OR (encounter_class IN ('ambulatory', 'wellness', 'outpatient', 'emergency', 'urgentcare') AND encounter_start_date != encounter_stop_date) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any visit not labeled inpatient but that lasts more than 1 day is considered an inpatient visit. i've seen this done in other ETLs.
i could see the argument that multiday OP visits are actually bad data, though, so i was on the fence about making this assumption. we could also just keep them as OP visits and let the analyst figure it out. it probably depends on the data source. (here, since it's fake data, there is prob no right answer).
, min(e.end_datetime) AS visit_end_datetime | ||
FROM {{ ref( 'stg_synthea__encounters') }} AS v | ||
min(a.encounter_id) OVER (PARTITION BY a.patient_id, a.encounter_start_date) AS encounter_id | ||
, a.encounter_id AS original_encounter_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
most of the code in this model is unchanged (rows just got shifted around). this line is an exception. i'm retaining the original encounter ID so we can map all the rolled up encounters during an IP stay back to the visit occurrence for that stay.
, {{ dbt.cast("null", api.Column.translate_type("integer")) }} AS country_concept_id | ||
, {{ dbt.cast("null", api.Column.translate_type("varchar")) }} AS country_source_value | ||
, p.patient_latitude AS latitude | ||
, p.patient_longitude AS longitude |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran into fanning issues related to the fact that we were including lat/long in the location model, but that location_source_value (our join key from person to location) was only based on the address. I felt it best just to exclude these columns as most sources will not include this data.
@burrowse in case it's of interest, this PR includes changes which allow us to properly populate visit_detail (rather than duplicating VO with different IDs like is currently done in ETL-Synthea). |
My initial intention was to knock out some of the minor issues in our backlog, but one thing led to another and I ended up refactoring our visit and cost models. They're simpler and more performant, now, and we have a proper visit_detail table rather than a placeholder.
This PR includes the following changes. See my inline comments for more detail. Sorry the PR is so huge; I know it makes it harder to review!
visit_occurrence_id
- now happens in a single model rather than being spread across 3visit_detail
tabletotal_paid
andpaid_by_patient
columns - it wasn't clear to me from Synthea docs that we could make this leap. So I nulled these columns out insteadclaims
andclaims_transactions
. I will file an issue for this work