Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add vital signs mappings for MIMIC-IV v2.0 - SSSOM format #1417

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

a-chahin
Copy link
Contributor

@a-chahin a-chahin commented Nov 8, 2022

This pull request adds two mapping csv files for vital signs concepts from the d_items and chartevents tables in MIMIC-IV v2.0. The first file chartevents_to_loinc.csv contains itemid to LOINC mappings. The second file chartevents_to_omop.csv contains itemid to OMOP mappings.

Both files use the simple standard for sharing ontology mappings SSSOM format @sssom

@tompollard
Copy link
Member

Thanks @a-chahin! Same comments as: #1418 and #1419 (comment). Also tagging @matentzn in case he would like to add his thoughts.

mimic:227632,Arctic Sun/Alsius Temp #1 C,skos:closeMatch,loinc:8310-5,Body temperature,HumanCurated,orcid:0000-0001-8822-1884,orcid:0000-0002-9348-9284,0.8,
mimic:227634,Arctic Sun/Alsius Temp #2 C,skos:closeMatch,loinc:8310-5,Body temperature,HumanCurated,orcid:0000-0001-8822-1884,orcid:0000-0002-9348-9284,0.8,
mimic:226707,Height,skos:exactMatch,loinc:8302-2,Body height,HumanCurated,orcid:0000-0001-8822-1884,orcid:0000-0002-9348-9284,1,
mimic:226730,Height (cm),skos:exactMatch,loinc:8302-2,Body height,HumanCurated,orcid:0000-0001-8822-1884,orcid:0000-0002-9348-9284,1,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some comments in #1418 so I don't repeat them here, but I am just nothing that there is something odd about using exactMatch for unit-restriced terms mapped to unit-unrestricted terms. Not necessarily wrong, just.. Should be considered!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @matentzn! Do you think "skos:closeMatch" is a better predicate_id in this case?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure. Probably better, but broadMatch may also be worthy of consideration, if you consider:

height (cm) ---> height it almost seems like height (cm) is simply a narrower interpretation of a height. But you may be right as well.. I don't know, sorry!

@tompollard
Copy link
Member

Thanks again @matentzn

@tompollard
Copy link
Member

@danamouk it looks like your new pull request at #1660 is duplicating a lot of the work in this pull request. I think it would be best to:

  • review this pull request, and then merge if appropriate
  • add any new mappings or fixes to this mapping table

If Northwestern aren't able to provide their mapping in SSSOM format, then we should, if possible, convert to SSSOM format for them. If there are columns that are difficult to populate, then we can leave them empty and complete them later.

Copy link

@danamouk danamouk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tompollard for the suggestions. I have added the new mappings for both mimic and northwestern across both loinc and omop to comply with the SSSOM data structure.

Copy link

@matentzn matentzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. CSV is not a well defined format, I would suggest TSV.
  2. HumanCurated is from an old version of sssom, use semapv:ManualMappingCuration instead. See example here: https://mapping-commons.github.io/sssom/tutorial/#automated-processing-1-creating-an-embedded-sssom-file

Otherwise looks great!

@danamouk
Copy link

danamouk commented Nov 7, 2023

Thanks @matentzn for the feedback! I have updated the mapping justification to semapv:ManualMappingCuration and updated the CSV to TSV file. @tompollard we can merge the changes if everything looks good!

@matentzn
Copy link

matentzn commented Nov 8, 2023

This looks great!

For the future, you may want to consider to set up a proper mapping registry (https://mapping-commons.github.io/sssom/mapping-commons/). This allows you to get features like mapping set validation (schema validation, and more) as part of your pull requests.

@tompollard
Copy link
Member

@matentzn, please could you say more about the following comment?

CSV is not a well defined format, I would suggest TSV.

We've always used CSV as our default format and it is fairly well defined as part of RFC 4180.

@danamouk, TSV is inconsistent with (1) the other data files in this repository and (2) the data files that you will be releasing as part of the Northwestern project.

@matentzn
Copy link

matentzn commented Nov 8, 2023

It's not a big deal, but we have always pushed for TSV from everything, because, among others, of the need to escape the (frequent) commas in entity labels.. you can scroll through here for details: https://mapping-commons.github.io/sssom/spec/#serialisation

It also has the advantage that standard tools (esp. validators) can process the files more easily, but I guess this is debatable.

@tompollard
Copy link
Member

It's not a big deal, but we have always pushed for TSV from everything, because, among others, of the need to escape the (frequent) commas in entity labels

Following the RFC spec by quoting strings (or at least strings with commas) solves the issue of commas in labels. " characters would need escaping, but I'm guessing these don't appear especially frequently in the labels?

Tabs appear pretty frequently in the strings that we work with, and the benefit of commas over tabs is that they are easier to see!

@matentzn
Copy link

matentzn commented Nov 8, 2023

Sure,sounds good!

If you get make an issue at https://github.com/mapping-commons/sssom/issues explaining why you prefer to, and will, push for CSv, that will be helpful in prioritising respective tool support! Should not be an issue.

Sorry about the suggestion wit TSV!

@tompollard
Copy link
Member

If you get make an issue at https://github.com/mapping-commons/sssom/issues explaining why you prefer to, and will, push for CSV, that will be helpful in prioritising respective tool support

Thanks, I'm glad you raised this point and it is helpful to talk this through. It's not so much that I think one is better than the other, more that we made a decision to use CSV in the past and I'm trying to understand whether there are good reasons to switch.

@danamouk
Copy link

danamouk commented Nov 9, 2023

Thanks @tompollard & @matentzn for your suggestions on our initial vital signs mapping! This will be really helpful as we map the rest of the mappings to SSSOM format. I have retained the csv file format and updated the mapping justification to semapv:ManualMappingCuration.

@tompollard if we are happy with the changes, feel free to merge this pull request #1417 , and close #1660 and we can update any issues or changes that come up in another pull request!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants