Basic Mapping Pipeline #112

Open
josephjclark opened this issue Nov 13, 2024 · 0 comments
This feature creates a basic mapping pipeline for the Mapping service.

The focus is on getting an end-to-end pipeline. Quality and UX are secondary at this stage.

The service needs to be exposed via REST on the apollo server.

Inputs

It should accept a payload like this:

{
  input: [], // array of strings to map. The first word in the string is the key
  vocabs: [], // string list of target vocabularies, by name, to map to.
            // I'm quite happy to just accept one string for the time being!
}
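
A minimal sketch of how such a payload might be parsed, written in Python purely for illustration. The names `MappingItem` and `parse_payload` are hypothetical and not part of the service. It assumes that the text after the first word is what gets mapped, which the description above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class MappingItem:
    key: str   # first word of the input string
    text: str  # assumption: the rest of the string is the text to map


def parse_payload(payload: dict) -> tuple[list[MappingItem], list[str]]:
    """Split each input string into key + text and normalise `vocabs` to a list."""
    raw_inputs = payload.get("input", [])
    vocabs = payload.get("vocabs", [])
    # A single vocabulary string is acceptable for now, so handle both shapes.
    if isinstance(vocabs, str):
        vocabs = [vocabs]

    items = []
    for raw in raw_inputs:
        # maxsplit=1 keeps everything after the first word intact.
        parts = raw.strip().split(maxsplit=1)
        if not parts:
            continue  # skip empty or whitespace-only strings
        key = parts[0]
        text = parts[1] if len(parts) > 1 else ""
        items.append(MappingItem(key=key, text=text))
    return items, list(vocabs)
```

For example, `parse_payload({"input": ["dx1 type 2 diabetes"], "vocabs": "demo-vocab"})` yields one item with key `dx1` and text `type 2 diabetes`, plus the vocabulary list `["demo-vocab"]`.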

Implementation Notes

For each value in the input list:

  • Use similarity analysis to find possible matches in the embeddings (i.e., load N possible matches from the database)
  • Call a model to pick the best mapping from the shortlist of possibilities
  • Fit the mapped value into the correct data structure

Repeat this process until all inputs are mapped.

Return a JSON structure.
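
The steps above can be sketched roughly as follows. This is a toy illustration under loud assumptions, not the intended implementation: `CONCEPTS` is a made-up in-memory stand-in for the vector database, `embed` is a bag-of-words placeholder for real embeddings, `pick_best` is a stub where the model call would go, and keying the result by the input's first word is a guess at the output shape.

```python
import math
from collections import Counter

# Made-up rows standing in for the vector database (not real vocabulary codes).
CONCEPTS = [
    {"system": "demo-vocab", "value": "D001", "display": "type 2 diabetes mellitus"},
    {"system": "demo-vocab", "value": "D002", "display": "essential hypertension"},
    {"system": "demo-vocab", "value": "D003", "display": "type 1 diabetes mellitus"},
    {"system": "other-vocab", "value": "X001", "display": "type 2 diabetes"},
]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real service would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def shortlist(text: str, vocab: str, n: int = 3) -> list[dict]:
    """Step 1: load the N most similar candidates from the chosen vocabulary."""
    query = embed(text)
    rows = [c for c in CONCEPTS if c["system"] == vocab]
    rows.sort(key=lambda c: cosine(query, embed(c["display"])), reverse=True)
    return rows[:n]


def pick_best(text: str, candidates: list[dict]):
    """Step 2: stub for the model call; here it simply takes the top candidate."""
    return candidates[0] if candidates else None


def map_inputs(inputs: list[str], vocab: str, n: int = 3) -> dict:
    """Run every input through shortlist -> pick_best -> coding structure."""
    results = {}
    for raw in inputs:
        key, _, text = raw.strip().partition(" ")
        if not key:
            continue  # skip empty strings
        best = pick_best(text, shortlist(text, vocab, n))
        # Step 3: fit the mapped value into { display, system, value }.
        results[key] = (
            {"display": best["display"], "system": best["system"], "value": best["value"]}
            if best
            else None
        )
    return results
```

Calling `map_inputs(["dx1 type 2 diabetes"], "demo-vocab")` maps `dx1` to the `D001` row of the toy data. The returned dict can be serialised directly with `json.dumps`.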

Limitations

Ironically this service isn't supposed to be particularly smart. Here are some big limitations:

  • You can only use vocabularies that are already defined in the vector database. It's a hard-coded, limited list
  • We expect a rigid input format. The difficult problem of how to take, e.g., a list of CommCare inputs and build them into the correct data structure is completely ignored. Sorry users, you've gotta do this bit yourself.
  • The output is fixed to a FHIR coding { display, system, value }
  • There is no validation
  • There is no ambiguity handling