Add optional ORCID provenance column #31

Merged
merged 4 commits into main from provenance-model
Jan 8, 2025

Collaborator

@cthoyt cthoyt commented Jan 8, 2025

Closes #30

This PR does the following:

  • Adds support for an optional ORCID column (labeled "contributor") in each curated TSV
  • Adds two example ORCID annotations
  • Updates the output to incorporate ORCID provenance in the SSSOM and ontology outputs
  • Updates the data integrity tests (I will do this once the PR gets positive feedback and the questions below are answered)

It has the minor drawback of creating a big diff, but it's mostly localized to generated artifacts.
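
To make the optional column concrete, here is a minimal sketch of how a curated TSV could be read whether or not it has a "contributor" column. It is not the code from this PR: the function name `read_curated_rows` and the row values are hypothetical, the header names are taken from the test output later in this thread, and the ORCID shown is the example iD from ORCID's own documentation.

```python
import csv
import io

OPTIONAL_COLUMN = "contributor"  # column label proposed in this PR


def read_curated_rows(handle):
    """Yield one dict per TSV row (hypothetical helper, not from the PR).

    Rows from files without the optional column, and rows where the
    cell is empty, both end up with contributor set to None.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        orcid = (row.get(OPTIONAL_COLUMN) or "").strip()
        row[OPTIONAL_COLUMN] = orcid or None
        yield row


# Placeholder data: one file without the column, one with it.
old_style = "#did\twhen\tnextofkin\nD1\t2020\tD2\n"
new_style = (
    "#did\twhen\tnextofkin\tcontributor\n"
    "D1\t2020\tD2\t0000-0002-1825-0097\n"
    "D3\t2021\tD4\t\n"
)

old_rows = list(read_curated_rows(io.StringIO(old_style)))
new_rows = list(read_curated_rows(io.StringIO(new_style)))
```

Treating a missing column and an empty cell the same way keeps downstream code from having to care which files have been annotated yet.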

Questions before merging this, if everyone thinks it's a good idea:

  1. Should we enforce that the contributor column is always present, even if it's mostly empty?
  2. Should we curate a list of ORCID->Name mappings to make these outputs (namely, the ontology) nicer to read in Protege? It's also possible to quickly interact with the ORCID REST API to look up names as an alternative.
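
As a rough sketch of the curated-mapping option in question 2, the snippet below pairs a small ORCID-to-name dictionary with a format and checksum check, so that mistyped identifiers are caught before they reach the outputs. Everything here is an assumption for illustration: `ORCID_NAMES`, `orcid_checksum_ok`, and `contributor_label` are hypothetical names, and the single entry is ORCID's fictitious example record. The check digit follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers.

```python
import re

# Four hyphen-separated groups of four; the final character may be X.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")

# Hypothetical curated mapping; the entry is ORCID's fictitious example record.
ORCID_NAMES = {"0000-0002-1825-0097": "Josiah Carberry"}


def orcid_checksum_ok(orcid: str) -> bool:
    """Return True if the string is well-formed and its check digit matches."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:  # all characters except the check digit
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[-1] == expected


def contributor_label(orcid: str) -> str:
    """Return the curated name if known, otherwise the ORCID URI."""
    return ORCID_NAMES.get(orcid, f"https://orcid.org/{orcid}")
```

Falling back to the URI means uncurated contributors are still attributed, just less readably; names looked up through the ORCID REST API could be cached into the same mapping.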

@cthoyt cthoyt requested a review from egonw January 8, 2025 11:31
Member

@egonw egonw left a comment

I am not convinced this simple provenance model will be final (it's very limited), but let's give it a try.

Can you fix the error?

E               AssertionError: Lists differ: ['#did', 'when', 'nextofkin'] != ['#did', 'when', 'nextofkin', 'contributor']
E               
E               Second list contains 1 additional elements.
E               First extra element 3:
E               'contributor'
E               
E               - ['#did', 'when', 'nextofkin']
E               + ['#did', 'when', 'nextofkin', 'contributor']
E               ?                             +++++++++++++++
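
The failure above comes from a test that expects exactly three header columns. One way such a check could accept the optional fourth column is sketched below; this is an assumption about a possible fix, not necessarily what was committed, and `header_is_valid` is a hypothetical name.

```python
REQUIRED_HEADER = ["#did", "when", "nextofkin"]  # from the assertion above
OPTIONAL_HEADER = ["contributor"]  # may only appear as a trailing column


def header_is_valid(header):
    """Accept the required columns, optionally followed by 'contributor'."""
    return header in (REQUIRED_HEADER, REQUIRED_HEADER + OPTIONAL_HEADER)
```

If the answer to question 1 is that the column must always be present, the check reduces to comparing against the four-column header only.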

@cthoyt cthoyt marked this pull request as ready for review January 8, 2025 13:16
Collaborator Author

cthoyt commented Jan 8, 2025

@egonw done

@cthoyt cthoyt changed the title from "Demonstrating adding a row-wise ORCID column" to "Add optional ORCID provenance column" Jan 8, 2025
@cthoyt cthoyt requested a review from egonw January 8, 2025 14:25
@egonw egonw merged commit 20dc268 into main Jan 8, 2025
1 check passed
Member

egonw commented Jan 8, 2025

Thanks, @cthoyt

@cthoyt cthoyt deleted the provenance-model branch January 8, 2025 20:09
Development

Successfully merging this pull request may close these issues.

Add contributor ORCID column
2 participants