Experimental: defining "extension" slot as a list of key-value pairs #263
Closed
@@ -0,0 +1,20 @@
# curie_map:
#   FBbt: http://purl.obolibrary.org/obo/FBbt_
#   UBERON: http://purl.obolibrary.org/obo/UBERON_
#   ZFA: http://purl.obolibrary.org/obo/ZFA_
#   MYINTERNALNAMESPACE: http://my.internal.ns/vocab/
#   semapv: https://w3id.org/semapv/
#   skos: http://www.w3.org/2004/02/skos/core#
#   dc: http://purl.org/dc/terms/
# license: https://w3id.org/sssom/license/unspecified
# mapping_set_id: https://w3id.org/sssom/mappings/12345287613876278135
# extension_definition:
#   - key: modified
#     value: dc:modified
#   - key: mapping_id
#     value: MYINTERNALNAMESPACE:mapping_id
#   - key: funded_by
#     value: foaf:fundedBy
subject_id	predicate_id	object_id	mapping_justification	modified	mapping_id	funded_by
UBERON:0000004	skos:exactMatch	ZFA:0000047	semapv:UnspecifiedMatching	2022-03-01	M6972619762	NIH:RM1-HG010860-01
UBERON:0000005	skos:exactMatch	FBbt:00005157	semapv:UnspecifiedMatching	2022-02-01	M98879876121	NIH:RM1-HG010860-01
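To make the proposed layout concrete, here is a minimal Python sketch of how a reader might pick up the `extension_definition` list from the commented header and match it against the TSV columns. It is not taken from any existing SSSOM tool: the function names `split_sssom_tsv` and `read_extension_definition` are hypothetical, the header is assumed to be indented as ordinary YAML, and the hand-rolled parsing only covers the flat key/value layout shown in this file. A real reader would more likely pass the header to a YAML parser.

```python
# Shortened copy of the example file; the indentation of the header is an assumption.
SAMPLE = """\
# curie_map:
#   dc: http://purl.org/dc/terms/
# mapping_set_id: https://w3id.org/sssom/mappings/12345287613876278135
# extension_definition:
#   - key: modified
#     value: dc:modified
#   - key: mapping_id
#     value: MYINTERNALNAMESPACE:mapping_id
subject_id\tpredicate_id\tobject_id\tmapping_justification\tmodified\tmapping_id
UBERON:0000004\tskos:exactMatch\tZFA:0000047\tsemapv:UnspecifiedMatching\t2022-03-01\tM6972619762
"""


def split_sssom_tsv(text):
    """Split the text into (header_lines, table_lines).

    Header lines are the leading lines that start with '#'; the marker and
    one optional following space are stripped from each of them.
    """
    header, table = [], []
    for line in text.splitlines():
        if line.startswith("#") and not table:
            body = line[1:]
            header.append(body[1:] if body.startswith(" ") else body)
        elif line.strip():
            table.append(line)
    return header, table


def read_extension_definition(header_lines):
    """Collect the key/value pairs listed under 'extension_definition:'."""
    extensions = {}
    in_block = False
    current_key = None
    for line in header_lines:
        stripped = line.strip()
        if stripped == "extension_definition:":
            in_block = True
            continue
        if not in_block:
            continue
        if stripped.startswith("- key:"):
            current_key = stripped[len("- key:"):].strip()
        elif stripped.startswith("value:") and current_key is not None:
            extensions[current_key] = stripped[len("value:"):].strip()
            current_key = None
        elif not line.startswith(" "):
            break  # an unindented line starts the next top-level key
    return extensions


header, table = split_sssom_tsv(SAMPLE)
extensions = read_extension_definition(header)
columns = table[0].split("\t")
# Columns that the header declares as extensions, in table order.
extension_columns = [c for c in columns if c in extensions]
```

With this layout, any column that appears in neither the standard schema nor `extensions` can be flagged as undefined by the reader.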
What about the risk of clashes if a future version of SSSOM defines, say, `mapping_id` as a standard field?
I’d suggest mandating that columns defined by this extension mechanism should be prefixed, e.g. with `ext_` or similar.
That is, if an extension is defined as follows:
Then in the TSV file it should appear as:
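One possible reading of the prefix proposal, sketched in Python. This is an illustration only, not the commenter's own example: the helper names `tsv_column_name` and `split_columns` are hypothetical, and `ext_` is simply the prefix floated in the comment.

```python
EXT_PREFIX = "ext_"  # prefix suggested in the comment; the exact string is still open


def tsv_column_name(extension_key, prefix=EXT_PREFIX):
    """Return the column name an extension key would get in the TSV table."""
    if extension_key.startswith(prefix):
        return extension_key
    return prefix + extension_key


def split_columns(columns, prefix=EXT_PREFIX):
    """Partition table columns into (standard, extension keys) by prefix alone."""
    standard = [c for c in columns if not c.startswith(prefix)]
    extension = [c[len(prefix):] for c in columns if c.startswith(prefix)]
    return standard, extension


columns = ["subject_id", "predicate_id", tsv_column_name("mapping_id")]
standard, extension = split_columns(columns)
```

The appeal of this scheme is that a reader can tell extension columns from standard ones without knowing which SSSOM version the file targets, so a later standard `mapping_id` slot cannot collide with `ext_mapping_id`.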
Either that or adding a SSSOM version number in the mapping set metadata.
This is exactly what I was musing about before I stopped working on it. The downside of using "ext" prefixes for column names is that I already know projects using SSSOM that really don't want to change their column names, which are part of a large metadata model for mappings (with a few organisation-specific columns like, ahem, species, Grant_id, and other shenanigans). I would have much preferred the prefix solution for cleanliness.
My thought was that if a custom column is defined in the mapping set header, it trumps any definition in the SSSOM schema. This is super messy, but it's unclear to me how I should even approach this question.
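The "header trumps schema" idea amounts to a lookup order. A rough Python sketch follows, under loud assumptions: `SCHEMA_SLOTS` is a made-up stand-in for whatever a future SSSOM schema defines (in particular, `sssom:mapping_id` is hypothetical), and `resolve_column` is not an existing function in any SSSOM library.

```python
# Hypothetical subset of slots that some future SSSOM schema might define.
SCHEMA_SLOTS = {
    "subject_id": "sssom:subject_id",
    "mapping_id": "sssom:mapping_id",  # assumed future slot, for illustration only
}


def resolve_column(column, extension_definition, schema_slots=SCHEMA_SLOTS):
    """Return (property, source), letting the mapping set header win over the schema."""
    if column in extension_definition:
        return extension_definition[column], "header"
    if column in schema_slots:
        return schema_slots[column], "schema"
    return None, "undefined"


header_extensions = {"mapping_id": "MYINTERNALNAMESPACE:mapping_id"}
resolved = resolve_column("mapping_id", header_extensions)
```

The messiness mentioned above shows up directly: the same column name resolves to different properties depending on whether a given file declares it in its header.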
My suggestion:
`reserved`, meaning that future SSSOM versions will never reuse those names.
It may not be the cleanest solution, but it’s a pretty common way of dealing with existing legacy stuff in specifications and standards.
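As a sketch of what a reserved-names rule could mean in practice, the Python check below rejects candidate names for new standard slots that are either reserved or carry the extension prefix. Everything here is an assumption for illustration: `RESERVED_LEGACY_COLUMNS` borrows `species` and `Grant_id` from the discussion as placeholder entries, and `check_new_slot_names` is a hypothetical helper, not part of any SSSOM tooling.

```python
# Placeholder list; an actual reserved list would have to be agreed on by the maintainers.
RESERVED_LEGACY_COLUMNS = frozenset({"species", "Grant_id"})


def check_new_slot_names(proposed_slots, reserved=RESERVED_LEGACY_COLUMNS, prefix="ext_"):
    """Return the proposed names that a future spec version must not adopt."""
    return sorted(
        name
        for name in proposed_slots
        if name in reserved or name.startswith(prefix)
    )


conflicts = check_new_slot_names(["mapping_id", "species", "ext_foo"])
```

A check like this would run on the specification side when new slots are proposed, so existing mapping sets would not need to change.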
This is a very French way to go about it, while the real world clearly works under Mediterranean principles. People will just not care about this constraint in the spec; the only consequence it will have is that we can't use normal processing tools on many mapping sets in the wild. I know I sound negative but man, after 10 years of lobbying with great charm and all, we barely managed to get people to agree to publish ontologies in a common syntax. It has proven nearly impossible to tell these same people to use a common set of annotation properties to describe their metadata.
On the other hand, I can't see a different solution either! I am 51% inclined to go your way. I wouldn't even bother to enumerate existing columns, they are too messy anyway. Just force the `ext_`.
This is a real-world approach. Look at any IETF RFC, any IANA registry: they are littered with this kind of “reserved” attribute, keyword, or identifier. Most of them exist for only one reason: to deal with this kind of mess and the fact that many people don’t give a shit about standards, and standards have to take that into account.