Rework the literal profile idea #382
Changes from all commits
86a2905
b31d851
1c1a002
7755f56
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -49,6 +49,7 @@ enums: | |
meaning: rdfs:Datatype | ||
rdf property: | ||
meaning: rdf:Property | ||
sssom literal: A special type of literal that is used in SSSOM files to express that the subject_id is not an entity reference. | ||
predicate_modifier_enum: | ||
permissible_values: | ||
Not: Negating the mapping predicate. The meaning of the triple becomes subject_id is not a predicate_id match to object_id. | ||
|
@@ -118,15 +119,17 @@ slots: | |
subject_id: | ||
description: The ID of the subject of the mapping. | ||
range: EntityReference | ||
required: true | ||
Unfortunate side effect; I need some help ensuring that either this or
I don’t think this is feasible in pure LinkML.¹ So it would have to be done on top of LinkML, by adding a paragraph to the spec to explicitly state that
Implementations that use the Python LinkML runtime may need to be updated to cope with that change, if they were previously relying on the runtime to check that
¹ Feel free to ask people who know LinkML better than I do (so, approximately everyone) just in case. But as far as I can tell, we cannot express variable constraints in LinkML, such as “this slot is required IFF that other slot has this particular value”. |
||
mappings: | ||
- owl:annotatedSource | ||
slot_uri: owl:annotatedSource | ||
examples: | ||
- value: HP:0009894 | ||
description: The CURIE denoting the Human Phenotype Ontology concept of 'Thickened ears' | ||
literal: | ||
subject_literal: | ||
I was rather envisioning reusing
I know, but subject_label has a very different purpose than subject_literal. Your proposal effectively changes the purpose of subject_label if the "literal mode" is switched on, so I can probably be convinced to use subject_label as you say. My intention was to separate the "literal" and "entity" modes a tiny bit more cleanly, but I am probably just introducing churn... One more push with a ring finger and you will have me convinced!
Yes it does. And I think it is a better solution (not perfect, but better) than adding a new slot that is only meaningful when the “literal mode is on” (i.e. when
If we reuse
If we create
I am going to be blunt (blunter than usual, I mean :P ): the SSSOM data model has not been designed to separate things cleanly. A “clean design” would not have put all the metadata slots directly in the
Instead, a clean design would have another class specifically to represent an entity being mapped (let’s call it
This would have been much cleaner, and would have made manipulating the mappings and their entities much easier (inverting a mapping, for example, would simply have involved swapping the
My understanding, based on some old discussions that I have read in this repo, is that such a design has been deliberately avoided because there was a clear preference for a data model that was much closer to the TSV serialisation (i.e., where each column of the TSV maps directly to one slot in the
I do not criticize that choice, but now we have to stick to the “flat” model that we have as a result. Trying to introduce some clean object-oriented design on top of it would only make things more confusing. (Hence my refusal of the entire idea of having a separate |
||
description: The literal being mapped | ||
see_also: | ||
- https://mapping-commons.github.io/sssom/sssom-profiles/ | ||
- https://github.com/mapping-commons/sssom/issues/234 | ||
range: string | ||
required: true | ||
If you insist on having that separate slot (rather than reusing |
||
mappings: | ||
|
@@ -135,7 +138,7 @@ slots: | |
examples: | ||
- value: "Alzheimer" | ||
description: A string referring to some thing. | ||
literal_datatype: | ||
subject_literal_datatype: | ||
The only new field, the rest can be reused.
Is that really necessary? Do we expect people to map other values than pure strings (like, integers or dates)?
This specific one may not be, but others like
Just noting that if we decided to reuse |
||
description: The datatype of the literal being mapped | ||
range: uri | ||
required: false | ||
|
@@ -386,20 +389,6 @@ slots: | |
examples: | ||
- value: http://purl.obolibrary.org/obo/mondo/releases/2021-01-30/mondo.owl | ||
description: (A persistent Version IRI pointing to the Mondo version '2021-01-30') | ||
literal_source: | ||
description: URI of ontology source for the literal. | ||
range: EntityReference | ||
examples: | ||
- value: obo:mondo.owl | ||
description: A persistent OBO CURIE pointing to the latest version of the Mondo ontology. | ||
- value: wikidata:Q7876491 | ||
description: A Wikidata identifier for the Uberon ontology resource. | ||
literal_source_version: | ||
description: Version IRI or version string of the source of the literal. | ||
range: string | ||
examples: | ||
- value: http://purl.obolibrary.org/obo/mondo/releases/2021-01-30/mondo.owl | ||
description: (A persistent Version IRI pointing to the Mondo version '2021-01-30') | ||
object_source: | ||
description: URI of vocabulary or identifier source for the object. | ||
range: EntityReference | ||
|
@@ -526,13 +515,6 @@ slots: | |
examples: | ||
- value: semapv:Stemming | ||
- value: semapv:StopWordRemoval | ||
literal_preprocessing: | ||
description: Method of preprocessing applied to the literal. | ||
range: EntityReference | ||
multivalued: true | ||
examples: | ||
- value: semapv:Stemming | ||
- value: semapv:StopWordRemoval | ||
curation_rule: | ||
description: A curation rule is a (potentially) complex condition executed by an agent that led to the establishment of a mapping. | ||
Curation rules often involve complex domain-specific considerations, which are hard to capture in an automated fashion. The curation | ||
|
@@ -661,6 +643,8 @@ classes: | |
- subject_id | ||
- subject_label | ||
- subject_category | ||
- subject_literal | ||
- subject_literal_datatype | ||
- predicate_id | ||
- predicate_label | ||
- predicate_modifier | ||
|
@@ -698,52 +682,10 @@ classes: | |
- object_preprocessing | ||
- semantic_similarity_score | ||
- semantic_similarity_measure | ||
- see_also | ||
- issue_tracker_item | ||
- other | ||
- comment | ||
class_uri: owl:Axiom | ||
literal mapping: | ||
description: Represents an individual mapping between a literal and an entity. | ||
Note that this schema has been created on 01.08.2023 and is subject to change. | ||
see_also: | ||
- https://mapping-commons.github.io/sssom/sssom-profiles/ | ||
slots: | ||
- literal | ||
- literal_datatype | ||
- predicate_id | ||
- predicate_label | ||
- predicate_modifier | ||
- object_id | ||
- object_label | ||
- object_category | ||
- mapping_justification | ||
- author_id | ||
- author_label | ||
- reviewer_id | ||
- reviewer_label | ||
- creator_id | ||
- creator_label | ||
- license | ||
- literal_source | ||
- literal_source_version | ||
- object_type | ||
- object_source | ||
- object_source_version | ||
- mapping_provider | ||
- mapping_source | ||
- mapping_cardinality | ||
- mapping_tool | ||
- mapping_tool_version | ||
- mapping_date | ||
- confidence | ||
- object_match_field | ||
- match_string | ||
- literal_preprocessing | ||
- object_preprocessing | ||
- similarity_score | ||
- similarity_measure | ||
The diff is conveniently obfuscating the fact that these two slots are added to the mappings class. They should have been there from the start anyways, but just being transparent. |
||
- see_also | ||
- issue_tracker_item | ||
- other | ||
- comment | ||
class_uri: owl:Axiom | ||
|
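On the `required: true` thread for `subject_id` above: since a constraint of the form “this slot is required IFF that other slot has this particular value” apparently cannot be expressed in pure LinkML, the check would have to live in the implementations. A minimal Python sketch of what that could look like, under these assumptions (none of which is settled in this PR): the record is a plain dict keyed by slot name, the slot carrying the `sssom literal` value is `subject_type`, and the literal is stored in `subject_literal`. The function name `check_subject` is made up for illustration and is not part of any SSSOM or LinkML library.

```python
from typing import Any, List, Mapping

# Assumption: "sssom literal" is the entity-type value added in this PR,
# and `subject_type` is the slot that carries it.
LITERAL_TYPE = "sssom literal"


def check_subject(mapping: Mapping[str, Any]) -> List[str]:
    """Return the problems found on the subject side of one mapping record."""
    problems = []
    if mapping.get("subject_type") == LITERAL_TYPE:
        # Literal mode: the literal itself must be present.
        if not mapping.get("subject_literal"):
            problems.append(
                "subject_literal is required when subject_type is 'sssom literal'"
            )
    else:
        # Entity mode: the usual requirement on subject_id applies.
        if not mapping.get("subject_id"):
            problems.append(
                "subject_id is required unless subject_type is 'sssom literal'"
            )
    return problems
```

An empty list means the record passes this one check; everything else about the record is left to the regular LinkML validation.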
Current path based on your proposal @gouttegd:
sssom literal as a special case in the already existing entity_type field (as you suggested)
Why? Seems like a needless complication to me.
In particular, this means that literal mappings cannot possibly be inverted. And I don’t know about the use cases of those who would manipulate literal mappings, but personally I invert (non-literal) mappings all the time.
OK, saw your comment below about how this would “go against the whole idea of the format”.
Not convinced, sorry. Once we admit that we can have mapping where one side is a literal rather than an entity, I don’t see how it matters which side is the literal.
ooooookeeeeeeee :P I will fix it. Now sleeping time! :D
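To illustrate the inversion concern raised in this thread: in the flat model, inverting a mapping amounts to swapping every `subject_*` slot with its `object_*` counterpart and replacing the predicate with its inverse. The sketch below is an illustration only, not how any SSSOM tool implements it; `invert` and `INVERSE_PREDICATES` are hypothetical names, and the predicate table is deliberately incomplete.

```python
from typing import Any, Dict

# Hypothetical, incomplete table of predicate inverses (illustration only).
INVERSE_PREDICATES = {
    "skos:exactMatch": "skos:exactMatch",
    "skos:closeMatch": "skos:closeMatch",
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
}


def invert(mapping: Dict[str, Any]) -> Dict[str, Any]:
    """Swap the subject_* and object_* slots of a flat mapping record."""
    inverted: Dict[str, Any] = {}
    for slot, value in mapping.items():
        if slot.startswith("subject_"):
            inverted["object_" + slot[len("subject_"):]] = value
        elif slot.startswith("object_"):
            inverted["subject_" + slot[len("object_"):]] = value
        else:
            # Slots that belong to neither side are copied unchanged.
            inverted[slot] = value
    predicate = mapping.get("predicate_id")
    if predicate is not None:
        if predicate not in INVERSE_PREDICATES:
            raise ValueError(f"no known inverse for {predicate}")
        inverted["predicate_id"] = INVERSE_PREDICATES[predicate]
    return inverted
```

With the slots proposed in this diff, inverting a record that has `subject_literal` yields an `object_literal` key, which the schema as shown here does not define; that is the sense in which literal mappings cannot be inverted if only the subject side may be a literal.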