Add 'composed entity' value to the EntityType enum. #403

Merged
matentzn merged 2 commits into master from add-composed-entity-type on Dec 11, 2024

Conversation

gouttegd
Contributor

@gouttegd gouttegd commented Dec 2, 2024

Resolves #402

  • docs/ have been added/updated if necessary (not needed; the new enum value is documented directly in the model’s description field)
  • make test has been run locally
  • tests have been added/updated (if applicable) (nothing to test; the new value does not change any of the expected behaviours)
  • CHANGELOG.md has been updated.

If you are proposing a change to the SSSOM metadata model, you must

  • provide a full, working and valid example in examples/
  • provide a link to the related GitHub issue in the see_also field of the linkml model
  • provide a link to a valid example in the see_also field of the linkml model
  • run SSSOM-Py test suite against the updated model

This PR adds a new value composed entity to the EntityType enumeration to indicate, as proposed in #402, that an ID is intended to represent a composed (aka “complex”, aka “post-coordinated”) entity that involves several individual entities.

@gouttegd gouttegd self-assigned this Dec 2, 2024
@gouttegd
Contributor Author

gouttegd commented Dec 2, 2024

@matentzn Two things that might be worth discussing:

(A) I have not specified a meaning for the new value. Mostly because it’s unclear to me what the… meaning of the meaning slot actually is (LinkML’s documentation is not really helpful here, I believe), and in particular whether it is supposed to be normative or merely informative.

If a meaning is absolutely required, a possible candidate could be owl:ClassExpression, but I don’t like the fact that it seems to tie the concept of composed entity to OWL.

(B) I have explicitly forbidden the use of composed entity in the predicate_type slot. That is because, while it could theoretically be possible to use “post-coordinated predicates”, I don’t think this would be a good idea and I’d rather cut it short immediately. But I am open to counter-arguments on this.
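
As a rough illustration of point (B), the restriction could be checked along the lines of the Python sketch below. The function name `check_entity_types` and the dict-based mapping are assumptions made for this example and are not part of the SSSOM model or of SSSOM-Py; the value name is the one proposed at this point in the PR.

```python
# Hypothetical sketch: the names below are illustrative, not SSSOM-Py API.
COMPOSED = "composed entity"  # value name as proposed at this point in the PR


def check_entity_types(mapping: dict) -> list[str]:
    """Return a list of problems found in the *_type slots of one mapping."""
    problems = []
    # The new value is accepted for the subject and the object,
    # but explicitly forbidden in the predicate_type slot.
    if mapping.get("predicate_type") == COMPOSED:
        problems.append(f"predicate_type must not be '{COMPOSED}'")
    return problems


ok = {"subject_id": "EX:1", "subject_type": COMPOSED, "predicate_id": "skos:exactMatch"}
bad = {"subject_id": "EX:1", "predicate_id": "EX:p", "predicate_type": COMPOSED}
print(check_entity_types(ok))
print(check_entity_types(bad))
```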

@gouttegd gouttegd requested a review from matentzn December 2, 2024 20:11
Add a new value 'composed entity' to the EntityType enumeration to
indicate that an ID is intended to represent a composed (aka "complex",
aka "post-coordinated") entity that involves several individual
entities.

closes #402
@gouttegd gouttegd force-pushed the add-composed-entity-type branch from c32456e to f4358b5 Compare December 2, 2024 20:15
matentzn
matentzn previously approved these changes Dec 3, 2024
Collaborator

@matentzn matentzn left a comment

I absolutely love it! I will get maybe two more reviews on this.

It is believed that 'composed entity expression' better conveys the idea
that the entity being referred to is a composite entity, compared to
'composed entity' alone.
Collaborator

@matentzn matentzn left a comment

Let's go with this - I think we are the main stakeholders here anyway, and we have had this PR open for one work week. I will merge on Tuesday and create a release.

@gouttegd
Contributor Author

gouttegd commented Dec 8, 2024

If a meaning is absolutely required, a possible candidate could be owl:ClassExpression, but I don’t like the fact that it seems to tie the concept of composed entity to OWL.

I just realised that the meaning slot in enum values is used by SSSOM-Py when serialising to RDF. For example, a mapping with a subject_type set to owl class will be serialised in RDF as:

[ a owl:Axiom ;
    owl:annotatedProperty THE:PREDICATE ;
    owl:annotatedSource THE:SUBJECT ;
    owl:annotatedTarget THE:OBJECT ;
    sssom:subject_type owl:Class ]

With no meaning set for composed entity expression, that particular value would be serialised as a literal string instead:

[ a owl:Axiom ;
    owl:annotatedProperty THE:PREDICATE ;
    owl:annotatedSource THE:SUBJECT ;
    owl:annotatedTarget THE:OBJECT ;
    sssom:subject_type "composed entity expression" ]
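
The difference between the two outputs above can be sketched in a few lines of Python. This is only an illustration of the behaviour described in this comment, not SSSOM-Py's actual code: the `MEANINGS` table and the `render_type_object` function are hypothetical, and only the `owl class` entry is taken from the example above.

```python
# Partial, illustrative table: only the "owl class" entry comes from the comment above.
MEANINGS = {"owl class": "owl:Class"}


def render_type_object(value: str) -> str:
    """Render an EntityType value as the object of a Turtle triple."""
    meaning = MEANINGS.get(value)
    if meaning is not None:
        # A value with a meaning is emitted as a CURIE, i.e. an RDF resource.
        return meaning
    # A value without a meaning falls back to a plain string literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


print("sssom:subject_type", render_type_object("owl class"))
print("sssom:subject_type", render_type_object("composed entity expression"))
```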

@sierra-moxon
Contributor

sierra-moxon commented Dec 9, 2024

Do you think it would be worthwhile to constrain subject_id so that the composition has a standard format?
SCHEMA:0001/(disease:'MONDO:0005148',phenotype:'HP:0009124') - or if not a constraint on the range (I am thinking of a regex pattern constraint), then another example on the subject_id slot so that folks know how to compose?
I am also wondering if the object can be composed as well? Could an example that elucidates this be added as well?

@gouttegd
Contributor Author

Do you think it would be worthwhile to constrain subject_id so that the composition has a standard format?

The idea is rather to avoid any assumption or constraint about the exact format of composed entities. The URI Expression proposal is just that: a proposal. People may be happy with that proposal or they may prefer to use another way to encode composed entities.

We do not enforce any constraint on the form of IDs for any other type of entities -- IDs are just opaque strings; most of the time they will be typical URIs but we do not force them to be, they may very well be something else entirely --, so there should be no reason to enforce a particular form of ID for composed entities.

If we want to go that route and say that composed entities in SSSOM must be represented with URI Expressions and nothing else, why not, but then the name of the enum value should be changed to something that explicitly refers to URI Expressions, like uri expression. composed entity expression would not be suitable because it is too generic (we chose that name precisely because it was generic and not referring to any format).

I am also wondering if the object can be composed as well?

Yes. This is implicit in the fact that both the subject and the object can have an EntityType value; there is both a subject_type slot and an object_type slot, and there is nothing explicitly forbidding the use of any particular EntityType value in either of those slots.
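
To make both answers concrete, here is a small Python sketch in which the ID stays an opaque string and only the `*_type` slots say which side is composed. The helper `composed_sides` and the dict layout are assumptions for illustration; the subject ID is the example from the question above, and `EX:0001` is a made-up object ID.

```python
COMPOSED = "composed entity expression"


def composed_sides(mapping: dict) -> list[str]:
    """List which sides of a mapping are flagged as composed entities."""
    # The IDs themselves are never inspected: they are treated as opaque strings.
    return [
        side
        for side in ("subject", "object")
        if mapping.get(f"{side}_type") == COMPOSED
    ]


mapping = {
    "subject_id": "SCHEMA:0001/(disease:'MONDO:0005148',phenotype:'HP:0009124')",
    "subject_type": COMPOSED,
    "predicate_id": "skos:exactMatch",
    "object_id": "EX:0001",  # hypothetical ID, for illustration only
    "object_type": "owl class",
}
print(composed_sides(mapping))
```

Setting `object_type` to the same value would flag the object side as well, since neither slot restricts which EntityType values it accepts.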

@matentzn
Collaborator

matentzn commented Dec 10, 2024

For now, I think we should not mandate a specific approach in SSSOM, because I don't want to close any doors to people that, for example, want to use crazy things like:

  1. URL encoded OWL class expressions in RDF/XML format
  2. URL encoded OTTR instantiations

Etc., if they must. I plan, however, to explore how useful the URI expression language is to cover a variety of cases, and if it holds up, and I see more than two distinct organisations implementing it, I might make a bid to lift it into the standard as a recommendation, with optional support by tools (sssom-java already supports it despite absolutely not having to do so).

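I will probably implement some rudimentary support in SSSOM-Py which will work like this:

# Sketch only: parse_json_url and curies.identifier are placeholders, not existing functions.
if m.subject_type == "composed entity expression":
    try:
        parse_json_url(curies.identifier(m.subject_id))
    except Exception:
        ...  # the ID is not a parseable expression; keep treating it as opaque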
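
A self-contained variant of the idea above, written as a hedged Python sketch: the parser is passed in as a parameter so that no particular expression format is assumed. `parse_if_composed` and `toy_parser` are hypothetical names; `toy_parser` only splits the example ID from earlier in this thread and is not an implementation of the URI Expression proposal.

```python
from typing import Any, Callable, Optional

COMPOSED = "composed entity expression"


def parse_if_composed(mapping: dict, parser: Callable[[str], Any]) -> Optional[Any]:
    """Try to parse the subject ID, but only when it is flagged as composed."""
    if mapping.get("subject_type") != COMPOSED:
        return None
    try:
        return parser(mapping["subject_id"])
    except (KeyError, ValueError):
        # Support is optional: an unparseable ID simply stays an opaque string.
        return None


def toy_parser(identifier: str) -> dict:
    """Stand-in parser: split 'BASE/(SLOTS)' into its two parts."""
    base, sep, rest = identifier.partition("/(")
    if not sep or not rest.endswith(")"):
        raise ValueError("not in the toy format")
    return {"schema": base, "slots": rest[:-1]}


m = {
    "subject_id": "SCHEMA:0001/(disease:'MONDO:0005148',phenotype:'HP:0009124')",
    "subject_type": COMPOSED,
}
print(parse_if_composed(m, toy_parser))
```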

@matentzn matentzn merged commit 7d7bdfe into master Dec 11, 2024
3 checks passed
@matentzn matentzn deleted the add-composed-entity-type branch December 11, 2024 13:47
Development

Successfully merging this pull request may close these issues.

New EntityType value to identify composed entities
4 participants