Update synonym type docs
matentzn committed Apr 25, 2024
1 parent 8172152 commit 68936b0
Showing 7 changed files with 106 additions and 96 deletions.
2 changes: 1 addition & 1 deletion docs/howto/edit-in-protege.md
@@ -58,7 +58,7 @@ To add a dbxref to the definition:
To add a synonym:

1. Select the + button to add an annotation to the selected entity
1. Add the synonyms as 'has_exact_synonym' (note: use [appropriate synonym annotation](../reference/synonyms-obo.md))
1. Add the synonyms as 'has_exact_synonym' (note: use [appropriate synonym annotation](../reference/synonyms-properties.md))
1. Each synonym should have a reference
1. Click the @ symbol next to the synonym
1. Click the + button
25 changes: 2 additions & 23 deletions docs/lesson/synonyms.md
@@ -213,7 +213,7 @@ Synonyms are hugely important for many use cases in the information and data dom
**Duration**: 25 min

- [OBO modeling and synonyms](#model)
- [Overview of synonym properties](../reference/synonyms-obo.md)
- [Overview of synonym properties](../reference/synonyms-properties.md)
- [Overview of synonym types](../reference/synonyms-types.md)

<a id="model"></a>
@@ -227,30 +227,9 @@ There are two hierarchies of synonym types of particular relevance to synonym re
- [Synonym Properties](https://ontobee.org/ontology/OMO?iri=http://purl.obolibrary.org/obo/IAO_0000118)
- [Synonym Types](https://ontobee.org/ontology/OMO?iri=http://www.geneontology.org/formats/oboInOwl%23SynonymTypeProperty)

!!! warning

While _synonym properties_ MUST be included in OMO to recognise a valid synonym, _synonym types_ are often defined by the ontologies themselves.
This can be pretty confusing for tool developers. For example, at the time of this writing, [a number of OBO ontologies](https://github.com/OBOFoundry/OBOFoundry.github.io/issues/2450) define their own properties for "layperson" or "plural form".

Apart from the two core hierarchies, the situation for provenance related properties on synonyms is a bit chaotic (not standardised) at the time of writing this reference (21.04.2024).

Ontologies developed using the [GO-family ontology development pattern](../pathways/ontology-curator-go-style.md) use the oboInOwl:hasDbXref property to represent "provenance" in general.
This could be anything:

1. An ORCiD, meaning "this person asserted/verified this synonym"
2. An ontology term, meaning "the synonym was sourced from this ontology concept"

Some ontologies have started using dc:contributor and rdfs:seeAlso for more fine-grained provenance, but this pattern is not widely adopted.

!!! info

We warmly recommend being generous with provenance when curating synonyms.
At the very least, we recommend capturing the ORCiD of the curator who captured the synonym, or the Research Organization Registry (ROR) identifier of the organisation that promotes the term.
Ideally, however, you should also capture the source of the synonym, which could be a PubMed ID (PMID), a term from an ontology or a Digital Object Identifier (DOI).

For an overview of the two synonym properties and types, we will refer to the specialised documentation pages here:

- [Overview of synonym properties](../reference/synonyms-obo.md)
- [Overview of synonym properties](../reference/synonyms-properties.md)
- [Overview of synonym types](../reference/synonyms-types.md)

<a id="validation"></a>
2 changes: 1 addition & 1 deletion docs/reference/go-style-annotation-property-practice.md
@@ -7,7 +7,7 @@ Note that while most of the practices documented here apply to all OBO ontologie
| Label | rdfs:label | Y | Max 1 \* | Full name of the term, must be unique. | Free text | None | \* some ontologies have multiple labels for different languages, in which case, there should be at most one label per language |
| Definition | IAO:0000115 | Y | Max 1 | A textual definition of the term. In most ontologies, must be unique. | Free text | database_cross_reference: reference materials used and contributors (in ORCID ID link format) | See [this document](https://douroucouli.wordpress.com/2019/07/08/ontotip-write-simple-concise-clear-operational-textual-definitions/) for a guide on writing definitions |
| Contributor | dcterms:contributor | N (though highly recommended) | No limit | The ORCID ID of people who contributed to the creation of the term. | ORCID ID (using full link) | None | |
| Synonyms | http://www.geneontology.org/formats/oboInOwl#hasExactSynonym, http://www.geneontology.org/formats/oboInOwl#hasBroadSynonym, http://www.geneontology.org/formats/oboInOwl#hasNarrowSynonym, http://www.geneontology.org/formats/oboInOwl#hasRelatedSynonym | N | No limit | Synonyms of the term. | Free text | database_cross_reference: reference material in which the synonym is used | See [synonyms documentation](../reference/synonyms-obo.md) for a guide on using synonyms |
| Synonyms | http://www.geneontology.org/formats/oboInOwl#hasExactSynonym, http://www.geneontology.org/formats/oboInOwl#hasBroadSynonym, http://www.geneontology.org/formats/oboInOwl#hasNarrowSynonym, http://www.geneontology.org/formats/oboInOwl#hasRelatedSynonym | N | No limit | Synonyms of the term. | Free text | database_cross_reference: reference material in which the synonym is used | See [synonyms documentation](../reference/synonyms-properties.md) for a guide on using synonyms |
| Comments | rdfs:comment | N | Max 1 | Comments about the term, extended descriptions that might be useful, notes on modelling choices, other misc notes. | Free text | database_cross_reference: reference material relating to the comment | See [documentation on comments](../explanation/term-comments.md) for more information about comments |
| Editor note | IAO:0000116 | N | Max 1 | A note that is not relevant to front users, but might be to editors | Free text | database_cross_reference: reference material relating to the note | |
| Subset | [http://www.geneontology.org/formats/oboInOwl#inSubset](http://www.geneontology.org/formats/oboInOwl#inSubset) | N | No limit | A tag that marks a term as being part of a subset | annotation property that is a subproperty of subset_property (see [guide](../howto/add-new-slim.md) on how to select this) | None | See [Slim documentation](../howto/add-new-slim.md) for more information on subsets |
69 changes: 17 additions & 52 deletions docs/reference/synonym-validation.md
@@ -1,5 +1,11 @@
## Synonym validation

### Related materials

- [Overview of synonym properties](../reference/synonyms-properties.md)
- [Synonym types](../reference/synonyms-types.md)
- [Lesson on synonyms](../lesson/synonyms.md)

### Basic validation

#### The same synonym cannot be an exact synonym of two distinct concepts
@@ -11,10 +17,11 @@

Some exact synonyms are not globally unique. For example, the acronym "ASD" is an exact synonym of the concept representing "Atrial septal defect" and "Autism Spectrum Disorder".
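
A check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `EX:` identifiers and the `EXACT_SYNONYMS` list are hypothetical toy data, and comparing synonyms case-insensitively is a choice made for this sketch rather than something prescribed here.

```python
from collections import defaultdict

# Hypothetical toy data: (concept ID, exact synonym) pairs.
EXACT_SYNONYMS = [
    ("EX:0000001", "ASD"),                       # atrial septal defect
    ("EX:0000002", "ASD"),                       # autism spectrum disorder
    ("EX:0000001", "atrial septal defect"),
    ("EX:0000002", "autism spectrum disorder"),
]


def shared_exact_synonyms(pairs):
    """Return exact synonyms that are attached to more than one concept."""
    concepts_by_synonym = defaultdict(set)
    for concept, synonym in pairs:
        # Assumption of this sketch: synonyms are compared case-insensitively.
        concepts_by_synonym[synonym.casefold()].add(concept)
    return {
        synonym: sorted(concepts)
        for synonym, concepts in concepts_by_synonym.items()
        if len(concepts) > 1
    }


print(shared_exact_synonyms(EXACT_SYNONYMS))
```

Every synonym reported this way is a candidate for review, not necessarily an error: as the "ASD" case shows, acronyms can legitimately be shared.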

#### The same synonym cannot be duplicated with a different scope
#### The same synonym cannot be duplicated with a different scope

- An entity has duplicate synonyms with different properties (e.g. the same broad and related synonym). This causes ambiguity. An example screenshot from Mondo is pasted below.
- Implemented in the form of [Duplicate Scoped Synonyms](https://robot.obolibrary.org/report_queries/duplicate_scoped_synonym) check in the ROBOT report.
- An entity has duplicate synonyms (the same exact string value, like "Depression") with different properties (e.g. broad and related). This causes ambiguity and considerable confusion among downstream users.
- This test is implemented in the form of [Duplicate Scoped Synonyms](https://robot.obolibrary.org/report_queries/duplicate_scoped_synonym) check in the ROBOT report.
- Unfortunately, quasi scope-duplicated synonyms cannot always be recognised easily. In the example below, you can see 3 synonyms that _almost_ seem like they are the same, but have different scopes. Better tools are needed to recognise such cases.

<img width="248" alt="image" src="https://github.com/OBOAcademy/obook/assets/6722114/19aa41ec-167c-4db7-8741-decf370fcb5b">
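
The exact-string case of this check can be sketched in Python as follows. This is a minimal illustration, not the ROBOT implementation: the `ANNOTATIONS` tuples and `EX:` identifiers are hypothetical, and only character-for-character identical strings are compared, so quasi-duplicates would not be caught.

```python
from collections import defaultdict

# Hypothetical toy data: (entity ID, synonym scope, synonym text).
ANNOTATIONS = [
    ("EX:0000010", "broad", "Depression"),
    ("EX:0000010", "related", "Depression"),
    ("EX:0000010", "exact", "major depressive disorder"),
    ("EX:0000011", "exact", "Depression"),
]


def duplicate_scoped_synonyms(annotations):
    """Find synonyms that occur on the same entity with more than one scope."""
    scopes_by_key = defaultdict(set)
    for entity, scope, text in annotations:
        # Key on entity AND text: the same string on another entity is a
        # different problem (shared synonyms), not a scope duplicate.
        scopes_by_key[(entity, text)].add(scope)
    return {
        key: sorted(scopes)
        for key, scopes in scopes_by_key.items()
        if len(scopes) > 1
    }


print(duplicate_scoped_synonyms(ANNOTATIONS))
```

Catching near-duplicates would require normalising the text before grouping (for example lower-casing or stripping punctuation), and which normalisation is appropriate depends on the ontology.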

@@ -26,21 +33,21 @@
- For historical reasons, many ontologies avoid attaching synonym metadata and provenance to the primary label of a class. For example, the Mondo ontology captures the preferred labels of various major nomenclature organisations. Instead of capturing which organisations prefer the label on the primary label, they are captured as "exact synonyms", even though the two often coincide.
- It is often considered more convenient to be able to expect _all_ exact synonyms to be available via `oboInOwl:hasExactSynonym`, without requiring downstream users to _know_ that exact synonyms are scattered across multiple properties (such as `rdfs:label`).
- No matter whether you agree or disagree with the above, as an ontology _user_ you should not assume

#### Synonym types must be a child of Synonym Type Property

- A synonym type is used in an annotation, but is not properly declared as a child of oboInOwl:SynonymTypeProperty. This can cause problems with conversions to OBO format.
- For example, if you add your own synonym type, like abbreviation, it has to be a child of oboInOwl:SynonymTypeProperty
- For example, if you add your own synonym type, like `hp:abbreviation`, it has to be a child of oboInOwl:SynonymTypeProperty to be correctly interpreted by ROBOT and by OWL API-based tooling in general.
- Implemented in the form of [Missing Synonym Type Declaration](https://robot.obolibrary.org/report_queries/missing_synonymtype_declaration) in the ROBOT report.
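
The idea behind this check can be illustrated with a short Python sketch. It assumes the ontology has already been reduced to two hypothetical structures: a set of synonym types declared as children of oboInOwl:SynonymTypeProperty (`DECLARED_TYPES`) and a list of synonym annotations that carry a type (`TYPED_SYNONYMS`). The `ex:` and `EX:` identifiers are made up for illustration, and this is not how ROBOT implements the check.

```python
# Hypothetical: synonym types declared as children of
# oboInOwl:SynonymTypeProperty in the ontology.
DECLARED_TYPES = {"ex:layperson"}

# Hypothetical: (entity ID, synonym text, synonym type used on the annotation).
TYPED_SYNONYMS = [
    ("EX:0000020", "heart attack", "ex:layperson"),
    ("EX:0000020", "MI", "hp:abbreviation"),
    ("EX:0000021", "AF", "hp:abbreviation"),
]


def undeclared_synonym_types(declared, typed_synonyms):
    """Return the synonym types that are used but never declared."""
    used = {synonym_type for _, _, synonym_type in typed_synonyms}
    return sorted(used - set(declared))


print(undeclared_synonym_types(DECLARED_TYPES, TYPED_SYNONYMS))
```

Each type returned by the function needs a declaration in the ontology before conversions to OBO format can be expected to work reliably.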

### Advanced validation

##### Duplicate exact synonym check that excludes abbreviations
- In Mondo, this SPARQL query checks for duplicate exact synonyms between terms but excludes any abbreviations.
- For example, "SMS" is an abbreviation for MONDO:0008491 stiff-person syndrome and MONDO:0008434 Smith-Magenis syndrome and this is acceptable.
- Implemented as [qc-duplicate-exact-synonym-no-abbrev.sparql](https://mondo.readthedocs.io/en/latest/editors-guide/quality-control-tests/#qc-duplicate-exact-synonym-no-abbrevsparql) in Mondo.
- Implemented as [qc-duplicate-exact-synonym-no-abbrev.sparql](https://mondo.readthedocs.io/en/latest/editors-guide/quality-control-tests/#qc-duplicate-exact-synonym-no-abbrevsparql) in Mondo (see below).

??? Query
??? Query qc-duplicate-exact-synonym-no-abbrev.sparql

```
PREFIX obo: <http://purl.obolibrary.org/obo/>
@@ -87,58 +94,16 @@
ORDER BY DESC(UCASE(str(?value)))
```
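
For readers less familiar with SPARQL, the logic of this check can be approximated in Python. The sketch below is not a translation of the Mondo query: it assumes each exact synonym has already been flagged as an abbreviation or not (the `is_abbreviation` field is hypothetical, as are the `EX:` identifiers), and it reuses the "SMS" example from above.

```python
from collections import defaultdict

# Toy data: (term ID, exact synonym, is_abbreviation).
SYNONYMS = [
    ("MONDO:0008491", "SMS", True),               # stiff-person syndrome
    ("MONDO:0008434", "SMS", True),               # Smith-Magenis syndrome
    ("EX:0000030", "example disease", False),     # hypothetical term
    ("EX:0000031", "example disease", False),     # hypothetical term
]


def duplicate_exact_synonyms_no_abbrev(rows):
    """Find exact synonyms shared between terms, ignoring abbreviations."""
    terms_by_synonym = defaultdict(set)
    for term, synonym, is_abbreviation in rows:
        if is_abbreviation:
            continue  # shared abbreviations such as "SMS" are acceptable
        terms_by_synonym[synonym].add(term)
    return {
        synonym: sorted(terms)
        for synonym, terms in terms_by_synonym.items()
        if len(terms) > 1
    }


print(duplicate_exact_synonyms_no_abbrev(SYNONYMS))
```

The shared abbreviation "SMS" is skipped, while the duplicated non-abbreviation synonym is reported.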

##### Duplicate OMIM synonyms as exact and related in Mondo
- In Mondo, this SPARQL query checks for duplicate synonyms between OMIM terms that are both exact and related.
- This use case is very specific to Mondo, as OMIM synonyms were initially brought in as related synonyms.
- Implemented as [qc-related-exact-synonym-omim.sparql](https://mondo.readthedocs.io/en/latest/editors-guide/quality-control-tests/#qc-related-exact-synonym-omimsparql) in Mondo.

??? Query

```
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?entity ?property ?value WHERE {
{
?entity oboInOwl:hasRelatedSynonym ?related ;
rdfs:label ?entity_label ;
oboInOwl:hasExactSynonym ?exact ;
a owl:Class .
[
a owl:Axiom ;
owl:annotatedSource ?entity ;
owl:annotatedProperty oboInOwl:hasExactSynonym ;
owl:annotatedTarget ?exact ;
oboInOwl:hasDbXref ?xref1
] .
[
a owl:Axiom ;
owl:annotatedSource ?entity ;
owl:annotatedProperty oboInOwl:hasRelatedSynonym ;
owl:annotatedTarget ?related ;
oboInOwl:hasDbXref ?xref2
] .

FILTER (str(?related)=str(?exact))
FILTER (regex(str(?xref1), "^OMIM"))
FILTER (regex(str(?xref2), "^OMIM"))
FILTER (isIRI(?entity) && STRSTARTS(str(?entity), "http://purl.obolibrary.org/obo/MONDO_"))
BIND(oboInOwl:hasExactSynonym as ?property)
}
}

ORDER BY ?entity
```
##### Exact Synonyms/Non-exact Mappings
- In Mondo, this SPARQL query checks for an exact synonym and a database cross-reference (dbxref) that is not exact. If the dbxref is equivalent to the Mondo term, the synonyms from that term should be added as exact synonyms.

- In Mondo, this SPARQL query checks for an exact synonym and a database cross-reference (dbxref) that is not exact. If the dbxref is equivalent to the Mondo term, the synonyms from that term should be added as exact synonyms.
- This use case is very specific to Mondo, as dbxrefs in Mondo have equivalence mappings (in the form of MONDO:equivalentTo). The issue here was that, in a merger, DOID:5603 was added as an equivalent dbxref, but the synonyms 'T-cell acute lymphoblastic leukemia' and 'precursor T lymphoblastic leukemia' were related synonyms. They were changed to exact and the QC check passed.
- Implemented as [qc-exact-synonyms-non-exact-mappings.sparql](https://mondo.readthedocs.io/en/latest/editors-guide/quality-control-tests/#qc-exact-synonyms-non-exact-mappingssparql) in Mondo.
- See the [Pull Request here](https://github.com/monarch-initiative/mondo/pull/7472) where the QC check failed.

<img width="1039" alt="image" src="https://github.com/OBOAcademy/obook/assets/6722114/815b12ba-30b8-4c99-9e96-72781ab4ef40">

??? Query
??? Query qc-exact-synonyms-non-exact-mappings.sparql

```
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>