Add patterns
ckindermann committed Apr 17, 2024
1 parent 0b850de commit 535d2cf
Showing 17 changed files with 1,051 additions and 15 deletions.
28 changes: 28 additions & 0 deletions docs/odk-workflows/ManageAutomatedTest.md
@@ -0,0 +1,28 @@
## Constraint violation checks

We can define custom checks using [SPARQL](https://www.w3.org/TR/rdf-sparql-query/). SPARQL queries describe bad modelling patterns (missing labels, misspelt URIs, and more) in the ontology; if a query returns any results, the build fails. Custom checks are designed to run as part of GitHub Actions continuous integration testing, but they can also be run locally.

### Steps to add a constraint violation check:

1. Add the SPARQL query to `src/sparql`. The file name must end with `-violation.sparql`. Choose a name that makes clear which violation the query checks.
2. Add the name of the new file to the ODK configuration file `src/ontology/uberon-odk.yaml`:
    1. Add the file name (without the `-violation.sparql` suffix) to the list under the `custom_sparql_checks` key, which sits inside the `robot_report` key.
    2. If the `robot_report` or `custom_sparql_checks` keys are not present, add the following code block to the end of the file.

``` yaml
robot_report:
release_reports: False
fail_on: ERROR
use_labels: False
custom_profile: True
report_on:
- edit
custom_sparql_checks:
- name-of-the-file-check
```
3. Update the repository so your new SPARQL check will be included in the QC.
```shell
sh run.sh make update_repo
```
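As an illustration, a minimal violation query might flag classes that lack an `rdfs:label`. The file and check names below are hypothetical examples, not part of this repository; any row returned by such a query fails the build:

``` sparql
# Hypothetical example: src/sparql/missing-label-violation.sparql
# (registered as "missing-label" under custom_sparql_checks)
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT ?entity WHERE {
  ?entity a owl:Class .
  # A named class with no label is reported as a violation
  FILTER NOT EXISTS { ?entity rdfs:label ?label }
  FILTER (!isBlank(?entity))
}
```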

8 changes: 4 additions & 4 deletions docs/odk-workflows/RepoManagement.md
@@ -45,9 +45,9 @@ Note: our ODK file should only have one `import_group` which can contain multipl
1. Add an import statement to your `src/ontology/gcbo-edit.owl` file. We suggest doing this with a text editor: simply copy an existing import declaration and rename it for the new ontology import, for example as follows:
```
...
Ontology(<https://github.com/bmir-radx/gcbo/gcbo.owl>
Import(<https://github.com/bmir-radx/gcbo/gcbo/imports/ro_import.owl>)
Import(<https://github.com/bmir-radx/gcbo/gcbo/imports/go_import.owl>)
Ontology(<https://github.com/bmir-radx/gcbo//gcbo.owl>
Import(<https://github.com/bmir-radx/gcbo//gcbo/imports/ro_import.owl>)
Import(<https://github.com/bmir-radx/gcbo//gcbo/imports/go_import.owl>)
...
```
2. Add your imports redirect to your catalog file `src/ontology/catalog-v001.xml`, for example:
@@ -68,7 +68,7 @@ Import(<http://purl.obolibrary.org/obo/gcbo/imports/go_import.owl>)
in your editors file (the ontology) and

```
<uri name="https://github.com/bmir-radx/gcbo/gcbo/imports/go_import.owl" uri="imports/go_import.owl"/>
<uri name="https://github.com/bmir-radx/gcbo//gcbo/imports/go_import.owl" uri="imports/go_import.owl"/>
```

in your catalog, tools like `robot` or Protégé will recognize the statement
5 changes: 5 additions & 0 deletions docs/odk-workflows/RepositoryFileStructure.md
@@ -16,6 +16,11 @@ These are the current imports in GCBO

| Import | URL | Type |
| ------ | --- | ---- |
| hp | http://purl.obolibrary.org/obo/hp.owl | None |
| mondo | http://purl.obolibrary.org/obo/mondo.owl | None |
| nbo | http://purl.obolibrary.org/obo/nbo.owl | None |
| ncit | http://purl.obolibrary.org/obo/ncit.owl | None |
| pato | http://purl.obolibrary.org/obo/pato.owl | None |
| symp | http://purl.obolibrary.org/obo/symp.owl | None |

## Components
6 changes: 3 additions & 3 deletions docs/odk-workflows/components.md
@@ -17,20 +17,20 @@ components:
3) Add the component to your catalog file (src/ontology/catalog-v001.xml)

```
<uri name="https://github.com/bmir-radx/gcbo/gcbo/components/your-component-name.owl" uri="components/your-component-name.owl"/>
<uri name="https://github.com/bmir-radx/gcbo//gcbo/components/your-component-name.owl" uri="components/your-component-name.owl"/>
```

4) Add the component to the edit file (src/ontology/gcbo-edit.obo)
for .obo formats:

```
import: https://github.com/bmir-radx/gcbo/gcbo/components/your-component-name.owl
import: https://github.com/bmir-radx/gcbo//gcbo/components/your-component-name.owl
```

for .owl formats:

```
Import(<https://github.com/bmir-radx/gcbo/gcbo/components/your-component-name.owl>)
Import(<https://github.com/bmir-radx/gcbo//gcbo/components/your-component-name.owl>)
```

5) Refresh your repo by running `sh run.sh make update_repo` - this should create a new file in src/ontology/components.
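For reference, a component is declared in the ODK configuration file roughly as follows. This is a sketch based on general ODK conventions, not taken from this repository's config, and the component name is hypothetical:

``` yaml
components:
  products:
    - filename: your-component-name.owl
```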
5 changes: 5 additions & 0 deletions docs/templates/dosdp.md
@@ -0,0 +1,5 @@
# DOSDP documentation stub

Do not overwrite; the contents will be generated automatically.


168 changes: 160 additions & 8 deletions src/ontology/Makefile
@@ -10,7 +10,7 @@
# More information: https://github.com/INCATools/ontology-development-kit/

# Fingerprint of the configuration file when this Makefile was last generated
CONFIG_HASH= 804382052d77470f0f95278327d4c941b3eeb9a65d07e1292d9920ffcafe3aed
CONFIG_HASH= aab9cd01f6203c477ccf45b85b8d1d7e50985f2fac81dc46fbc65238ee145e65


# ----------------------------------------
@@ -19,9 +19,9 @@ CONFIG_HASH= 804382052d77470f0f95278327d4c941b3eeb9a65d07e1292d99
# these can be overwritten on the command line

OBOBASE= http://purl.obolibrary.org/obo
URIBASE= https://github.com/bmir-radx/gcbo
URIBASE= https://github.com/bmir-radx/gcbo/
ONT= gcbo
ONTBASE= https://github.com/bmir-radx/gcbo/gcbo
ONTBASE= https://github.com/bmir-radx/gcbo//gcbo
EDIT_FORMAT= owl
SRC = $(ONT)-edit.$(EDIT_FORMAT)
MAKE_FAST= $(MAKE) IMP=false PAT=false COMP=false MIR=false
@@ -53,10 +53,15 @@ OBODATE ?= $(shell date +'%d:%m:%Y %H:%M')
VERSION= $(TODAY)
ANNOTATE_ONTOLOGY_VERSION = annotate -V $(ONTBASE)/releases/$(VERSION)/$@ --annotation owl:versionInfo $(VERSION)
ANNOTATE_CONVERT_FILE = annotate --ontology-iri $(ONTBASE)/$@ $(ANNOTATE_ONTOLOGY_VERSION) convert -f ofn --output $@.tmp.owl && mv $@.tmp.owl $@
OTHER_SRC =
OTHER_SRC = $(PATTERNDIR)/definitions.owl
ONTOLOGYTERMS = $(TMPDIR)/ontologyterms.txt
EDIT_PREPROCESSED = $(TMPDIR)/$(ONT)-preprocess.owl
CONTEXT_FILE = config/context.json
PATTERNDIR= ../patterns
PATTERN_TESTER= dosdp validate -i
DOSDPT= dosdp-tools
PATTERN_RELEASE_FILES= $(PATTERNDIR)/definitions.owl $(PATTERNDIR)/pattern.owl


FORMATS = $(sort owl obo json owl)
FORMATS_INCL_TSV = $(sort $(FORMATS) tsv)
@@ -80,7 +85,7 @@ all: all_odk
all_odk: odkversion config_check test custom_reports all_assets

.PHONY: test
test: odkversion reason_test sparql_test robot_reports $(REPORTDIR)/validate_profile_owl2dl_$(ONT).owl.txt
test: odkversion dosdp_validation reason_test sparql_test robot_reports $(REPORTDIR)/validate_profile_owl2dl_$(ONT).owl.txt
echo "Finished running all tests successfully."

.PHONY: test
@@ -160,7 +165,7 @@ all_main: $(MAIN_FILES)
# ----------------------------------------


IMPORTS = symp
IMPORTS = hp mondo nbo ncit pato symp

IMPORT_ROOTS = $(patsubst %, $(IMPORTDIR)/%_import, $(IMPORTS))
IMPORT_OWL_FILES = $(foreach n,$(IMPORT_ROOTS), $(n).owl)
@@ -259,6 +264,7 @@ check_for_robot_updates:
ASSETS = \
$(IMPORT_FILES) \
$(MAIN_FILES) \
$(PATTERN_RELEASE_FILES) \
$(REPORT_FILES) \
$(SUBSET_FILES) \
$(MAPPING_FILES)
@@ -292,13 +298,15 @@ CLEANFILES=$(MAIN_FILES) $(SRCMERGED) $(EDIT_PREPROCESSED)
.PHONY: prepare_release
prepare_release: all_odk
rsync -R $(RELEASE_ASSETS) $(RELEASEDIR) &&\
mkdir -p $(RELEASEDIR)/patterns && cp -rf $(PATTERN_RELEASE_FILES) $(RELEASEDIR)/patterns &&\
rm -f $(CLEANFILES) &&\
echo "Release files are now in $(RELEASEDIR) - now you should commit, push and make a release \
on your git hosting site such as GitHub or GitLab"

.PHONY: prepare_initial_release
prepare_initial_release: all_assets
rsync -R $(RELEASE_ASSETS) $(RELEASEDIR) &&\
mkdir -p $(RELEASEDIR)/patterns && cp -rf $(PATTERN_RELEASE_FILES) $(RELEASEDIR)/patterns &&\
rm -f $(patsubst %, ./%, $(CLEANFILES)) &&\
cd $(RELEASEDIR) && git add $(RELEASE_ASSETS)

@@ -342,7 +350,7 @@ $(SIMPLESEED): $(SRCMERGED) $(ONTOLOGYTERMS)
echo "http://www.geneontology.org/formats/oboInOwl#SynonymTypeProperty" >> $@


ALLSEED = $(PRESEED) \
ALLSEED = $(PRESEED) $(TMPDIR)/all_pattern_terms.txt \


$(IMPORTSEED): $(ALLSEED) | $(TMPDIR)
@@ -404,6 +412,46 @@ IMP_LARGE=true # Global parameter to bypass handling of large imports
ifeq ($(strip $(MIR)),true)


## ONTOLOGY: hp
.PHONY: mirror-hp
.PRECIOUS: $(MIRRORDIR)/hp.owl
mirror-hp: | $(TMPDIR)
curl -L $(OBOBASE)/hp.owl --create-dirs -o $(TMPDIR)/hp-download.owl --retry 4 --max-time 200 && \
$(ROBOT) convert -i $(TMPDIR)/hp-download.owl -o $(TMPDIR)/$@.owl


## ONTOLOGY: mondo
.PHONY: mirror-mondo
.PRECIOUS: $(MIRRORDIR)/mondo.owl
mirror-mondo: | $(TMPDIR)
curl -L $(OBOBASE)/mondo.owl --create-dirs -o $(TMPDIR)/mondo-download.owl --retry 4 --max-time 200 && \
$(ROBOT) convert -i $(TMPDIR)/mondo-download.owl -o $(TMPDIR)/$@.owl


## ONTOLOGY: nbo
.PHONY: mirror-nbo
.PRECIOUS: $(MIRRORDIR)/nbo.owl
mirror-nbo: | $(TMPDIR)
curl -L $(OBOBASE)/nbo.owl --create-dirs -o $(TMPDIR)/nbo-download.owl --retry 4 --max-time 200 && \
$(ROBOT) convert -i $(TMPDIR)/nbo-download.owl -o $(TMPDIR)/$@.owl


## ONTOLOGY: ncit
.PHONY: mirror-ncit
.PRECIOUS: $(MIRRORDIR)/ncit.owl
mirror-ncit: | $(TMPDIR)
curl -L $(OBOBASE)/ncit.owl --create-dirs -o $(TMPDIR)/ncit-download.owl --retry 4 --max-time 200 && \
$(ROBOT) convert -i $(TMPDIR)/ncit-download.owl -o $(TMPDIR)/$@.owl


## ONTOLOGY: pato
.PHONY: mirror-pato
.PRECIOUS: $(MIRRORDIR)/pato.owl
mirror-pato: | $(TMPDIR)
curl -L $(OBOBASE)/pato.owl --create-dirs -o $(TMPDIR)/pato-download.owl --retry 4 --max-time 200 && \
$(ROBOT) convert -i $(TMPDIR)/pato-download.owl -o $(TMPDIR)/$@.owl


## ONTOLOGY: symp
.PHONY: mirror-symp
.PRECIOUS: $(MIRRORDIR)/symp.owl
@@ -456,6 +504,100 @@ custom_reports: $(EDIT_PREPROCESSED) | $(REPORTDIR)
ifneq ($(SPARQL_EXPORTS_ARGS),)
$(ROBOT) query -f tsv --use-graphs true -i $< $(SPARQL_EXPORTS_ARGS)
endif
# ----------------------------------------
# DOSDP Templates/Patterns
# ----------------------------------------

PAT=true # Global parameter to bypass pattern generation
ALL_PATTERN_FILES=$(wildcard $(PATTERNDIR)/dosdp-patterns/*.yaml)
ALL_PATTERN_NAMES=$(strip $(patsubst %.yaml,%, $(notdir $(wildcard $(PATTERNDIR)/dosdp-patterns/*.yaml))))

PATTERN_CLEAN_FILES=../patterns/all_pattern_terms.txt \
$(DOSDP_OWL_FILES_DEFAULT) $(DOSDP_TERM_FILES_DEFAULT) \


# Note to future generations: prepending ./ is a safety measure to ensure that
# the environment does not maliciously set `PATTERN_CLEAN_FILES` to `\`.
.PHONY: pattern_clean
pattern_clean:
rm -f $(PATTERN_CLEAN_FILES)

.PHONY: patterns
patterns dosdp:
echo "Validating all DOSDP templates"
$(MAKE) dosdp_validation
echo "Building $(PATTERNDIR)/definitions.owl"
$(MAKE) $(PATTERNDIR)/pattern.owl $(PATTERNDIR)/definitions.owl

# DOSDP Template Validation

$(TMPDIR)/pattern_schema_checks: $(ALL_PATTERN_FILES) | $(TMPDIR)
$(PATTERN_TESTER) $(PATTERNDIR)/dosdp-patterns/ && touch $@

.PHONY: pattern_schema_checks
pattern_schema_checks dosdp_validation: $(TMPDIR)/pattern_schema_checks

.PHONY: update_patterns
update_patterns: download_patterns
if [ -n "$$(find $(TMPDIR) -type f -path '$(TMPDIR)/dosdp/*.yaml')" ]; then cp -r $(TMPDIR)/dosdp/*.yaml $(PATTERNDIR)/dosdp-patterns; fi

# This command is a workaround for the absence of -N and -i in the wget of alpine (the one ODK depends on now).
# It downloads all patterns specified in external.txt
.PHONY: download_patterns
download_patterns:
if [ $(PAT) = true ]; then rm -f $(TMPDIR)/dosdp/*.yaml.1 || true; fi
if [ $(PAT) = true ] && [ -s $(PATTERNDIR)/dosdp-patterns/external.txt ]; then wget -i $(PATTERNDIR)/dosdp-patterns/external.txt --backups=1 -P $(TMPDIR)/dosdp; fi
if [ $(PAT) = true ]; then rm -f $(TMPDIR)/dosdp/*.yaml.1 || true; fi

$(PATTERNDIR)/dosdp-patterns/%.yml: download_patterns
if [ $(PAT) = true ] ; then if cmp -s $(TMPDIR)/dosdp-$*.yml $@ ; then echo "DOSDP templates identical."; else echo "DOSDP templates different, updating." &&\
cp $(TMPDIR)/dosdp-$*.yml $@; fi; fi


# DOSDP Template: Pipelines
# Each pipeline gets its own directory structure

# DOSDP default pipeline

DOSDP_TSV_FILES_DEFAULT = $(wildcard $(PATTERNDIR)/data/default/*.tsv)
DOSDP_OWL_FILES_DEFAULT = $(patsubst %.tsv, $(PATTERNDIR)/data/default/%.ofn, $(notdir $(wildcard $(PATTERNDIR)/data/default/*.tsv)))
DOSDP_TERM_FILES_DEFAULT = $(patsubst %.tsv, $(PATTERNDIR)/data/default/%.txt, $(notdir $(wildcard $(PATTERNDIR)/data/default/*.tsv)))
DOSDP_PATTERN_NAMES_DEFAULT = $(strip $(patsubst %.tsv,%, $(notdir $(wildcard $(PATTERNDIR)/data/default/*.tsv))))

$(DOSDP_OWL_FILES_DEFAULT): $(EDIT_PREPROCESSED) $(DOSDP_TSV_FILES_DEFAULT) $(ALL_PATTERN_FILES)
if [ $(PAT) = true ] && [ "${DOSDP_PATTERN_NAMES_DEFAULT}" ]; then $(DOSDPT) generate --catalog=$(CATALOG) \
--infile=$(PATTERNDIR)/data/default/ --template=$(PATTERNDIR)/dosdp-patterns --batch-patterns="$(DOSDP_PATTERN_NAMES_DEFAULT)" \
--ontology=$< --obo-prefixes=true --prefixes=config/prefixes.yaml --outfile=$(PATTERNDIR)/data/default; fi


# Generate template file seeds

## Generate template file seeds
$(PATTERNDIR)/data/default/%.txt: $(PATTERNDIR)/dosdp-patterns/%.yaml $(PATTERNDIR)/data/default/%.tsv
if [ $(PAT) = true ]; then $(DOSDPT) terms --infile=$(word 2, $^) --template=$< --obo-prefixes=true --outfile=$@; fi


# Generating the seed file from all the TSVs. If Pattern generation is deactivated, we still extract a seed from definitions.owl
$(TMPDIR)/all_pattern_terms.txt: $(DOSDP_TERM_FILES_DEFAULT) $(TMPDIR)/pattern_owl_seed.txt
if [ $(PAT) = true ]; then cat $^ | sort | uniq > $@; else $(ROBOT) query --use-graphs true -f csv -i $(PATTERNDIR)/definitions.owl \
--query ../sparql/terms.sparql $@; fi

$(TMPDIR)/pattern_owl_seed.txt: $(PATTERNDIR)/pattern.owl
if [ $(PAT) = true ]; then $(ROBOT) query --use-graphs true -f csv -i $< --query ../sparql/terms.sparql $@; fi

# Pattern pipeline main targets: the generated OWL files

# Create pattern.owl, an ontology of all DOSDP patterns
$(PATTERNDIR)/pattern.owl: $(ALL_PATTERN_FILES)
if [ $(PAT) = true ]; then $(DOSDPT) prototype --obo-prefixes true --template=$(PATTERNDIR)/dosdp-patterns --outfile=$@; fi

# Generating the individual pattern modules and merging them into definitions.owl
$(PATTERNDIR)/definitions.owl: $(DOSDP_OWL_FILES_DEFAULT)
	if [ $(PAT) = true ] && [ "${DOSDP_PATTERN_NAMES_DEFAULT}" ]; then $(ROBOT) merge $(addprefix -i , $^) \
annotate --ontology-iri $(ONTBASE)/patterns/definitions.owl --version-iri $(ONTBASE)/releases/$(TODAY)/patterns/definitions.owl \
--annotation owl:versionInfo $(VERSION) -o definitions.ofn && mv definitions.ofn $@; fi



# ----------------------------------------
# Release artefacts: export formats
@@ -561,7 +703,7 @@ public_release:
# General Validation
# ----------------------------------------
TSV=
ALL_TSV_FILES=
ALL_TSV_FILES=$(DOSDP_TSV_FILES_DEFAULT)

validate-tsv: $(TSV) | $(TMPDIR)
for FILE in $< ; do \
@@ -600,6 +742,7 @@ update_docs:
# the *DIR variables.
.PHONY: clean
clean:
$(MAKE) pattern_clean
for dir in $(MIRRORDIR) $(TMPDIR) $(UPDATEREPODIR) ; do \
reldir=$$(realpath --relative-to=$$(pwd) $$dir) ; \
case $$reldir in .*|"") ;; *) rm -rf $$reldir/* ;; esac \
@@ -638,6 +781,15 @@ Imports management:
* no-mirror-refresh-%: Refresh a single import without updating the mirror, i.e. refresh-go will refresh 'imports/go_import.owl'.
* mirror-%: Refresh a single mirror.

DOSDP templates:
* dosdp: Run the DOSDP patterns pipeline: run tests, then build OWL files from the tables.
* patterns: Alias of the 'dosdp' command
* pattern_clean: Delete all temporary pattern files
* dosdp_validation: Run all validation checks on DOSDP template files and tables
* pattern_schema_checks: Alias of the 'dosdp_validation' command
* update_patterns: Pull updated patterns listed in dosdp-patterns/external.txt
* dosdp-matches-%: Run the DOSDP matches/query pipeline as configured in your gcbo-odk.yaml file.

Editor utilities:
* validate_idranges: Make sure your ID ranges file is formatted correctly
* normalize_src: Load and save your gcbo-edit file to make sure it is serialised correctly
2 changes: 2 additions & 0 deletions src/ontology/config/prefixes.yaml
@@ -0,0 +1,2 @@
SCHEMA: "https://schema.org/"
BMIR: "https://github.com/bmir-radx/gcbo/"
1 change: 1 addition & 0 deletions src/patterns/README.md
@@ -0,0 +1 @@
# DOSDP patterns - editors docs
1 change: 1 addition & 0 deletions src/patterns/data/default/README.md
@@ -0,0 +1 @@
Documentation of the Default DOSDP Pipeline
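Per the Makefile added in this commit, the default pipeline consumes pattern YAMLs from `src/patterns/dosdp-patterns/` and filler TSVs from `src/patterns/data/default/`. As a sketch only, a minimal DOSDP template could look like the following; the pattern name, IRI, and term IDs are hypothetical and not part of this repository:

``` yaml
# Hypothetical: src/patterns/dosdp-patterns/example_pattern.yaml
# A matching filler TSV (data/default/example_pattern.tsv) would have
# columns: defined_class, attribute
pattern_name: example_pattern
pattern_iri: https://github.com/bmir-radx/gcbo/patterns/example_pattern.yaml
classes:
  quality: PATO:0000001
relations:
  has_part: BFO:0000051
vars:
  attribute: "'quality'"
name:
  text: "%s trait"
  vars:
    - attribute
equivalentTo:
  text: "'has_part' some %s"
  vars:
    - attribute
```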
