In this tutorial we will look at how to deploy a SPARQL endpoint through Ontop. A SPARQL endpoint is an HTTP endpoint compliant to the SPARQL HTTP protocol, that can be used to execute queries against the VKG.
In Ontop, there are two ways of deploying a SPARQL endpoint:
- Trough command line interface, or
- Trough docker image.
In this tutorial we will focus on the first method, but you can experiment yourselves with the second by following the instructions contained in the provided link.
For this tutorial, you need a working java installation. If your machine does not have java installed, please download and install it.
Ontop ships a shell script (ontop
for *nix) and a bat file (ontop.bat
for Windows) exposing the core functionality and several utilities. The CLI is an easy way to get the system quickly set-up, test for correct execution, and query or materialize VKG graphs as needed. The commands available through the CLI are the following:
$ ./ontop help
usage: ontop <command> [ <args> ]
Commands are:
--version Show version of ontop
bootstrap Bootstrap ontology and mapping from the database
endpoint Start a SPARQL endpoint powered by Ontop
extract-db-metadata Extract the DB metadata and serialize it into an output JSON file
help Display help information
materialize Materialize the RDF graph exposed by the mapping and the OWL ontology
query Query the RDF graph exposed by the mapping and the OWL ontology
validate Validate Ontology and Mappings
mapping Manipulate mapping files
In this tutorial we focus on the ontop endpoint
command, orchestrating the deployment of a SPARQL endpoint.
First, you have to download the latest Ontop CLI zip from our download pages ([Github](https://github.com/ontop/ontop/releases) or [Sourceforge](https://sourceforge.net/projects/ontop4obda/files/)). Unzip its content in the `deployment` folder. Open the command line terminal and `cd` to that folder. For Windows use the `ontop.bat` file, for Linux and OS X use the `ontop` file. To check whether everything works fine, try running the `help` command:
deployment/$ ./ontop help
usage: ontop <command> [ <args> ]
Commands are:
--version Show version of ontop
bootstrap Bootstrap ontology and mapping from the database
endpoint Start a SPARQL endpoint powered by Ontop
extract-db-metadata Extract the DB metadata and serialize it into an output JSON file
help Display help information
materialize Materialize the RDF graph exposed by the mapping and the OWL ontology
query Query the RDF graph exposed by the mapping and the OWL ontology
validate Validate Ontology and Mappings
mapping Manipulate mapping files
JDBC drivers are software implemented by third parties (often the same developers of the database system) that handle the interaction with the database in their own proprietary protocols. In a typical deployment of Ontop, you will need to manually download the JDBC drivers for your database management system, and put them into the deployment/jdbc
directory. For PostgreSQL and this tutorial, you don't need to download the driver, as it can be simply copied from the jdbc
folder in the root directory of the project, e.g.:
deployment/$ cp ../jdbc/postgresql-42.2.8.jar jdbc/
Most commands below require or accept as input a property file. This is where you will specify the JDBC connection parameters, as well as other Ontop-related advanced parameters. For this tutorial, connection parameters will be provided through property file deployment/connection/connection.properties
, which has the following content:
jdbc.url=jdbc\:postgresql\://ontobras1.cw6ltdje1jrb.eu-west-3.rds.amazonaws.com/postgres
jdbc.driver=org.postgresql.Driver
jdbc.user=ontobras
jdbc.password=ontoptutorial
ontop.queryOntologyAnnotation=true
ontop.enableFactExtractionWithTBox=true
The first four lines specify the connection parameters to the database. The last two lines are additional (and, optional) Ontop-related advanced parameters:
ontop.queryOntologyAnnotation=true
: this option enables queries over OWL Annotation Properties;ontop.enableFactExtractionWithTBox=true
: this option allows to pose queries over the ontology structure.
ontop endpoint
deploys a SPARQL endpoint locally at the address /sparql
, and by default on the port 8080
. To start the endpoint with the ontology, mappings, and lenses you have produced with Ontopic Studio, you can simply run the following command from within the deployment
folder:
deployment/$ ./ontop endpoint -t ../vkg/ontology.ttl \
-m ../vkg/mapping.ttl \
-p connection/connection.properties \
-v ../vkg/lenses.json \
--portal=connection/predefined-queries.toml
The command is self-explanatory, apart from the last (optional) argument --portal
, which we will explain in a minute. If everything went fine, the endpoint should be up and listening for requests at the address http://localhost:8080/sparql
. At this point, you should be able to send POST
or GET
requests carrying the SPARQL query to evaluate (as per SPARQL HTTP protocol).
For instance, consider the first "campground query" you wrote in the modeling part of this tutorial:
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX schema: <http://schema.org/>
SELECT * WHERE {
?h a schema:Campground ; schema:name ?name .
FILTER (lang(?name) = 'en')
OPTIONAL {
?h geo:defaultGeometry / geo:asWKT ?geometry .
}
}
This query can be evaluated against our SPARQL endpoint through an HTTP request, for instance by using curl:
$ curl --request POST \
--url http://localhost:8080/sparql \
--header "accept: application/json" \
--header "content-type: application/sparql-query" \
--data "PREFIX geo: <http://www.opengis.net/ont/geosparql#> PREFIX schema: <http://schema.org/> SELECT * WHERE {?h a schema:Campground ; schema:name ?name . FILTER (lang(?name) = 'en') OPTIONAL {?h geo:defaultGeometry / geo:asWKT ?geometry .}}"
Running the command above (assuming you have curl
installed), will return the following JSON answer:
{
"head" : {
"vars" : [
"h",
"name",
"geometry"
]
},
"results" : {
"bindings" : [
{
"h" : {
"type" : "uri",
"value" : "http://mydata.example.org/source2-hotels/2CDD759082854DA155AB3FE9B5DEC46C"
},
"name" : {
"xml:lang" : "en",
"type" : "literal",
"value" : "Stadtcamping Glorenza/Glurns"
}
},
{
"h" : {
"type" : "uri",
"value" : "http://mydata.example.org/source2-hotels/724FDB8ACFC811D1BA6B444553540000"
},
"name" : {
"xml:lang" : "en",
"type" : "literal",
"value" : "Camping Trafoi"
}
},
{
"h" : {
...
Alternatively, you can exploit one of the the many SPARQL client libraries available for many programming and data analysis environments, such as rdf4j for Java. In this tutorial we won't explore these options, but you are invited to try out for yourselves!
Other than the HTTP endpoint, ontop endpoint
provides an web portal that can be used to query the system. The portal is located under http://localhost:8080
:
Under Example Queries
tab, you can select and execute any of the three queries you wrote in the modeling part of this tutorial. How come? You actually provided them as a parameter when you launched the endpoint, through the --portal=connection/predefined-queries.toml
argument. The queries file predefined-queries.toml
has the following content:
title="OntoBRAS SPARQL Portal"
[[tabGroups]]
name="Example Queries"
[[tabGroups.tabs]]
name="Campground Example Query 1"
query="""
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX schema: <http://schema.org/>
SELECT * WHERE {
?h a schema:Campground ; schema:name ?name .
FILTER (lang(?name) = 'en')
OPTIONAL {
?h geo:defaultGeometry / geo:asWKT ?geometry .
}
}
"""
[[tabGroups.tabs]]
name="Campground Example Query 2"
query="""
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX schema: <http://schema.org/>
PREFIX voc: <http://tutorial.example.org/voc#>
SELECT * WHERE {
?h a schema:Campground ; schema:name ?name ; schema:containedInPlace ?municipality .
?municipality a voc:Municipality ; schema:name ?municipalityName .
FILTER (lang(?name) = 'en' && lang(?municipalityName) = 'en')
OPTIONAL {
?h geo:defaultGeometry / geo:asWKT ?geometry .
}
}
"""
...
Below the query box, you have different possible options for the visualizations of your results. For instance, try running Campground Example Query 1
(by clicking on the "play like" button on the top left of the query box).
After a short wait, you will get results returned in tabular form. Since our results contain Geospatial information, they can be conveniently displayed on a map by clicking on the Geo
button at the bottom right of the interface:
ontop endpoint
offers lots of advanced options for deploying SPARQL endpoints. The full list of options is available through the ontop help
command:
deployment$ ./ontop help endpoint
Now that we have a running SPARQL endpoint, let's write some SPARQL queries and evaluate them against it! You can take inspiration from the three queries under the Example Queries
tab, and evaluate your creations under the Playground
tab. For formulating the queries below, I suggest referring to the ontology diagram at the end of the README.md file.
- Retrieve all lodging businesses, their name in English, their positions (if available), and the administrative area they are contained in.
- Let's try to query the annotations contained in the ontology. Specifically, retrieve the
rdfs:comment
annotation relative to classLodgingBusiness
(requiresontop.queryOntologyAnnotation=true
option). For therdfs
prefix, usehttp://www.w3.org/2000/01/rdf-schema#
. - Now, what about querying the ontology structure? Try writing a SPARQL query retrieving all kinds of lodging businesses (i.e., campgrounds, hotels, etc.). Hint: use the
rdfs:subClassOf
property, encoding the ISA relation between classes. - Retrieve the names of campgrounds that do not have a geospatial coordinate in our data. Hint: Have a look at the BOUND SPARQL function (and, negation!).
- Retrieve the number of hotels in our data. Hint: use the COUNT aggregation function.
- Retrieve the names of the 10 municipalities containing the highest number of hotels, together with the number of hotels they contain. Such list should be ordered by number of hotels (descending order).
- Out of the "top 10 municipalities" from the query above, how many do not have an associated geospatial coordinate in our data? Retrieve such number. Hint: Use SPARQL subqueries.
- Retrieve the number of instances in each subclass of lodging business.
- Out of the subclasses of lodging businesses, what are those having less than 3000 instances?
- On a map, show the locations of lodging businesses in the municipality of
"Merano/Meran"@en
, their names, and in such a way that each kind of lodging business is associated to a different color (e.g., green for campgrounds and blue for hotels). Hint: use special variables?posLabel
and?posColor
to specify respectively the labels and the colors to be displayed on the map. Specific colors can be bound to specific lodging businesses by using theBIND
function, e.g.,BIND("chlorophyll,0.5" AS ?posColor)
. For specifying the actual colors, other thanchlorophyll
you can useviridis
andjet
.
The vkg/complete/
folder contains a complete VKG setting for the lodging scenario. Try loading it, and play with the queries provided in the .toml
file!