To stimulate dataset providers to improve their dataset descriptions in terms of completeness of properties, a rating system is proposed.
The dataset descriptions are rated with 1 to 5 stars, depending on the content (presence of properties) of the dataset description:
Each dataset description that has the required license, title and publisher receives a ☆ rating.
If it also has a description and a distribution, it receives a ☆☆ rating.
If it also has a creator and a landingPage, it receives a ☆☆☆ rating.
If it also has a created, modified/updated and/or issued/published date, it receives a ☆☆☆☆ rating.
If it also has a language, source, keyword, spatial and/or temporal, it receives a ☆☆☆☆☆ rating.
This method does not (yet):
promote multi-language content
evaluate the quality of the content (e.g. is the description understandable, does the contentURL of the distribution exist, is it linked data?); the method only checks which properties are present
evaluate all schema:Dataset properties defined in Requirements for datasets, or any of the schema:DataDownload (distribution) properties
The rating for each of the dataset properties (both schema.org and DCAT):
The following aggregation query shows the number of datasets with a specific number of stars:
PREFIX schema: <http://schema.org/>

SELECT ?rating (COUNT(*) AS ?datasets_with_rating) WHERE {
  ?dataset schema:contentRating ?rating .
}
GROUP BY ?rating
Output on 21-2-2023:
rating     | datasets_with_rating
"☆"        | "279"^^xsd:integer
"☆☆"       | "376"^^xsd:integer
"☆☆☆"      | "11"^^xsd:integer
"☆☆☆☆"     | "9"^^xsd:integer
"☆☆☆☆☆"    | "111"^^xsd:integer
Suggestion for the demonstrator: make the stars a link to an explanation page so the user can read why the dataset got that specific number of stars, and what can be done to earn more stars. The easiest option is to make 5 static pages. Somewhat harder is to make a page specific to each dataset description.
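As a rough sketch of what a page specific to a dataset description could be built on, the query below lists, for every dataset that currently has a ☆ rating, whether the two properties needed for ☆☆ are present. It assumes that "description" and "distribution" map to schema:description and schema:distribution, and that the ratings can be queried without a GRAPH clause, as in the aggregation query in this issue. Neither assumption is confirmed here.

```sparql
PREFIX schema: <http://schema.org/>

# Sketch only: the property names are assumed mappings, not taken from the registry.
SELECT ?dataset ?hasDescription ?hasDistribution WHERE {
  ?dataset schema:contentRating "☆" .
  # EXISTS yields true/false, so each row shows which property is still missing.
  BIND(EXISTS { ?dataset schema:description ?description } AS ?hasDescription)
  BIND(EXISTS { ?dataset schema:distribution ?distribution } AS ?hasDistribution)
}
```

A row with a false value tells the publisher which property to add to reach the next star.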
to call this rating completeness (volledigheid) instead of quality (kwaliteit)
visualise this as a completion bar instead of stars to nudge publishers to complete their dataset description; @eddeheerna will ask a UX designer to come up with a good solution
show ‘dataset description completeness’ next to ‘dataset quality’ (which we cannot rate as of yet).
Construction
The rating of a dataset description is stored in a separate graph, https://data.netwerkdigitaalerfgoed.nl/registry/description_ratings, with the property schema:contentRating. The graph is constructed using 5 SPARQL INSERT queries:
☆☆☆☆☆ rating
☆☆☆☆ rating
☆☆☆ rating
☆☆ rating
☆ rating
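The original INSERT queries are not reproduced here. As an illustration only, a ☆☆ query could look roughly like the sketch below. It assumes the schema.org mapping license = schema:license, title = schema:name, publisher = schema:publisher, description = schema:description and distribution = schema:distribution, and it assumes that the dataset descriptions are visible in the default graph. It also assumes the five queries run from the highest rating to the lowest, so that a dataset which already has a rating is skipped. The DCAT variants of these properties are not covered.

```sparql
PREFIX schema: <http://schema.org/>

# Sketch only: property names and execution order are assumptions.
INSERT {
  GRAPH <https://data.netwerkdigitaalerfgoed.nl/registry/description_ratings> {
    ?dataset schema:contentRating "☆☆" .
  }
}
WHERE {
  # Properties required for one star ...
  ?dataset a schema:Dataset ;
           schema:license ?license ;
           schema:name ?title ;
           schema:publisher ?publisher ;
           # ... plus the two extra properties required for two stars.
           schema:description ?description ;
           schema:distribution ?distribution .
  # Skip datasets that an earlier (higher) query has already rated.
  FILTER NOT EXISTS {
    GRAPH <https://data.netwerkdigitaalerfgoed.nl/registry/description_ratings> {
      ?dataset schema:contentRating ?existingRating .
    }
  }
}
```

Under these assumptions the other four queries would differ only in the rating literal and in the list of required properties.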
TODO
Selection
The rating can be used for sorting and to show the rating (in the demonstrator), with a query like:
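The query that belongs here is not included. A minimal sketch of such a selection, assuming the title is stored as schema:name and that the ratings can be queried without a GRAPH clause (as in the aggregation query in this issue), might be:

```sparql
PREFIX schema: <http://schema.org/>

# Sketch only: schema:name as the title property is an assumption.
SELECT ?dataset ?title ?rating WHERE {
  ?dataset schema:name ?title ;
           schema:contentRating ?rating .
}
# The rating is a string of stars, so its length is the number of stars.
ORDER BY DESC(STRLEN(?rating)) ?title
```

Datasets with more than one title would appear once per title, so a demonstrator would probably need to pick one language or deduplicate the rows.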