Purpose, motivation and implementation #1
How?
Format
The standard for schema on the web is RDF. That is what is used by schema.org and Bioschemas. Bioschemas is probably the one we should be following most closely. An example schema is the Bioschemas Dataset profile, and an example record is
{
"@context": "https://schema.org/",
"@type": "Dataset",
"http://purl.org/dc/terms/conformsTo": { "@type": "CreativeWork", "@id": "https://bioschemas.org/profiles/Dataset/1.0-RELEASE" },
"@id": "https://doi.org/10.5281/zenodo.5743204",
"identifier": "10.5281/zenodo.5743204",
"name": "RDF version of the data from Choi, JS. et al. Towards a generalized toxicity prediction model for oxide nanomaterials using integrated data from different sources (2018)",
"description": "This is an RDFied version of the dataset published in Choi, JS., Ha, M.K., Trinh, T.X. et al. Towards a generalized toxicity prediction model for oxide nanomaterials using integrated data from different sources. Sci Rep 8, 6110 (2018). The original dataset publication DOI: https://doi.org/10.1038/s41598-018-24483-z. The Original publication authors: Jang-Sik Choi, My Kieu Ha, Tung Xuan Trinh, Tae Hyun Yoon & Hyung-Gi Byun",
"license": "https://creativecommons.org/licenses/by/4.0/legalcode",
"url": "https://zenodo.org/record/5743204",
"keywords": "oxide, nanomaterial, toxicity, prediction",
"creator": [
{
"@type": "Organization",
"name": "NanoSolveIT"
}
],
"datePublished": "2021-11-30",
"citation": { "@type": "CreativeWork", "@id": "https://doi.org/10.1038/s41598-018-24483-z", "name": "Towards a generalized toxicity prediction model for oxide nanomaterials using integrated data from different sources" }
}
Other use cases
This is all very well, but how does this map to relational databases like the ones typically used for data indexing? In a general sense, not very well. However, the reverse mapping, from a SQL database to RDF, is more straightforward. If we are working mostly in the relational/SQL space and want the RDF mapping for interoperability with the wider world, that will limit how complex we let the schemas become. Alternatively, we could have a strict hierarchy of schemas: a tighter definition at the bottom, which is interoperable with SQL, and higher-level schemas with more freedom that allow for more connectivity.
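As a rough illustration of the SQL-to-RDF direction, the sketch below turns one row of a relational table into a JSON-LD record of the same shape as the example above. The table name, column names, and the `COLUMN_TO_PROPERTY` mapping are assumptions made up for this example, not an agreed ACCESS-NRI schema.

```python
import json
import sqlite3

# Hypothetical relational table; the column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.execute(
    "CREATE TABLE dataset ("
    "doi TEXT PRIMARY KEY, name TEXT, license TEXT, date_published TEXT)"
)
conn.execute(
    "INSERT INTO dataset VALUES (?, ?, ?, ?)",
    (
        "10.5281/zenodo.5743204",
        "Example dataset",
        "https://creativecommons.org/licenses/by/4.0/legalcode",
        "2021-11-30",
    ),
)

# Flat mapping from SQL column to schema.org property (assumed for this sketch).
COLUMN_TO_PROPERTY = {
    "doi": "identifier",
    "name": "name",
    "license": "license",
    "date_published": "datePublished",
}


def row_to_jsonld(row):
    """Build a schema.org Dataset record from one database row."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": f"https://doi.org/{row['doi']}",
    }
    for column, prop in COLUMN_TO_PROPERTY.items():
        if row[column] is not None:  # skip NULL columns
            record[prop] = row[column]
    return record


records = [row_to_jsonld(r) for r in conn.execute("SELECT * FROM dataset")]
print(json.dumps(records[0], indent=2))
```

A flat column-to-property mapping like this only works while the schema stays close to tabular, which is the trade-off described above: nested or highly linked schemas need more than a one-row-to-one-record translation.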
Thanks for providing these details and context, @aidanheerdegen.
(Possibly) relevant climate-data examples
First up, props to @dougiesquire for starting the ball rolling.
What is the purpose of this repo?
Central location for all schema information for the ACCESS-NRI organisation.
Why?
Consistent approach across the ACCESS-NRI organisation
Improves productivity, as there is only one source of truth in which to find schemas. Reduces the barrier for those new to the subject area who are not in a position to create their own schema due to lack of background knowledge.
It also leads naturally to interoperability: if everyone uses the same schemas, and those schemas re-use and connect with existing schemas, then everyone gets the same connectivity "for free".
Such interconnected schemas enable building knowledge graphs. A knowledge graph, or semantic network, is a graph-based representation of the connections between the objects described by the schemas. A knowledge graph can make it possible to traverse data in ways that were not previously available.
Knowledge graphs are a sort of ad hoc ontology.
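To make the knowledge-graph idea concrete, here is a minimal sketch that reduces JSON-LD-style records to (subject, predicate, object) triples and then traverses the links in the reverse direction. The second record and its `example.org` identifier are hypothetical, and a real implementation would more likely use an RDF library than plain dictionaries.

```python
from collections import defaultdict

# Two records sharing a citation; the second one is a made-up example.
records = [
    {
        "@id": "https://doi.org/10.5281/zenodo.5743204",
        "@type": "Dataset",
        "citation": {"@id": "https://doi.org/10.1038/s41598-018-24483-z"},
    },
    {
        "@id": "https://example.org/dataset/hypothetical",
        "@type": "Dataset",
        "citation": {"@id": "https://doi.org/10.1038/s41598-018-24483-z"},
    },
]


def to_triples(record):
    """Yield (subject, predicate, object) for every property that links to another @id."""
    subject = record["@id"]
    for predicate, value in record.items():
        if predicate.startswith("@"):
            continue  # skip JSON-LD keywords such as @id and @type
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict) and "@id" in v:
                yield (subject, predicate, v["@id"])


# Index edges by their target so links can be followed "backwards".
incoming = defaultdict(set)
for rec in records:
    for s, p, o in to_triples(rec):
        incoming[o].add((p, s))

paper = "https://doi.org/10.1038/s41598-018-24483-z"
citing = sorted(s for p, s in incoming[paper] if p == "citation")
print(citing)
```

Neither record states which other datasets cite the same paper, yet the shared identifier makes that question answerable, which is the kind of connectivity the shared schemas are meant to provide.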
Discoverability
Adding schema to webpages in JSON-LD format promotes discovery. It is a widely used standard for semantic searching and cataloguing on the web.
This can lead to connections with other data providers, which adds value with little specific effort.
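One common way to add JSON-LD to a webpage is a `script` element with the type `application/ld+json`. The sketch below generates such an element from a record; the helper name `jsonld_script_tag` and the shortened record are assumptions for illustration.

```python
import json


def jsonld_script_tag(record):
    """Serialise a JSON-LD record into a script element for embedding in HTML."""
    payload = json.dumps(record, indent=2)
    # "\/" is a valid JSON escape for "/", so this keeps the JSON equivalent
    # while preventing a stray "</script>" in the data from closing the element.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Shortened version of the example record, for illustration only.
record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "@id": "https://doi.org/10.5281/zenodo.5743204",
    "name": "Example dataset",
}

tag = jsonld_script_tag(record)
print(tag)
```

The resulting element can be placed in the page's `head` or `body`; crawlers that understand JSON-LD read it without it affecting what the page displays.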