Documentation rework & mu.semte.ch import #5

Open: wants to merge 57 commits into base: master

57 commits:
6b90873
Added reactive-programming and why-semantic-tech
Denperidge Jan 23, 2023
645a3a1
Added experimentation, fixed Aad's name (sorry)
Denperidge Jan 23, 2023
cfe2cbe
Imported oslo2 writeup
Denperidge Jan 23, 2023
bb26b84
Renamed explanations to discussions
Denperidge Jan 24, 2023
bf38b01
Imported mu.semte.ch as the ideal playground
Denperidge Jan 24, 2023
2086c07
Added reference navigation to root README
Denperidge Jan 24, 2023
df1c275
Imported Semantic Micro Services, Why Bother?
Denperidge Jan 24, 2023
359252f
Imported Find your way through the stack
Denperidge Jan 24, 2023
bd7ab3d
Added reference/documentation
Denperidge Jan 24, 2023
ccf85af
Imported how-to, fixed incorrect adapted location
Denperidge Jan 24, 2023
99980db
Added representing-logged-in-users
Denperidge Jan 25, 2023
0c6b603
Renamed reference to references
Denperidge Jan 25, 2023
7bdbc3b
Imported How mu.semte.ch can help you beat the 10%
Denperidge Jan 26, 2023
63c7c5f
Fixed link to smaller & readable code
Denperidge Jan 26, 2023
f304005
Writeup nav & imported mu.semte.ch at DockerCon EU
Denperidge Jan 26, 2023
0058f86
Imported Publishing ... Docker multi-stage builds
Denperidge Jan 26, 2023
8411177
Imported Hello MacOS
Denperidge Jan 26, 2023
a09e683
Imported mu.semte.ch at DeveloperWeek
Denperidge Jan 26, 2023
b86dad8
Imported On sharing authorization
Denperidge Jan 26, 2023
815451c
Imported On microservice reuse and authorization
Denperidge Jan 26, 2023
36cf778
Added archive README, archived Get to know mu-cl-
Denperidge Jan 26, 2023
bc2f716
Archived The delta service and its benefits
Denperidge Jan 26, 2023
2361366
Archived Auto-expanding uploaded semantic files
Denperidge Jan 26, 2023
e98389b
Imported Thoughts on how a distributed SPARQL endp
Denperidge Jan 26, 2023
11b49ff
Imported Publications from mu.semte.ch/components/
Denperidge Jan 26, 2023
4eee2da
Imported mu.semte.ch/who/
Denperidge Jan 26, 2023
2474862
Imported mu.semte.ch/about/
Denperidge Jan 26, 2023
0f0b07f
Changed link from blogpost to documentation
Denperidge Jan 26, 2023
e06005a
renamed how-to to how-tos
Denperidge Jan 26, 2023
320d497
Imported How to build a microservice template
Denperidge Mar 9, 2023
0e3e875
Fixed template documentation formatting
Denperidge Mar 9, 2023
4db6dea
Documented helper functions, added header table
Denperidge Mar 9, 2023
3cc218c
Add How-to for troubleshooting slow starting containers
piemonkey Sep 13, 2023
ac6288f
Merge pull request #1 from piemonkey/cpu-troubleshoot
Denperidge Sep 13, 2023
c0135b6
Small README.md changes, documented documentation
Denperidge Sep 22, 2023
5a6e2d1
README update, fixed typos + expanded doc-structure
Denperidge Oct 5, 2023
2079fce
Imported masterclass 01 - How and why pt1-2
Denperidge Oct 9, 2023
8c4d641
Merge branch 'master' into masterclass
Denperidge Oct 14, 2023
9d88d52
Imported masterclass 02 - A shared foundation pt1
Denperidge Oct 14, 2023
fee309f
Imported masterclass 04 - Templates and conventions*
Denperidge Oct 15, 2023
5a63787
Imported masterclass 05 - Common microservices pt1-3
Denperidge Oct 15, 2023
c6ae872
Added extra notes from blog post to TODO.md
Denperidge Oct 15, 2023
9ef0205
Deprecations & refactors
Denperidge Oct 16, 2023
797e070
Writeups structure refactor
Denperidge Oct 16, 2023
b607190
Merged building-a-template into creating-templates
Denperidge Oct 16, 2023
6b9c9cb
Merged naming-conventions into project categories
Denperidge Oct 16, 2023
f4324f5
Slight README changes, renamed references-reference
Denperidge Oct 16, 2023
f419e2a
Deprecations & refactors
Denperidge Oct 16, 2023
5c15fd8
Merge pull request #2 from Denperidge-Redpencil/masterclass
Denperidge Oct 16, 2023
8e913f4
why-semantic-*: deprecate & import, README update
Denperidge Oct 16, 2023
91f4b8f
Added explainers with getting started
Denperidge Oct 18, 2023
584d5a2
Design philosophy polish
Denperidge Oct 18, 2023
f6e345a
Merge branch 'deprecations-and-refactors'
Denperidge Oct 18, 2023
c19996e
Fixed outdated relative links
Denperidge Oct 19, 2023
64b7fd5
Merge branch 'master' of https://github.com/Denperidge-Redpencil/project
Denperidge Oct 19, 2023
b82bc79
Update quickstart-writing-documentation.md
Denperidge Oct 28, 2023
fd84cd5
Fixed typo
Denperidge Oct 28, 2023
65 changes: 64 additions & 1 deletion README.md
@@ -1 +1,64 @@
This repository is used to track issues that apply cross-services in the full semantic.works stack.
# Semantic.works (mu-semtech)
This repository is used to store information and track issues that apply cross-services in the full semantic.works stack.

## Getting started
If you do not know where to begin, check out any/all of the following documents:
1. Read about why and how the semantic.works stack works: [discussions - design philosophy](docs/discussions/design-philosophy.md)
2. Recognise our different project types and how they help you: [discussions - project categories](docs/discussions/project-categories.md)
3. Learn how to [create a full-fledged application](docs/how-tos/creating-applications.md) or a [microservice to support it](docs/how-tos/creating-microservices.md)
4. Learn how to ideally [deploy your applications](docs/how-tos/deploying-applications.md)


## How-to
### Create...
- [Applications](docs/how-tos/creating-applications.md)
- [Microservices](docs/how-tos/creating-microservices.md)
- [Templates](#create-1)

### Development
- [Add services to your project](docs/how-tos/adding-services-to-your-project.md)
- [Deploy applications](docs/how-tos/deploying-applications.md)
- [Quickstart writing documentation](docs/how-tos/quickstart-writing-documentation.md)

### Troubleshooting
- [Troubleshooting - Slow starting containers using 100% CPU](docs/how-tos/troubleshooting---slow-starting-containers.md)

## Tutorials
### Create...
- [Templates](docs/tutorials/creating-templates.md)

### Development
- [Develop with your local IDE and tools inside a Docker container](docs/tutorials/developing-inside-containers.md)

## Reference
For technical information about semantic.works, see the following references:
- [Commonly used headers](docs/reference/commonly-used-headers.md)
- [Representing logged in users](docs/reference/representing-logged-in-users.md)

## Discussions
For more information on the design of semantic.works, read the following discussions:
- [Design philosophy](docs/discussions/design-philosophy.md)
- [Documentation structure](docs/discussions/documentation-structure.md)
- [Project categories](docs/discussions/project-categories.md)
- [Sharing authorization](docs/discussions/sharing-authorization.md)

## Writeups
Perspectives on...
- [Experimentation / all or nothing fails to innovate](writeups/perspectives/all-or-nothing-fails-to-innovate.md)
- [mu.semte.ch primer](writeups/perspectives/mu-semtech-primer.md)
- [Reactive programming](writeups/perspectives/)
- [Why semantic microservices](writeups/perspectives/why-semantic-microservices.md)
- [Why semantic tech](writeups/perspectives/why-semantic-tech.md)

Retrospectives on...
- [Dockercon EU 2017](writeups/retrospectives/dockercon-eu-2017.md)
- [Developerweek 2018](writeups/retrospectives/developerweek-2018.md)
- [OSLO²](writeups/retrospectives/oslo2.md)

- [Semantic.works - Implementing Docker multi-stage builds benefits](writeups/retrospectives/sw-implementing-docker-multi-stage-builds.md)
- [Semantic.works - Microservice reuse and authorization](writeups/retrospectives/sw-microservice-reuse-and-authorization.md)
- [Semantic.works - Supporting MacOS](writeups/retrospectives/sw-supporting-mac-os.md)

Or...
- Discover who governs semantic.works in [who - mu-semtech](writeups/who---mu-semtech.md)
- Find our external publications in [publications](writeups/publications.md)
7 changes: 7 additions & 0 deletions TODO.md
@@ -0,0 +1,7 @@
- [ ] Expand semantic model example as requested by Aad

We should make and/or link (a) tutorial(s) for the following things:
- [ ] Docker & Docker Compose
- [ ] Linked data & SPARQL
- [ ] Ember.js
- [ ] Accept headers
Binary file not shown.
5 changes: 5 additions & 0 deletions archive/README.md
@@ -0,0 +1,5 @@
# mu.semte.ch archive

The items in this folder have been archived for any of (but not limited to) the following reasons:
- They were information about the mu.semte.ch blog itself
- The services they describe have been deprecated
43 changes: 43 additions & 0 deletions archive/auto-expanding-uploaded-semantic-files.md
@@ -0,0 +1,43 @@
**Editors note**
The following has been archived due to delta-service being superseded by delta-notifier. More info on the deprecation can be found in the [delta-service repo](https://github.com/mu-semtech/archived-mu-delta-service)

*This document has been adapted from Jonathan Langens' mu.semte.ch article. You can view it [here](https://mu.semte.ch/2017/06/01/auto-expanding-uploaded-semantic-files/)*

---

# Auto-expanding uploaded semantic files

By adding 2 new microservices to our regular [mu.semte.ch](http://mu.semte.ch/) setup ([https://github.com/mu-semtech/mu-project](https://github.com/mu-semtech/mu-project)) we can create a very nifty workflow that will automatically expand semantic files in our graph database.

## File uploader service
![](http://mu.semte.ch/wp-content/uploads/2017/05/fileservice-200x300.png)

We need a file uploader service that, after accepting a POST with a file, saves that file and adds information about it to our triple store. This information can be expressed using the following rather limited vocabulary: a file is of class [http://mu.semte.ch/vocabularies/file-service/File](http://mu.semte.ch/vocabularies/file-service/File) and has the following properties

* [http://mu.semte.ch/vocabularies/file-service/internalFilename](http://mu.semte.ch/vocabularies/file-service/internalFilename)
* [http://mu.semte.ch/vocabularies/file-service/filename](http://mu.semte.ch/vocabularies/file-service/filename)
* [http://mu.semte.ch/vocabularies/file-service/uploadedAt](http://mu.semte.ch/vocabularies/file-service/uploadedAt)
* [http://mu.semte.ch/vocabularies/file-service/status](http://mu.semte.ch/vocabularies/file-service/status)

## Semantic expander service
Next we need to have a semantic expander service. This service is a little bit more complicated because it handles 2 separate functionalities.

The first functionality this service should have is support for consuming deltas as they are generated by the delta service ([https://github.com/mu-semtech/mu-delta-service](https://github.com/mu-semtech/mu-delta-service)). From these reports we need to filter out the files that contain semantic data and whose status has changed to uploaded. We can achieve this in a rather brute-force way by first building a set of all URIs of the subjects in the insert reports. After this we can run a simple query that looks somewhat like this:

```
SELECT DISTINCT ?internal_filename
WHERE {
  ?uri a <http://mu.semte.ch/vocabularies/file-service/File> .
  ?uri <http://mu.semte.ch/vocabularies/file-service/internalFilename> ?internal_filename .
  ?uri <http://mu.semte.ch/vocabularies/file-service/filename> ?filename .
  FILTER(strends(?filename, ".ttl") || strends(?filename, ".rdf"))
  FILTER(?uri IN ([LIST OF ALL THE URI's FOUND]))
  FILTER NOT EXISTS {
    ?uri <http://mu.semte.ch/vocabularies/semantic-expander/expanded> ?date
  }
}
```

This query will provide us with a list of filenames. We can now expand each of these filenames. This can be done either (1) by converting the files to one or more insert queries, (2) by using a graph protocol to load an entire file, or (3) by using store-specific logic to load the files (e.g. using iSQL on Virtuoso to create a load list and then starting the load endpoint).
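A minimal sketch of option (1), converting an N-Triples file into a single INSERT DATA query, could look like the following. Note that the function name and target graph URI are illustrative, not part of any existing mu.semte.ch service:

```python
def ntriples_to_insert_query(ntriples, graph="http://mu.semte.ch/application"):
    """Wrap the statements of an N-Triples document in one INSERT DATA
    query targeting the application graph.

    Simplified sketch: assumes one triple per line and performs no
    validation or escaping, which a real expander service would need.
    """
    triples = [line.strip() for line in ntriples.splitlines() if line.strip()]
    body = "\n    ".join(triples)
    return f"INSERT DATA {{\n  GRAPH <{graph}> {{\n    {body}\n  }}\n}}"
```

Options (2) and (3) trade this generality for speed: a graph-store protocol upload or a Virtuoso load list will usually ingest large files much faster than a generated query.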

And tada! Whenever we upload a file with semantic content to our backend, the semantic expander service will pick it up automatically and load the contents in the triple store. Almost magic.
12 changes: 12 additions & 0 deletions archive/get-to-know-mu-cl-resources.md
@@ -0,0 +1,12 @@
**Editors note**
The following has been archived due to it being a blog post meant as a status update, linking to different blog posts.
The blog posts that the last two links refer to have been merged & imported into mu-project's documentation. You can view this [here](https://github.com/mu-semtech/mu-project#creating-a-json-api)

*This document has been adapted from Erika Pauwels' mu.semte.ch article. You can view it [here](https://mu.semte.ch/2018/03/22/get-to-know-mu-cl-resources/)*

---

# Get to know mu-cl-resources
Long awaited, but it’s finally there: [a README for the mu-cl-resources service](https://github.com/mu-semtech/mu-cl-resources). Get to know all features (there are a lot!) of our microservice producing a [JSONAPI](http://jsonapi.org/) compliant API for your resources based on a simple configuration describing the domain. Find things such as [how to define your domain](https://github.com/mu-semtech/mu-cl-resources#configurationdomainlisp), [how to filter](https://github.com/mu-semtech/mu-cl-resources#basic-filtering), [how to paginate](https://github.com/mu-semtech/mu-cl-resources#pagination) and a lot more.

You can also have a look at our previous blog posts [Generating a JSONAPI compliant API for your resources, part 1](https://mu.semte.ch/2017/07/27/generating-a-jsonapi-compliant-api-for-your-resources/) and [part 2](https://mu.semte.ch/2017/08/17/generating-a-jsonapi-compliant-api-for-your-resources-part-2/) to get started.
48 changes: 48 additions & 0 deletions archive/the-delta-service-and-its-benefits.md
@@ -0,0 +1,48 @@
**Editors note**
The following has been archived due to delta-service being superseded by delta-notifier. More info on the deprecation can be found in the [delta-service repo](https://github.com/mu-semtech/archived-mu-delta-service)

*This document has been adapted from Jonathan Langens' mu.semte.ch article. You can view it [here](https://mu.semte.ch/2017/05/11/the-delta-service-and-its-benefits/)*

---

# The delta service and its benefits
The delta-service has already been mentioned and used in [previous blog posts on reactive programming](https://mu.semte.ch/tag/delta-service/) to make microservices hook into changes introduced by other microservices. In this article we will elaborate in more depth on this service. What is the delta-service and what are the benefits of using it?

## What is the delta-service?
![](http://mu.semte.ch/wp-content/uploads/2017/05/wild_e_delta-300x200.png)

The delta-service is a microservice that offers a SPARQL endpoint. It accepts SPARQL queries, analyses them, calculates the differences a query introduces into the graph store and notifies interested parties about these differences.

The reports sent to interested parties use triples as their basic building blocks, since triples are the resources that those SPARQL stores (mentally) have in common.

Conceptually such a report would look somewhat like this:

```
+ <subject> <predicate> <object>
+ <subject> <predicate> <object>
- <subject> <predicate> <object>
- <subject> <predicate> <object>
- <subject> <predicate> <object>
```

All triples with a ‘+’ in front will be inserted and all triples preceded by ‘-’ are triples that will be deleted by the query.
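To make the report format concrete, here is a sketch that derives such a report by diffing two graph states. The real delta-service analyses the incoming SPARQL query itself rather than comparing full states, so this is purely illustrative:

```python
def delta_report(before, after):
    """Render a delta report from two graph states.

    Triples are modelled as (subject, predicate, object) tuples.
    Lines prefixed '+' are inserted by the change, '-' are deleted.
    """
    inserts = sorted(after - before)
    deletes = sorted(before - after)
    lines = [f"+ {s} {p} {o}" for s, p, o in inserts]
    lines += [f"- {s} {p} {o}" for s, p, o in deletes]
    return "\n".join(lines)
```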

## What are the benefits?

### Graph store independence
The delta-service has the obvious benefit that the framework can abstract away the graph store you are actually using. For instance, on a Virtuoso graph database both update (INSERT, DELETE) and select queries (SELECT, ASK, DESCRIBE) are sent to the same endpoint: [http://localhost:8890/sparql](http://localhost:8890/sparql) by default. On an OWLIM database, on the other hand, update queries are sent to [http://localhost:8001/openrdf-workbench/\[repository\]/statements](http://localhost:8001/openrdf-workbench/%5Brepository%5D/statements) while a select query is sent to [http://localhost:8001/openrdf-workbench/\[repository\]/query](http://localhost:8001/openrdf-workbench/%5Brepository%5D/query). The delta-service may even support idiosyncrasies of a certain triple store, like reforming queries if you know that the graph store cannot properly handle a specific query.

### Messaging
Additionally, a service that can notify interested parties of the changes in the graph store can easily replace messages and message queues. While message queues do scale, they still require a shared mental model between the producing and the consuming microservices. This mental model is enforced by most messaging systems, while with the delta-service you are free to subscribe to whatever you want. As long as it is expressible in a triple, a microservice can hook into it.

The producing microservice does not need to compose a specific message for the message queue. It just writes the triples to the triple store as it normally would (within the [mu.semte.ch](http://mu.semte.ch/) framework the triple store holds the ground truth) and the delta-service takes care of the rest. So here the win is: no mental model change. The consuming microservice, on the other hand, does not need to know the transformation model the producing microservice uses to transform the message to the message queue format. Again, it only needs to know triples.

And here we win a lot with the added semantic value of the controlled vocabularies used with linked data technology. For example, should I install a microservice on my platform that performs actions every time a concept scheme is added to the SPARQL store, then that microservice just needs to listen to all changes on URIs that are a skos:ConceptScheme.
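As an illustration of that subscription model, a consuming microservice could filter the inserted triples of a delta report like this (the function name is hypothetical; the URIs are the standard RDF and SKOS ones):

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS_CONCEPT_SCHEME = "http://www.w3.org/2004/02/skos/core#ConceptScheme"

def concept_scheme_subjects(inserted_triples):
    """Return the URI of every resource that a report's insert section
    marks as a newly added skos:ConceptScheme.

    Triples are (subject, predicate, object) tuples.
    """
    return {s for s, p, o in inserted_triples
            if p == RDF_TYPE and o == SKOS_CONCEPT_SCHEME}
```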

In short: while you could build a message queue with this technology, its messages would always carry a view of the model; by consuming the delta reports, all the reactiveness is leveraged by nothing but manipulating the model itself.

### Scaling
The delta-service may also facilitate the scaling of the mu.semte.ch platform. If we introduced a master graph database instance to which all updates (but preferably no selects) are proxied, we could put a delta-service in front of this master graph database and send its mutation reports to a slave battery updater. This slave battery updater can then update an array of slave stores with every report. The slaves can be used for select queries. This way we have free scaling of the graph database and even more: we can have multiple different types of graph databases, all in sync.

### Reactiveness
The delta-service enables us to use a new approach to developing applications: reactive programming. Make a microservice hook into changes produced by another microservice. Have a look at [the blog posts we’ve already published on this topic](https://mu.semte.ch/tag/reactive-programming/) to learn how to make a microservice reactive.

## Conclusion
The delta-service is just a microservice, but one that offers a lot of benefits. In the future we will publish more blog posts illustrating how you can use the delta-service in your mu.semte.ch project and easily benefit from the difference reports it generates.
@@ -0,0 +1,47 @@
**Editors note**
The following has been archived due to delta-service being superseded by delta-notifier. More info on the deprecation can be found in the [delta-service repo](https://github.com/mu-semtech/archived-mu-delta-service)
Also note that this is a direct import of the article written in 2017, before the master/slave terminology used within came under wider scrutiny. Neither I (the editor/importer) nor the original author have made recent changes to it, and it has been asked to be left as-is, albeit archived.

*This document has been adapted from Jonathan Langens' mu.semte.ch article. You can view it [here](https://mu.semte.ch/2017/10/26/thoughts-on-how-a-distributed-sparql-endpoint-might-be-implemented/)*

---

# Thoughts on how a distributed SPARQL endpoint might be implemented
![](http://mu.semte.ch/wp-content/uploads/2017/10/replications.png)

The problem with most triple stores is … they tend to be slow. At least slow compared to their schema-full counterparts. It seems the cost of not defining a fixed schematic structure upfront is not easy to avoid. We have been thinking about various models and solutions to mitigate this issue for specific applications. With the simplicity of the [mu.semte.ch](http://mu.semte.ch/) stack it is easy to add caches left and right, but in the end that does not solve all problems. This post presents an idea for a solution: making the triple store distributed.

## What kind of distributed?
There are various models of distributed stores, and maybe we will explore others in subsequent posts. In this post we present one way of distributing a triplestore across multiple instances.

The high-level idea is that a delta-service detects changes which would be introduced by update-queries.  These updates are dispatched to a set of replicated triplestores.  Queries and updates are ordered correctly to ensure updates are visible when a combination of read and write queries are executed.

## Terminology
A quick overview of the terminology used below as it is quite ambiguous.

- query: a SPARQL query that is not an insert or delete query.
- update: a SPARQL query that is an insert, delete or delete-insert-where query.
- SPARQL query: a query or an update.

## Architecture
A DSE (Distributed SPARQL Endpoint) consists of a Distributed SPARQL Endpoint Controller, a delta calculating microservice, a slave controller microservice, optionally a base slave store, and then one or more slave stores.

![](http://mu.semte.ch/wp-content/uploads/2017/10/distributedSPARQLEndpoint-1.png)

## Distributed SPARQL Endpoint Controller (DSEC)
The DSEC acts as the entrypoint for the DSE. The most basic form of its functionality would be to split the incoming SPARQL queries into queries and updates. Although updates change the database’s state, the DSEC itself is almost fully stateless. A notable exception is an increasing number attached to each incoming SPARQL query. This number allows us to prevent the execution of queries before all updates that were sent earlier have been applied.
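That single piece of state could be sketched as follows. The class and method names are invented for illustration; only the idea of gating reads behind earlier updates comes from the design above:

```python
import itertools

class QuerySequencer:
    """Sketch of the DSEC's ordering state: updates get increasing
    numbers, and a read may only run once every update that arrived
    before it has been applied to the stores."""

    def __init__(self):
        self._update_counter = itertools.count(1)
        self.last_update = 0   # number of the latest accepted update
        self.applied = 0       # highest update number applied so far

    def accept_update(self):
        self.last_update = next(self._update_counter)
        return self.last_update

    def accept_query(self):
        """A read remembers which updates must be applied before it runs."""
        return self.last_update

    def can_run(self, required):
        return self.applied >= required

    def mark_applied(self, seq):
        self.applied = max(self.applied, seq)
```

For example, a read arriving right after update 1 is held back until the stores confirm update 1 has been applied.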

## Delta Calculating Microservice
The delta microservice calculates which triples are changed based on a received query.  It has three distinct responsibilities:

1. Maintain a queue of pending updates;
2. Calculate the delta that would be introduced by executing the \[lowest\] update on the queue;
3. Execute those deltas on its own \[master\] data store.

*Note: The Delta Calculating Microservice could maintain its state in a private SPARQL store to facilitate rebooting the component without loss of state should it go down. Rolling back should not happen, as any update would be translated into an update of the form `WITH <graph> INSERT DATA { … triple … }`. In this case we can assume the underlying triple store handles the cases where rollback is needed.*

## Slave Master Microservice
The Slave Master Microservice maintains the slave stores’ lifecycle, including creation and termination. This microservice also manages a queue of queries and uses a load balancing algorithm to assign each query to a slave. When the Delta Calculating Microservice sends out a delta report, the Slave Master Microservice will inform all slaves of the change in state (in essence nothing more than running the above-mentioned ‘simple’ update). The Slave Master knows the latest update number for each slave and can take this into account before sending queries.
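Combining the load balancing with the per-slave update numbers could look like this sketch, which picks the least-loaded slave that is sufficiently up to date (the data shapes and function name are illustrative, not the actual service's API):

```python
def pick_slave(slaves, required_update):
    """Choose a slave able to answer a query that depends on updates
    up to `required_update`, balancing by pending query count.

    `slaves` maps a slave id to (applied_update_number, pending_queries).
    Returns None when no slave has caught up yet, in which case the
    query stays queued.
    """
    eligible = [(pending, slave_id)
                for slave_id, (applied, pending) in slaves.items()
                if applied >= required_update]
    if not eligible:
        return None
    return min(eligible)[1]  # least pending queries wins
```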

## Conclusion
The ideas in this post build heavily on the reactive programming ideas presented in earlier posts. The main weakness of this approach is still the single point of failure that is the delta calculating component itself. This component could be made distributed by further enhancements. It is currently the only component that processes update queries in this ecosystem. Maybe for a system with low write and high read access this would be a valuable asset. Stay tuned for the PoC!
1 change: 1 addition & 0 deletions assets/README.md
@@ -0,0 +1 @@
Some of these files end with `excalidraw.svg`. This means that they were generated with, and can be edited using, Excalidraw.