Creating a new class with an auto-allocated ID #56
Comments
I like [C], but we should also have guardrails to make sure a permanent ID is assigned to the entity before the PR is merged. Otherwise it would open up all kinds of curation nightmares. Admittedly, this is unlikely to happen, since curators are the gatekeepers of what goes into an ontology. But do you think KGCL can even enforce something like that? One option is to add merge rules in GitHub, but I haven't used them enough to say confidently whether that is even a possibility. There is a |
Maybe this is a good use case for implementing |
The way I was envisioning this, the permanent ID would be assigned by the KGCL engine when the KGCL data is processed – so by the time a PR is created, this would already be done. That is, if I ask for the following changes:
the KGCL engine (either the Python library or my KGCL-Java) would:
And then Ontobot would go on and create the PR with those changes. Yes, such a workflow would create a risk of having concurrent PRs with clashing IDs. If there is already a PR with an auto-assigned ID waiting to be merged, and someone asks for another class creation in a second PR before the first one is merged, then the second PR would end up with the same auto-assigned ID. But this is a risk that already exists right now with manually created PRs. Still, if that is a concern, we could assign “temporary” auto-generated IDs that would have to be converted to permanent IDs in a later step using something like @balhoff’s |
While we are doing this should we also consider:
This uses blank node syntax, which I'm 75% sure is a bad idea (what if at some point in the future we want to allow blank nodes qua blank nodes?). But the same thing could be achieved using a marker prefix, such as (Seeing the |
My current thinking (theoretical only; I have not written any code for that yet) is to make the prefix configurable at the application-level, with a default of That is, by default it would be possible to do:
but if someone wants to use KGCL on an ontology where |
Here is what is currently implemented in KGCL-Java (in the master branch only for now, but will be available in the upcoming 0.4.0 release): The There are three “modes” to determine how the identifiers are generated:
“Manual” mode. Identifiers are generated according to parameters passed on the command line, with the following options:
To minimise the risk of simultaneous PRs having the same auto-generated IDs, the identifiers are picked randomly within the specified range (existing IDs in the ontology are always checked first and explicitly avoided).
ID range policy mode. Similar to manual mode, but the parameters are taken from a
If
Temporary mode. Generates random, temporary identifiers that should later be replaced by a KGCL-Java also includes a putative implementation of such a |
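The random allocation described above for manual mode can be sketched roughly as follows. This is an illustrative Python sketch only, not the KGCL-Java code: the function name `pick_random_id`, its parameters, the `EX` prefix and the seven-digit zero padding are all assumptions made for the example.

```python
import random


def pick_random_id(prefix, lower, upper, existing_ids, width=7, rng=None):
    """Pick a random unused ID in [lower, upper] (hypothetical helper).

    `existing_ids` is the set of IDs already present in the ontology;
    they are checked first and never returned.
    """
    rng = rng if rng is not None else random.Random()
    # Enumerating the range is fine for a sketch; a very large range
    # would call for sampling with retries instead.
    free = [
        f"{prefix}:{n:0{width}d}"
        for n in range(lower, upper + 1)
        if f"{prefix}:{n:0{width}d}" not in existing_ids
    ]
    if not free:
        raise ValueError("no free ID left in the requested range")
    return rng.choice(free)
```

Picking at random rather than taking the lowest free ID does not eliminate clashes between concurrent PRs, it only makes them less likely, and the likelihood depends on how much of the range is still free.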
Currently, users wanting to create a new class using KGCL are expected to know in advance the ID of the class to be created, so that they can issue a
create class ID:1234 "label"
command. This is hardly compatible with the intended use of KGCL in bug tracker tickets.
There would be several ways to address the problem.
A. Non-technical solution. Leave KGCL as it is, but expect ontologies to have an ID range specifically intended for KGCL changes, and document that range for users.
Not ideal, as it puts all the burden of allocating the ID on the users (who must first figure out which range is allocated to KGCL-mediated changes, and then find the lowest unused ID in that range).
This is, in effect, the current situation.
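To make the burden of option A concrete, here is a minimal sketch of the lookup a user currently has to do by hand. The helper name `lowest_unused_id`, the `EX` prefix and the seven-digit padding are hypothetical and chosen only for illustration.

```python
def lowest_unused_id(prefix, lower, upper, existing_ids, width=7):
    """Return the lowest ID in [lower, upper] not yet used (hypothetical helper)."""
    for n in range(lower, upper + 1):
        candidate = f"{prefix}:{n:0{width}d}"
        if candidate not in existing_ids:
            return candidate
    raise ValueError("the range reserved for KGCL changes is exhausted")
```

The user would need both the documented range and an up-to-date list of the IDs already in the ontology before being able to write the command.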
B. Deal with auto IDs at the level of the Ontobot. Leave KGCL as it is. Agree on a special keyword (for example ID:auto) to use in the KGCL DSL syntax, and have Ontobot automatically replace that keyword with a suitable auto-generated ID before actually passing the KGCL data to the KGCL library. It is up to the Ontobot to figure out how to allocate the ID (probably by parsing the -idranges.owl file, if such a file exists).
C. Similar to B, but at the level of KGCL itself. That is, the KGCL DSL explicitly defines the ID:auto keyword, and KGCL libraries are expected to know that they should automatically allocate an ID when this keyword is used.
I currently think this would be the best solution.
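The keyword substitution that options B and C have in common could look roughly like the sketch below, whether it runs in Ontobot (B) or inside a KGCL library (C). It assumes the example keyword ID:auto from above; the function `expand_auto_ids` and the `allocate` callback are hypothetical names, and a real implementation would work on parsed KGCL rather than on raw text.

```python
import itertools
import re

# Matches the example keyword as a standalone token.
AUTO_KEYWORD = re.compile(r"\bID:auto\b")


def expand_auto_ids(kgcl_text, allocate):
    """Replace every ID:auto in `kgcl_text` with a fresh ID from `allocate()`."""
    return AUTO_KEYWORD.sub(lambda _match: allocate(), kgcl_text)


# Usage with a trivial sequential allocator (assumption: EX prefix, 7 digits).
counter = itertools.count(1)
request = 'create class ID:auto "first label"\ncreate class ID:auto "second label"'
expanded = expand_auto_ids(request, lambda: f"EX:{next(counter):07d}")
```

In this sketch each occurrence of the keyword receives a different ID, so a later command in the same request cannot refer back to a class created through the keyword.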
Both B and C would allow a user to do something like this:
D. Add variables to the KGCL DSL. Make it possible to do something like this:
Technically speaking the most elegant solution, but I don’t think we want to add such constructs to the KGCL DSL syntax – which is expected to be a simple syntax for mostly non-technical users.