Graml is a lightweight graph mark-up language based on YAML. It is designed to represent human-readable graphs more compactly than alternatives such as JSON (GraphSON) or XML (GraphML).
- YAML-based graph representations
- Java reference implementation
- Tinkerpop Blueprints support for OrientDB Neo4j MongoDB and more!
- Customizable section names
- Graph serialization to Graml
- DB class creation
- Stream processing
To enable Graml in your project, simply declare the dependency:
Gradle:
compile 'com.segfly.graml:graml:1.0.0'
Maven:
<dependency>
<groupId>com.segfly.graml</groupId>
<artifactId>graml</artifactId>
<version>1.0.0</version>
</dependency>
Graml requires a specific data structure and can not read arbitrary YAML. Given a Graml-compliant file, getting started is simple:
TinkerGraph g = new TinkerGraph();
GramlReader graml = new GramlReader(g);
graml.load(new File ("/path/to/file.yaml"));
Using an empty in-memory TinkerGraph we construct a GramlReader
.
We then can load any number of Graml-formatted YAML representations.
Graml has an intuitive structure and syntax. YAML's inline lists (via [ ] ) and associative-lists (via { } ) allow simple one to many relationships on a single line.
graml: {version: 1.0}
graph:
apple: {growsOn: tree}
tree: {dependsOn: water, exhale: O2}
water: {precipitatesAs: [rain, snow, sleet, hail]}
Graml contains five main sections:
Section | Required | Purpose |
---|---|---|
header | Yes | Metadata |
classmap | No | Object-oriented DB support |
graph | Yes | Graph relationships |
vertices | No | Vertex properties |
edges | No | Edge properties |
The required header must be present:
graml: {version: 1.0}
The optional classmap section supports object-oriented databases such as OrientDB. When resolving a vertex or edge, an entity's class is determined by this section.
classmap:
fruit: [apples, oranges, pears]
plant: tree
verb: growsOn
Another way to think of this is in terms of how it affects names of entities sent to the Blueprints API. Consider the following graph with the above classmap example:
graph:
apple: {growsOn: tree}
This will result in two vertices with the names "fruit:apples" and "plant:tree" and an edge "verb:growOn". An underlying implementation like OrientDB will use the prefix to map the entity to a class within the database. The Graml reference implementation will not create the classes. Future versions may address this shortcoming.
The required graph section captures relationships between vertices. The most simple expression is a nested associative array representing vertex-edge-vertex:
graph:
parent: {edge: child}
This will create verticies named "parent" and "child" connected with an edge named "edge". Whether this is a bidirectional or unidirectional relationship is determine by the underlying graph database.
A more complex example:
graph:
apple: {growsOn: tree}
tree: {dependsOn: water, exhale: O2}
water: {precipitatesAs: [rain, snow, sleet, hail]}
The optional entity sections allow the assignment of properties and values to vertices and edges in the graph section.
vertices:
apple:
color: red
weight: 200g
ripe: yes
edges:
dependsOn:
required: true
Graml currently has the following limitations:
Graml does not support multiple vertices with the same name in accordance with the Yaml specification.
graph:
apple: {growsOn: tree}
apple: {influencedBy: gravity}
The above will result in an apple vertex with one edge "gravity". Future versions of Graml will likely detect this and throw an exception.
If multiple instances of an apple are desired, they should be created with specific names:
classmap:
apple: [myApple, yourApple]
graph:
myApple: {grownIn: Washington}
yourApple: {grownIn: Japan}
The example above uses an apple class to characterize the two apples as a single type. If your graph database implementation does not support classes, the same could be done with a "class" relationship.
Multiple edges with the same name may be defined throughout the graph. However, edge properties are assigned globally:
edges:
grownIn:
date: October
Carrying forward with the myApple/yourApple example, the above edge property would have been applied to the "grownIn" edge for both apples.
This may be addressed in the future with edge IDs.
Graml's design prioritizes ease-of-use over speed and scale. If you have huge graphs, Graml may not be for you. At least not yet.
A future version may incorporate some degree of stream-processing to better handle huge graphs. The current implementation expects to process the entire graml file at once which places memory limits on the size of graph.
Contributions welcome! If you want to find out how you can get involved, please read the contributing guidelines.