Home
Welcome to the wiki of the gordf parser. This document serves as a report on the work done by Rishabh Bhatnagar during Google Summer of Code 2020. Check out gsoc-logs for a detailed, time-stamped log of tasks, along with explanations of the changes made to the original design and some key considerations from the development phase.
- Project Guided and Mentored By
- Student Details
- Overview of GoRDF
- RDFLoader
- RDFLoader Phase 1: XML Reader
- RDFLoader Phase 2: RDF Parser
- RDFWriter
- UseCase and Applications of GoRDF
- Code Entry point
Name | Rishabh Bhatnagar |
---|---|
Github | /RishabhBhatnagar |
/bhatnagarrishabh4 | |
/bhatnagar-rishabh |
GoRDF is a Go library that provides an RDF/XML parser for working with the RDF data format. Because RDF/XML is hierarchical and many of its elements are independent of the other structures in the document, some tags can be parsed without depending on any other tag. GoRDF makes extensive use of Go's concurrency features to speed up parsing without changing the semantics of a linear RDF parser. The library provides two major functionalities:
- RDFLoader: Generates triples from an RDF/XML file.
- RDFWriter: Writes triples to an RDF/XML file.
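The concurrency idea described above can be illustrated with a small sketch. It is not GoRDF's actual code: the `element` type and the `collectNames` function are hypothetical names chosen for this example. Each independent child subtree is handled in its own goroutine and writes into its own result slot, so the merged output is the same as that of a sequential walk.

```go
package main

import (
	"fmt"
	"sync"
)

// element is a hypothetical stand-in for one parsed XML tag.
type element struct {
	name     string
	children []*element
}

// collectNames returns the tag names of a subtree in document order.
// Each child subtree is processed in its own goroutine and writes into
// its own slot, so the merged result matches a sequential walk.
func collectNames(e *element) []string {
	if e == nil {
		return nil
	}
	results := make([][]string, len(e.children))
	var wg sync.WaitGroup
	for i, child := range e.children {
		wg.Add(1)
		go func(i int, child *element) {
			defer wg.Done()
			results[i] = collectNames(child)
		}(i, child)
	}
	wg.Wait()

	names := []string{e.name}
	for _, r := range results {
		names = append(names, r...)
	}
	return names
}

func main() {
	root := &element{name: "rdf:RDF", children: []*element{
		{name: "spdx:SpdxDocument", children: []*element{{name: "spdx:name"}}},
		{name: "spdx:Package"},
	}}
	fmt.Println(collectNames(root))
}
```

Giving every goroutine its own slot in `results` avoids both locking and any dependence on scheduling order, which is one simple way to keep concurrent output identical to the linear version.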
The module was written and developed in two phases:
- XML Reader: Parses the XML structure of the file and returns a rootBlock if the input XML file is a valid RDF document; otherwise it reports an error.
- RDF Parser: Uses the rootBlock produced by the previous phase to generate RDF triples.
- An RDF file can be represented in many data formats, such as XML, JSON, N-Triples, YAML, etc.
- The two-phase design allows the code to be switched easily among these data formats without rewriting the entire code.
- To add support for a new representation of RDF, a programmer only has to parse the given file structure and store it in the block format; the code that generates RDF triples from the blocks does not need to be rewritten.
- Easier Testing and Debugging.
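The benefit of the two-phase split can be sketched as follows. This is an illustration under assumptions, not the library's API: `Block`, `Reader`, `Triple`, `keyValueReader`, `toTriples` and `load` are all hypothetical names, and the toy "name=value" input format merely stands in for any non-XML representation. The point is that phase two depends only on the block structure, so swapping the reader leaves it untouched.

```go
package main

import (
	"fmt"
	"strings"
)

// Block is a simplified, hypothetical intermediate structure.
type Block struct {
	Name     string
	Value    string
	Children []*Block
}

// Reader is phase one: anything that can turn an input format into a root Block.
type Reader interface {
	Read() (*Block, error)
}

// Triple is a simplified (subject, predicate, object) record.
type Triple struct {
	Subject, Predicate, Object string
}

// keyValueReader is a toy phase-one reader for "name=value" lines.
type keyValueReader struct {
	input string
}

func (r keyValueReader) Read() (*Block, error) {
	root := &Block{Name: "root"}
	for _, line := range strings.Split(r.input, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		parts := strings.SplitN(line, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("malformed line %q", line)
		}
		root.Children = append(root.Children, &Block{Name: parts[0], Value: parts[1]})
	}
	return root, nil
}

// toTriples is phase two: it only sees Blocks, never the input format.
func toTriples(root *Block) []Triple {
	var triples []Triple
	for _, child := range root.Children {
		triples = append(triples, Triple{root.Name, child.Name, child.Value})
	}
	return triples
}

// load wires the two phases together.
func load(r Reader) ([]Triple, error) {
	root, err := r.Read()
	if err != nil {
		return nil, err
	}
	return toTriples(root), nil
}

func main() {
	triples, err := load(keyValueReader{input: "name=demo\nversion=1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range triples {
		fmt.Println(t.Subject, t.Predicate, t.Object)
	}
}
```

Supporting another input format in this sketch means writing one more type that satisfies `Reader`; `toTriples` and `load` stay as they are, and each phase can be tested on its own.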
github.com/spdx/tools-golang/gordf/rdfloader
This is the first phase of the RDF loader provided by the GoRDF module. It provides an interface for reading the XML file and returning a rootBlock, or an error if one is encountered while parsing the XML structure. The XML Reader is a dependency of the RDF Loader. The two-phase structure of GoRDF lets programmers easily replace the XML Reader with a reader for any other representation of RDF.
github.com/spdx/gordf/rdfloader/xmlreader
Invocation Method: XMLReader.Read()
XMLReader Obtained from:
Read function returns a RootBlock:
RootBlock | -> | Block |
Block | -> | OpeningTag Children BlockValue| BLOCK |
OpeningTag | -> | Tag |
Tag | -> | SchemaName Name Attributes | TAG |
Attributes | -> | SingleAttribute Attributes | ϵ |
SingleAttribute | -> | Name SchemaName AttributeValue | ATTRIBUTE |
Name | -> | STRING |
SchemaName | -> | STRING |
AttributeValue | -> | URI_STRING |
Children | -> | Blocks |
Blocks | -> | Block Blocks | ϵ |
BlockValue | -> | STRING |
RootBlock
└───Block
    ├───BlockValue
    │   └───STRING
    ├───Children
    │   └───Blocks
    │       ├───Block
    │       └───ϵ
    └───OpeningTag
        ├───Attributes
        │   ├───Attributes
        │   ├───SingleAttribute
        │   │   ├───AttributeValue
        │   │   │   └───URI_STRING
        │   │   ├───Name
        │   │   │   └───STRING
        │   │   └───SchemaName
        │   │       └───STRING
        │   └───ϵ
        ├───Name
        │   └───STRING
        └───SchemaName
            └───STRING
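The RootBlock grammar maps naturally onto a few Go structs. The sketch below is one possible rendering of that grammar, not the definitions used in the xmlreader package: the type names, field names and the `render` helper are assumptions made for illustration, and the real types may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Attribute mirrors SingleAttribute -> Name SchemaName AttributeValue.
type Attribute struct {
	Name, SchemaName, Value string
}

// Tag mirrors Tag -> SchemaName Name Attributes.
type Tag struct {
	SchemaName, Name string
	Attributes       []Attribute
}

// Block mirrors Block -> OpeningTag Children BlockValue.
type Block struct {
	OpeningTag Tag
	Children   []*Block
	Value      string
}

// writeBlock writes one line per block, indented by its depth in the tree.
func writeBlock(sb *strings.Builder, b *Block, depth int) {
	sb.WriteString(strings.Repeat("  ", depth))
	sb.WriteString(b.OpeningTag.SchemaName + ":" + b.OpeningTag.Name)
	for _, a := range b.OpeningTag.Attributes {
		fmt.Fprintf(sb, " %s:%s=%q", a.SchemaName, a.Name, a.Value)
	}
	if b.Value != "" {
		sb.WriteString(" -> " + b.Value)
	}
	sb.WriteString("\n")
	for _, child := range b.Children {
		writeBlock(sb, child, depth+1)
	}
}

// render returns a textual outline of the block tree.
func render(root *Block) string {
	var sb strings.Builder
	writeBlock(&sb, root, 0)
	return sb.String()
}

func main() {
	root := &Block{
		OpeningTag: Tag{SchemaName: "rdf", Name: "RDF"},
		Children: []*Block{{
			OpeningTag: Tag{
				SchemaName: "spdx",
				Name:       "Package",
				Attributes: []Attribute{{Name: "about", SchemaName: "rdf", Value: "http://example.com/pkg"}},
			},
			Children: []*Block{{
				OpeningTag: Tag{SchemaName: "spdx", Name: "name"},
				Value:      "demo",
			}},
		}},
	}
	fmt.Print(render(root))
}
```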
Example For XML Reader: 1-xmlreader
- 3e71687: Initial version of XMLReader
- d8bd2d8: Example For XML-Reader
- c428e8a: New Package for dealing with URIs
- d3bb55e: Splitted xmlReader into Utils and Reader for modularity
- aee9bfa: Wrote Tests for XML-Reader utils
This is the second phase of the RDF loader provided by the GoRDF module. It provides an interface for generating triples from the rootBlock, or returning an error if one is encountered. The RDF Parser is a dependency of the RDF Loader. It does not depend on how the previous phase is implemented: users can replace phase one, provided it supplies the data structures this phase requires for parsing.
github.com/spdx/gordf/rdfloader/rdfparser
Invocation Method: Parse( RootBlock )
The Parse function returns an error, if any, and populates the parser with the SchemaDefinition and Triples.
SchemaDefinition:
Map with Key and Value as Strings. Key represents the abbreviation and Value is the absolute URI.
For example: key="rdf", value="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
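As a rough illustration of how such a map can be used, the sketch below expands an abbreviated name like `rdf:type` into an absolute URI. The `expand` function is a hypothetical helper written for this example and is not part of the rdfparser package.

```go
package main

import (
	"fmt"
	"strings"
)

// expand turns "prefix:name" into an absolute URI using the schema map,
// where keys are abbreviations and values are absolute namespace URIs.
func expand(schema map[string]string, qname string) (string, error) {
	parts := strings.SplitN(qname, ":", 2)
	if len(parts) != 2 {
		return "", fmt.Errorf("%q is not of the form prefix:name", qname)
	}
	base, ok := schema[parts[0]]
	if !ok {
		return "", fmt.Errorf("unknown prefix %q", parts[0])
	}
	return base + parts[1], nil
}

func main() {
	schema := map[string]string{
		"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
	}
	uri, err := expand(schema, "rdf:type")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(uri)
}
```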
Grammar of Triples
Triples | -> | Triple Triples | ϵ |
Triple | -> | Subject Predicate Object | TRIPLE |
Subject | -> | Node |
Predicate | -> | Node |
Object | -> | Node |
Node | -> | NodeType ID | NODE |
NodeType | -> | LITERAL | RESOURCELITERAL | NODEIDLITERAL | BLANK | IRI |
ID | -> | STRING |
Triples
├───Triple
│   ├───Object
│   │   └───Node
│   │       ├───ID
│   │       └───NodeType
│   ├───Predicate
│   │   └───Node
│   │       ├───ID
│   │       └───NodeType
│   └───Subject
│       └───Node
│           ├───ID
│           └───NodeType
└───ϵ
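The grammar of triples can be modelled in Go roughly as shown below. The node type names follow the grammar, but the Go identifiers, the struct layout and the `objectsOf` helper are assumptions for illustration only; the actual definitions in the rdfparser package may differ.

```go
package main

import "fmt"

// NodeType enumerates the node kinds listed in the grammar.
type NodeType int

const (
	Literal NodeType = iota
	ResourceLiteral
	NodeIDLiteral
	Blank
	IRI
)

// Node mirrors Node -> NodeType ID.
type Node struct {
	Type NodeType
	ID   string
}

// Triple mirrors Triple -> Subject Predicate Object.
type Triple struct {
	Subject, Predicate, Object *Node
}

// objectsOf returns the object of every triple whose subject and
// predicate IDs match the given values.
func objectsOf(triples []*Triple, subjectID, predicateID string) []*Node {
	var objects []*Node
	for _, t := range triples {
		if t.Subject.ID == subjectID && t.Predicate.ID == predicateID {
			objects = append(objects, t.Object)
		}
	}
	return objects
}

func main() {
	pkg := &Node{Type: IRI, ID: "http://example.com/pkg"}
	name := &Node{Type: IRI, ID: "http://example.com/terms#name"}
	triples := []*Triple{
		{Subject: pkg, Predicate: name, Object: &Node{Type: Literal, ID: "demo"}},
	}
	for _, obj := range objectsOf(triples, pkg.ID, name.ID) {
		fmt.Println(obj.ID)
	}
}
```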
Example For RDF Parser: 2-rdfparser
- b70edda: Add RDF Parser
- dd3e455: Example For RDF Parser
- 3169b32: Tests For RDF Parser
- 348ba4c, 68d0d13: Support Concurrency
- 3998529: Tests And Bug Fixes
- 2595dfa: Allow Parser Triples to be a slice of Triples instead of a Map
- f7b9334: Add Support For CDATA Tags
- bfa2733: Closure For Terminal Tags
- 1a36ab0: Use String Instead Of Pointer As Keys For A Map
- 752599d: NodeToTriples Now returns a dynamic slice of unique triples
RDFWriter is the second major functionality provided by the GoRDF module. It provides an interface for serialising a list of triples into RDF/XML and writing the result to an io.Writer, returning an error if one is encountered.
github.com/spdx/gordf/rdfwriter
Invocation Methods: WriteToFile(writer, triples, schemaDefinition)
writer: io.Writer object to which the content will be written.
triples: List of unordered triples with the same structure as described in the Grammar of Triples.
Returns: an error, if any is encountered while serialising the triples into RDF/XML format.
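The call shape described above (an io.Writer, a list of triples and a schema definition in, an error out) can be sketched as follows. This is deliberately not an RDF/XML serialiser and not the rdfwriter implementation: the hypothetical `writeTriples` function emits one plain-text line per triple, and the simplified `Triple` type holds strings only. It is meant to show how the schema definition can be used to abbreviate URIs on output.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sort"
	"strings"
)

// Triple is a simplified triple whose parts are plain strings.
type Triple struct {
	Subject, Predicate, Object string
}

// abbreviate replaces a known namespace URI with its abbreviation.
// Abbreviations are tried in sorted order so the output is deterministic.
func abbreviate(uri string, schema map[string]string) string {
	prefixes := make([]string, 0, len(schema))
	for p := range schema {
		prefixes = append(prefixes, p)
	}
	sort.Strings(prefixes)
	for _, p := range prefixes {
		ns := schema[p]
		if ns != "" && strings.HasPrefix(uri, ns) {
			return p + ":" + strings.TrimPrefix(uri, ns)
		}
	}
	return uri
}

// writeTriples writes one line per triple and stops at the first write error.
func writeTriples(w io.Writer, triples []Triple, schema map[string]string) error {
	for _, t := range triples {
		_, err := fmt.Fprintf(w, "%s %s %s\n",
			abbreviate(t.Subject, schema),
			abbreviate(t.Predicate, schema),
			abbreviate(t.Object, schema))
		if err != nil {
			return err
		}
	}
	return nil
}

// render collects the output of writeTriples into a string.
func render(triples []Triple, schema map[string]string) (string, error) {
	var buf bytes.Buffer
	if err := writeTriples(&buf, triples, schema); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	schema := map[string]string{
		"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
		"ex":  "http://example.com/",
	}
	triples := []Triple{
		{"http://example.com/pkg", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "http://example.com/Package"},
		{"http://example.com/pkg", "http://example.com/name", "demo"},
	}
	out, err := render(triples, schema)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Accepting an io.Writer rather than a file name lets the same function write to a file, a network connection or an in-memory buffer, which also makes it straightforward to test.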
Example For RDF Writer: 4-rdfwriter
- 503dea6: Biggest Update wrt RDF Writer. Adds Topological Sorting, Utils for RDF Writer and, Tests For the same. Almost everything of the RDF writer is added in this commit.
- f9e43c5: Add Support For Default NameSpaces
- 1a36ab0: Map of nodeToTriples now use strings as the keys instead of pointers providing a better and sturdy querying.
- tools-golang: for more info, read "Application Of GoRDF (tools-golang)"
tools-golang is a collection of Go packages intended to make it easier for Go
programs to work with SPDX® files.
As of 27th August 2020, the repository provides the following functionalities:
- Tag Value Loader
- Tag Value Saver
- SPDX Document builder
- Compare Licenses
For more examples and use cases, refer to the examples.
The main branch doesn't provide functionality for loading SPDX files from RDF or
writing them to RDF format. There's a new branch called
gordf which attempts to
add this support to the library.
Currently, the gordf branch doesn't allow writing an SPDX document to RDF format, but
it allows users to load their RDF files into an SPDX Document or validate them.
- RDFLoader
github.com/spdx/tools-golang/gordf/rdfloader
Invocation Method: Load2_2(reader), where reader is an io.Reader object from which the RDF file content will be read.
The Load2_2 function returns a spdxDocument and an error, if any.