LDTR

A Linked Data (as in RDF) Transcriber.

Try out the Demo Web Application for visualization, format conversion and light editing.

Install the NPM Package.

Check out the Source Code on GitHub.


About

LDTR turns various representations of RDF into JSON-LD.

LDTR works by transcribing the input syntax verbatim into a valid JSON-LD structure that represents the same RDF. Use the output of this tool directly only when the input is strictly under your control. For all other cases, LDTR includes a JSON-LD expansion implementation that turns the data into a normalized, fully predictable RDF data representation. Use that if you want to use LDTR for general RDF data processing.

This tool strives to be usable in the browser, on the command line, and on the server (mainly Node.js). It is built using modern JS and ES modules, along with some minimal provisioning for different runtimes.

Input Formats

Output Forms and Formats

For flexible compact JSON-LD, you can, and often should, also pass the result through a JSON-LD processor (such as jsonld.js) in order to have control over the shapes and terms of the results.

While primarily designed to produce a native data structure representable as JSON-LD, LDTR also includes serializers for:

  • TriG
  • RDF/XML (including named graphs and RDF-star annotations)

These work similarly to the parsers, by transcribing the JSON-LD as directly as possible. This "shortcut" imposes some restrictions on the data, depending on which JSON-LD compaction features the output format can faithfully transcribe. Regular "URI-to-PName" term compaction is guaranteed to work (this is the form of compact JSON-LD that the parsers output, representing a kind of common intersection between these formats).
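To make "URI-to-PName" term compaction concrete, here is a minimal sketch of the idea. The `compactIri` function and the `prefixes` map are hypothetical names introduced for illustration; this is not LDTR's API or its actual implementation.

```javascript
// Hypothetical helper for illustration only; not part of LDTR.
// Shortens a full IRI to a prefixed name (PName) when a declared
// namespace matches and the remainder is a simple local name.
function compactIri(iri, prefixes) {
  let best = null
  for (const [prefix, ns] of Object.entries(prefixes)) {
    // Prefer the longest matching namespace.
    if (iri.startsWith(ns) && (best === null || ns.length > best.ns.length)) {
      best = { prefix, ns }
    }
  }
  if (best === null) return iri
  const local = iri.slice(best.ns.length)
  // Keep the full IRI if the local part is empty or not a simple name.
  return /^[A-Za-z_][\w-]*$/.test(local) ? `${best.prefix}:${local}` : iri
}

const prefixes = {
  rdfs: 'http://www.w3.org/2000/01/rdf-schema#',
  schema: 'http://schema.org/',
}

console.log(compactIri('http://schema.org/name', prefixes)) // schema:name
console.log(compactIri('http://example.org/other', prefixes)) // left unchanged
```

The local-name check here is deliberately stricter than the PName grammars of Turtle and TriG; it only shows why this simple form of compaction maps cleanly between the formats.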

Install

$ npm install ldtr

Command Line Usage

Examples:

$ ldtr RDF_FILE_OR_URL

$ cat TURTLE_FILE | ldtr

$ cat RDFA_FILE | ldtr -t html

$ ldtr RDF_FILE_OR_URL -o trig

CLI options:

$ ldtr -h

Usage: ldtr [options] [arguments]

Options:
  -t, --type TYPE               Media type or file suffix
  -b, --base BASE               Base URL if different from input URL
  -e, --expand                  Expand JSON-LD
  -i, --index                   Index on keys, types and reverses
  -p, --pattern                 Use RDFa pattern copying
  -o, --output OUTPUT           Media type or file suffix
      --max-redirects NUMBER
  -v, --verbose
  -h, --help

Library Usage

Use the top-level interface, read and write, with input data and, optionally, a media type if it isn't "obvious" (e.g. a DOM Document, a URL or a file with a common suffix).
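As a rough illustration of what guessing a type from a "common suffix" can look like, here is a small sketch. The `SUFFIX_TYPES` table and `guessMediaType` function are assumptions made for this example; LDTR's own lookup may cover other suffixes and behave differently.

```javascript
// Hypothetical lookup table; LDTR's actual mapping may differ.
const SUFFIX_TYPES = new Map([
  ['ttl', 'text/turtle'],
  ['trig', 'application/trig'],
  ['jsonld', 'application/ld+json'],
  ['rdf', 'application/rdf+xml'],
  ['html', 'text/html'],
])

// Returns a media type for a path or URL, or undefined when the
// suffix is missing or unknown (the caller must then supply a type).
function guessMediaType(pathOrUrl) {
  // Drop any query string or fragment before looking at the suffix.
  const clean = pathOrUrl.split(/[?#]/)[0]
  const match = /\.([A-Za-z0-9]+)$/.exec(clean)
  if (!match) return undefined
  return SUFFIX_TYPES.get(match[1].toLowerCase())
}

console.log(guessMediaType('some-data.trig')) // application/trig
console.log(guessMediaType('http://www.w3.org/1999/02/22-rdf-syntax-ns')) // undefined
```

When a suffix gives no answer, as with the second call, the type has to come from elsewhere, such as an explicit argument or a response content-type.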

For text-based formats, the input is expected to be a regular string. For XML- and HTML-based formats, the input can also be a DOM Document. (Any W3C XML DOM Level 2 Core compliant DOMParser and XMLSerializer will do.)

In a browser, you can use the internals by themselves. See the demo web application for an example of how.

Parsing:

import * as ldtr from 'ldtr'

let data

// Guess type by suffix
data = await ldtr.read('some-data.trig')

// Supply file path and type
data = await ldtr.read('some-data.trig', 'application/trig')

// Supply URL and use response content-type
data = await ldtr.read('http://www.w3.org/1999/02/22-rdf-syntax-ns')

// Supply URL and type
data = await ldtr.read('http://example.org', 'application/trig')

// Supply data and type
data = await ldtr.read({ data: '<a> :b "c" .', type: 'text/turtle' })

// Parse RDF/XML from a DOM Document
// (rdfStr is assumed to hold the source markup as a string)
let doc = new DOMParser().parseFromString(rdfStr, 'text/xml')
data = await ldtr.read({data: doc})

// Parse RDFa from a DOM Document
doc = new DOMParser().parseFromString(rdfStr, 'text/html')
data = await ldtr.read({data: doc})

Internals

The TriG parser is generated from a grammar file (based on the TriG W3C EBNF Grammar) using PEG.js.

By default on Node (e.g. when using the CLI) LDTR uses xmldom for HTML and XML parsing.

(Caveat: Internal XML entity declarations are not handled by xmldom yet.)

Rationale

RDF is about meaning, not structure. Of course, meaning is always – indirectly but intrinsically – conveyed by structure. And if the structure is yours to begin with, you can leverage its shape directly for speed and convenience. As such, the practice of using JSON-LD as plain JSON is a bit like using C. Very effective and close to the metal, but rather dangerous if you don't know what you're doing.

To a certain extent, this tool can be used as a teaching aid, for showing the isomorphisms of different RDF serializations. Note that prefix mechanisms (QNames/CURIEs/PNames) are basically only useful in RDF syntaxes for humans, when reading and writing data directly.

Crucially, they are not intended to be handled directly (syntactically, from the source) in code. Thus, by producing a JSON-LD compliant semi-compact transcript like LDTR does, consumers who are unaware of what the tokens really mean (in RDF) may be misled to consider them fixed and atomic, instead of the locally defined shorthand forms they really are. This is why this form can only be trusted when you are in control of the source data. When you are, however, the compact form can keep both data and code fairly succinct, which may be of benefit to certain applications. You do trade away general RDF processing by doing so though. It's a matter of tradeoffs.
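The point about locally defined shorthand can be shown with a small sketch. The `expandTerm` helper and the two sample documents below are made up for illustration; for real data, use LDTR's expansion or another JSON-LD processor rather than this simplified function.

```javascript
// Hypothetical helper for illustration only; a simplified stand-in
// for real JSON-LD expansion. Resolves "prefix:local" against the
// prefixes declared in a context object.
function expandTerm(term, context) {
  const i = term.indexOf(':')
  if (i === -1) return term
  const ns = context[term.slice(0, i)]
  // Leave the term alone if the prefix is not declared as a string.
  return typeof ns === 'string' ? ns + term.slice(i + 1) : term
}

// Two documents using the same token with different prefix bindings.
const docA = {
  '@context': { dc: 'http://purl.org/dc/terms/' },
  'dc:title': 'A',
}
const docB = {
  '@context': { dc: 'http://purl.org/dc/elements/1.1/' },
  'dc:title': 'B',
}

const keyA = expandTerm('dc:title', docA['@context'])
const keyB = expandTerm('dc:title', docB['@context'])

console.log(keyA) // http://purl.org/dc/terms/title
console.log(keyB) // http://purl.org/dc/elements/1.1/title
console.log(keyA === keyB) // false
```

Code that reads `doc['dc:title']` directly would treat both documents alike, even though the two keys identify different properties once expanded.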