-
Notifications
You must be signed in to change notification settings - Fork 643
Egg #3: Structured String Literals
In many programming languages, there is often a dilemma when working with structured data types such as XML, JSON or S-Expressions. Usually, the programmer could either use the native syntax of the particular language in the form of strings, lacking the expected type safety; or the programmer could use a type-safe abstract representation, loosing much of the readability. Structured string literals seek to solve this problem by providing a type-safe compile-time of parsing structured data, while still providing the usual safety guarantees one could expect in a staticly-typed language.
- String Interpolation in Scala
- User-defined Literals in C++
- Quasiquotation in Haskell
- Type-specific languages in Wyvern
The basic idea for creating a structured string literal parser, should be to implement a type class akin to the following:
class Parse t where
parse : String -> Result (Description t)
where the relevant data types are:
data Result a = Success a | Error String
data Description : Type -> Type where
Escape : String -> Description t
Ap : Description (a -> t) -> Description a -> Description t
Pure : t -> Description t
Essentially the Description
data type is a defunctionalisation of applicative functors/idiom brackets, but with the additional property that some parts of the expression might be escaped Idris expressions of the relevant data type.
Since structured string literal parsers are usually defined in modules, using structured string literals is as simple as importing the relevant module.
Syntactically, structured data literals would start with a sigil @
followed by an optional Idris identifier and end with a string literal. The structured string literal parser author is free to choose whatever representation for escaped expressions the author desires (although a recommendation should be made to ensure consistency).
Examples of literals:
xmlPerson : String -> Nat -> Xml
xmlPerson name age = @"""<person>
<name>$name</name>
<age>$age</age>
</person>
"""
prettiedPerson : String -> Nat -> String
prettiedPerson name age = @format"name: %s=name\t %d=age"
Semantically, structured string literal resolution would for a given target type t
find the relevant Parse instance, parse the content and then try to reify the structure to t
.
If an identifier was provided, then the relevant instance would be the named instance that has the identifier as name, e.g. the format
structured string literal would correspond to instance [format] Parse String where ...
.
JSON is a popular language for representing structured data. An example of a JSON document containing a persons name, age and a list of hobbies would look something like the following:
{
"name" : "John Doe",
"age" : 42,
"hobbies" : [ { "type" : "Capoeira", "club" : "São Paulo Hobbyist Capoeira Club"},
{ "type" : "Go", "club" : "International Go Club - Los Angeles" } ]
}
However this notation is not available in Idris, and therefore the corresponding structure must be represented as abstract syntax internally. A translation of the above document to abstracts could look something like the following:
JsonObject([
("name", JsonString "John Doe"),
("age", JsonInt 42),
("hobbies",
JsonArray [ JsonObject [("type", JsonString "Capoeira"), ("club", JsonString "São Paulo Hobbyist Capoeira Club")],
JsonObject [("type", JsonString "Go"), ("club", JsonString "International Go Club - Los Angeles") ]])
])
While the abstract representation provides a good approximation of the concrete syntax, its readability is somewhat reduced due to verbosity. Furthermore, the programmer is required to map between the different representations which might introduce some unforeseen bugs.
With structured string literals, the programmer can have the readability of the concrete syntax with the type safety of the abstract syntax. For example:
jdDoc : Json
jdDoc = @"""
{
"name" : "John Doe",
"age" : 42,
"hobbies" : [ { "type" : "Capoeira", "club" : "São Paulo Hobbyist Capoeira Club"},
{ "type" : "Go", "club" : "International Go Club - Los Angeles" } ]
}
"""
In this case the compiler (with support for structured string literals) would automatically parse the literal to JSON and check that it is well structured.
helloWorld = """Hello
World"""
now works.
Binary Packages
Tool Support
Community
- Libraries, available elsewhere
- Idris Developer Meetings
- The Zen of Idris
- Non English Resources
Development