l10n data: *.lol

The Mozilla l20n project aims to supersede l10n formats like Gettext, mostly by allowing C-style macros in l10n resources. An implementation is in progress: https://github.com/zbraniecki/l20n

The main issue comes from the 'LOL' format, which is specific to the l20n project. Here’s a proposal to make the 'LOL' format less weird and more extensible.

'LOL' Format?

Example: test.lol

The 'LOL' format is a mix of:

 - C-style comments and logical expressions
 - JSON-like values: string | array | list
 - DTD-like entities (ugh!)

Entities:

<identifier[index]? content? attributes?>
 
 - identifier = key string + optional [index] list
 - content    = value (generally: HTML element content)
 - attributes = « key: value » sequence
 
where:
 - each index is a C-style logical expression
 - each value is a JSON-like value: string | array | list

Macros:

<identifier(params) {expression}>

Comments:

/* C-style comment */

Identifiers, values, expressions and comments make perfect sense, but our project requires a nestable format for entities instead of the 'LOL' one.

Entity Groups

By design, 'LOL' entities are not nestable. This gets in the developer’s way when dealing with more complex structures than just HTML elements.

For our project, we have to group entities by language and by media queries; this would be straightforward with a JSON/YAML-like variant (see the lang.json and lang.yaml files in this directory), but is impossible with the current 'LOL' syntax.

More generally, there’s no reason why an l10n format should be specific to HTML or XML, or why it should restrict which data structures can be translated: a template engine or UI toolkit library might benefit from nested entities.

'LOL' entities are written like this: <ID content? attributes?>. Grouping and nesting entities would be straightforward if entities used the ID: value form, which is common to web-related languages (CSS, JSON, YAML…).
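
To illustrate why the ID: value form nests naturally, here is a minimal Python sketch. The grouping layout (language, then media query, then entity ID), the "all" fallback group, the sample strings and the lookup helper are all assumptions made for this example; they are not taken from lang.json, lang.yaml or the l20n implementation.

```python
# Hypothetical nested l10n data: entities grouped by language, then by
# media query. Keys and strings are invented for illustration only.
L10N = {
    "en": {
        "all": {"title": "Hello, world!"},
        "screen and (max-width: 480px)": {"title": "Hello!"},
    },
    "fr": {
        "all": {"title": "Bonjour, le monde !"},
    },
}


def lookup(data, lang, media, entity_id):
    """Return the entity for (lang, media), falling back to the "all" group."""
    groups = data[lang]
    for group in (media, "all"):
        entity = groups.get(group, {}).get(entity_id)
        if entity is not None:
            return entity
    raise KeyError(entity_id)
```

With this layout, lookup(L10N, "fr", "screen and (max-width: 480px)", "title") falls back to the French "all" group, since no French entry exists for that media query. The flat <ID content? attributes?> form has no place to express such groups.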

The « identifier: value » Paradigm

In order to use ID: value instead of <ID content? attributes?> for l10n entities, we have to express the content and attributes as a single object. As an example:

<test
    "hello, world!"
    style: "color: red;"
    title: "failed">

can be written in a JSON-like form like this:

test: {
    &textContent: "hello, world!",
    .style: "color: red;",
    .title: "failed"
}

…which is what we do in our *.intl format proposal: attributes are prefixed with . (like in 'LOL' expressions), properties are prefixed with &.
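
The mapping from a 'LOL' entity to a single object can be sketched in a few lines of Python. The function name entity_to_object and its parameters are hypothetical; the sketch only assumes the prefix convention described above (& for properties, . for attributes) and defaults the content to textContent.

```python
def entity_to_object(content, attributes, prop="textContent"):
    """Merge an entity's content and attributes into one prefixed object.

    `prop` names the property that receives the content; defaulting to
    textContent is an assumption of this sketch.
    """
    obj = {}
    if content is not None:
        obj["&" + prop] = content       # property, '&' prefix
    for name, value in attributes.items():
        obj["." + name] = value         # attribute, '.' prefix
    return obj


# The <test ...> entity from the example above.
test = entity_to_object(
    "hello, world!",
    {"style": "color: red;", "title": "failed"},
)
```

Here test holds the same three keys as the hand-written object: &textContent, .style and .title.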

In 'LOL' entities, the value is assigned to the element’s innerHTML property, which raises performance and security issues. The more generic & prefix is more verbose but more precise: it states explicitly whether the value should be applied to innerHTML or textContent (the latter being much safer).
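
A consumer of the prefixed form could dispatch on the first character of each key. The Python sketch below uses a FakeElement class as a stand-in for a DOM element; both that class and apply_entity are hypothetical, and rejecting unprefixed keys is a choice made for this example only (a real implementation would probably treat them as nested entities).

```python
class FakeElement:
    """Minimal stand-in for a DOM element; not a real DOM API."""

    def __init__(self):
        self.properties = {}   # e.g. textContent, innerHTML
        self.attributes = {}   # e.g. style, title


def apply_entity(element, entity):
    """Apply '&'-prefixed keys as properties and '.'-prefixed keys as attributes."""
    for key, value in entity.items():
        if key.startswith("&"):
            element.properties[key[1:]] = value
        elif key.startswith("."):
            element.attributes[key[1:]] = value
        else:
            # Assumption of this sketch: unprefixed keys are not handled here.
            raise ValueError(f"unprefixed key: {key!r}")


el = FakeElement()
apply_entity(el, {"&textContent": "hello, world!", ".title": "failed"})
```

Because the target property is named in the key, the translator or developer chooses textContent by default and opts into innerHTML only where markup is really needed.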

The main point of this ID: value approach is that it is more generic and allows entities to be grouped and nested, as our project requires. It also allows a much more consistent and conventional syntax.

Why bother?

The l20n project brings a generic solution (expressions) to handle complex grammatical rules, which is great. However, it relies on a project-specific, unconventional syntax, which feels wrong and unnecessary. As an example, here’s what the same data looks like…

By design, 'LOL' entities are not nestable. This gets in the developer’s way when dealing with structures more complex than plain HTML elements. For our project, we have to group entities by language and by media queries; this would be straightforward with a JSON/YAML-like variant (see the lang.json and lang.yaml examples), but is impossible with the current 'LOL' syntax.

There’s a point where a high WTF-factor becomes a technical issue. We believe we’re beyond that point.
