diff --git a/build/doc.sh b/build/doc.sh
index a36bc8fb0..961c59684 100755
--- a/build/doc.sh
+++ b/build/doc.sh
@@ -101,6 +101,7 @@ readonly MARKDOWN_DOCS=(
   qsn
   qtt
   j8-notation
+  htm8
   # Protocol
   pretty-printing
   stream-table-process
diff --git a/doc/htm8.md b/doc/htm8.md
new file mode 100644
index 000000000..8e696e445
--- /dev/null
+++ b/doc/htm8.md
@@ -0,0 +1,164 @@
+---
+in_progress: yes
+default_highlighter: oils-sh
+---
+
+HTM8 - Efficient HTML with Errors
+=================================
+
+- Syntax Errors: It's a Subset
+- Efficient
+  - Easy to Remember
+  - Easy to Implement
+  - Runs Efficiently - you don't have to materialize a big DOM tree, which
+    causes many allocations
+
+
+
+## Basic Structure
+
+### Text Content
+
+Anything except `&` and `<`.
+
+These must be `&amp;` and `&lt;`.
+
+`>` is allowed, or you can escape it with `&gt;`.
+
+### 3 Kinds of Character Code
+
+1. `&amp;` - named
+1. `&#999;` - decimal
+1. `&#xff;` - hex
+
+### 3 Kinds of Tag
+
+1. Start
+1. End
+1. StartEnd
+
+### 2 Kinds of Attribute
+
+1. Unquoted
+1. Quoted
+
+### 2 Kinds of Comment
+
+1. `<!-- -->`
+1. `<? ?>` (XML processing instruction)
+
+
+## Special Rules, From HTML
+
+### 2 Tags Cause Special Lexing
+
+- `` in a string literal?
+- Foreign ` ` - XML rules
+
+## TODO
+
+- `` and `` are foreign XML content? Doh
+  - So I can just switch to XML mode in that case
+  - TODO: we need a test corpus for this!
+  - maybe look for wikipedia content
+- can we also just disallow these? Can you make these into external XML files?
+
+This is one way:
+
+
+
+
+Then we don't need special parsing?
+
diff --git a/doc/ysh-doc-processing.md b/doc/ysh-doc-processing.md
index 0091368c7..40e974cfa 100644
--- a/doc/ysh-doc-processing.md
+++ b/doc/ysh-doc-processing.md
@@ -134,22 +134,7 @@ Safe HTML subset
 If you want to take user HTML, then you first use an HTML5 -> HT8 converter.
-## Algorithms
-
-### Emitting HX8 as HTML5
-
-Just emit it! This always work.
-
-### Converting HX8 to XML
-
-- Always quote all attributes
-- Always quote `>` - are we alloxing this in HX8?
-- Do something with `