[doc] Polish doc on command vs. expression mode
Thanks to ThinkChaos for feedback and pointing out typos.

Also fix a bug where we omitted code blocks that require Pygments.
Andy C committed Jan 10, 2025
1 parent 986fbea commit 11a83e8
Showing 3 changed files with 176 additions and 36 deletions.
85 changes: 66 additions & 19 deletions doc/command-vs-expression-mode.md
@@ -5,27 +5,76 @@ default_highlighter: oils-sh
Command vs. Expression Mode
===========================

This is an essential [syntactic concept](syntactic-concepts.html) in YSH.
[YSH][] extends the shell **command** language with a Python-like
**expression** language.

YSH extends the shell **command** language with a Python-like **expression**
language.
Commands and expressions each have a **lexer mode**, which is an essential
[syntactic concept](syntactic-concepts.html) in YSH.

To implement that, the lexer enters "expression mode".
This doc lists the places where [YSH][] switches between modes.

The key difference is that when lexing commands, `unquoted` is a string, while
`$dollar` is a variable:
[YSH]: $xref

ls /bin/str $myvar

On the other hand, when lexing expressions, `'quoted'` is a string, while
`unquoted` is a variable:
<div id="toc">
</div>

## Summary

A main difference is whether you write strings like `unquoted` or `'quoted'`,
and whether you write variables like `$dollar` or `unquoted`:

<style>
thead { text-align: left; }
table {
width: 100%;
margin-left: 2em; /* match */
}
</style>

<table>

- thead
- Description
- Lexing Mode
- String
- Variable
- Example
- tr
- Shell-Like
- Command
- `unquoted`
- `$dollar`
- ```
ls foo/bar $myvar
```
- tr
- Python-Like
- Expression
- `'quoted'`
- `unquoted`
- ```
var s = myfunc('str', myvar)
```
This doc lists the places where we switch modes.
</table>
More examples:
ls foo/bar # foo and bar are strings - command
var x = foo / bar # foo and bar are the names of variables - expression
And:
echo $filename.py # $filename is a var - command
var x = filename ++ '.py' # filename is a var - expression
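
The string/variable distinction in the table above can be illustrated with a toy classifier. This is only a sketch in Python, not the YSH lexer: the function name `classify_word` and the two mode names are hypothetical, and real lexing handles far more cases (double quotes, `${}`, operators, and so on).

```python
def classify_word(word, mode):
    """Classify a single word as 'string' or 'variable' under a lexer mode.

    Toy model of the rule described above, not the real YSH lexer.
    """
    if mode == "command":
        # Command mode: $dollar is a variable, anything unquoted is a string.
        return "variable" if word.startswith("$") else "string"
    if mode == "expression":
        # Expression mode: 'quoted' is a string, a bare name is a variable.
        if len(word) >= 2 and word[0] == "'" and word[-1] == "'":
            return "string"
        return "variable"
    raise ValueError("unknown mode: %r" % mode)


# The same bare word means different things in each mode.
for mode in ("command", "expression"):
    print(mode, classify_word("foo", mode))
```

The point of the sketch is that the word `foo` is classified differently depending only on the mode, which is why the places where the mode switches matter.
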
<!--
Shell has a similar difference:
ls foo/bar # foo and bar are strings
a=$(( foo/bar )) # foo and bar are the names of variables
-->
<div id="toc">
</div>
## From Command Mode to Expression Mode
@@ -46,7 +95,7 @@ This includes *bare assignments* in Hay blocks:
### `=` and `call` keywords
Likewise, everything after `=` or `::` is in expression mode:
Likewise, everything after `=` or `call` is in expression mode:
= 42 + f(x)
@@ -102,8 +151,8 @@ Lazy arguments:
Parameters aren't expressions, but they're parsed with the same lexer:
proc p(x, y) { # what's between () is in expression mode
echo "$x $y" # back to command mode
proc p (x, y) { # what's between () is in expression mode
echo "$x $y" # back to command mode
}
func f(x) {
@@ -128,7 +177,7 @@ This is a command literal:
var b = ^(echo $PWD)
## Examples
## More Examples
### How Are Glob Patterns Written in Each Mode?
@@ -167,5 +216,3 @@ syntax:
echo 'Python'
}


## vim: sw=2
98 changes: 95 additions & 3 deletions doc/table-object-doc.md
@@ -33,6 +33,12 @@ This is part of **maximal** YSH!
<div id="toc">
</div>

## Philosophy

- Oils is Exterior-First
- Tables, Objects, Documents - CSV, JSON, HTML
- Oils cleanup: TSV8, JSON8, HTM8

## Tables


@@ -60,12 +66,20 @@ This is part of **maximal** YSH!
- JSON Schema
- tr
- Document
- HTML5, XML
- HTML5
- DOM API like getElementById() <br/>
CSS selectors <br/>
XPath?
- JSX Templates
- XML Schema?
- ?
- tr
- Document
- XML
- XPath? XQuery?
- XSLT?
- three:
- DTD (document type definition, 1986)
- RelaxNG (2001)
- XML Schema aka XSD (2001)

<!-- TODO: ul-table should allow caption at the top -->
<caption>Existing</caption>
@@ -125,7 +139,85 @@ This is part of **maximal** YSH!
- MySQL: XML extraction functions only
- sqlite: none

## Design Issues

### Streaming

- jq has a line-based streaming model, by taking advantage of the fact that
all JSON can be encoded without literal newlines
- HTML/XML don't have this property
- Solution: Netstring based streaming?
- can do it for both JSON8 and HTM8 ?
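
A netstring frames each payload as `<length>:<bytes>,`, so a document may contain literal newlines without breaking the stream. The following is a minimal sketch of that framing in Python, assuming each document is already encoded as bytes; the function names are hypothetical and nothing here is an Oils API.

```python
def encode_netstring(payload: bytes) -> bytes:
    """Frame one payload as b'<length>:<payload>,'."""
    return b"%d:%s," % (len(payload), payload)


def decode_netstrings(data: bytes):
    """Yield each payload from a concatenation of netstrings."""
    pos = 0
    while pos < len(data):
        colon = data.index(b":", pos)    # end of the decimal length prefix
        length = int(data[pos:colon])
        start = colon + 1
        end = start + length
        if data[end:end + 1] != b",":
            raise ValueError("netstring at offset %d lacks trailing comma" % pos)
        yield data[start:end]
        pos = end + 1                    # skip the comma


# An HTML fragment with a literal newline and a JSON document share one stream.
docs = [b"<p>a\nb</p>", b'{"k": 1}']
stream = b"".join(encode_netstring(d) for d in docs)
print(list(decode_netstrings(stream)))
```

Because the reader knows each payload's length up front, it never has to scan the payload for a delimiter, which is what makes the scheme independent of the document format.
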

### Mutual Nesting

- JSON must be UTF-8, so JSON strings can contain JSON
- ditto for JSON8, and J8 strings
- TSV cells can't contain tabs or newlines
- so they can't contain TSV
- if you remove all the newlines, they can contain JSON
- TSV8 cells use J8 strings, so they can contain JSON, TSV
- HTM8
- you can escape everything, so you can put another HTM8 doc inside
- and you can put JSON/JSON8 or TSV/TSV8
- although are there whitespace rules?
- all nodes can be like `<pre>` nodes, preserving whitespace, until
- you apply another function to it
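
The JSON and TSV cases above can be checked with Python's standard `json` module. This is a sketch under the assumption that compact `json.dumps` output (no `indent`) is used, since that output escapes tabs and newlines inside strings; the helper names are hypothetical.

```python
import json


def nest_json(obj):
    """Put the JSON text of obj inside a string field of an outer JSON document."""
    inner_text = json.dumps(obj)
    return json.dumps({"payload": inner_text})


def json_tsv_cell(obj):
    """Serialize obj as JSON text that is safe to use as a plain TSV cell."""
    text = json.dumps(obj)  # compact form: tabs and newlines are escaped
    if "\t" in text or "\n" in text:
        raise ValueError("JSON text contains a literal tab or newline")
    return text


record = {"name": "a\tb", "note": "line1\nline2"}

# JSON inside a JSON string
outer = nest_json(record)

# JSON inside a TSV cell: the row still splits into exactly two cells
row = "\t".join(["id-1", json_tsv_cell(record)])
print(row)
```

The record deliberately contains a tab and a newline in its values, to show that they survive as escapes rather than as literal characters that would break the TSV row.
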

### HTML5 whitespace rules

- inside text context:
- multiple whitespace chars collapsed into a single one
- newlines converted to spaces
- leading and trailing space is preserved
- `<pre> <code> <textarea>`
- whitespace is preserved exactly as written
- I guess HTM8 could use another function for this?
- quoted attributes
- whitespace is untouched
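
The text-context rule can be sketched as a single regular-expression substitution. This is a simplified Python model of the notes above, not the full HTML5 or CSS algorithm; the function names are hypothetical, and the set of whitespace-preserving tags is copied from the list above as an assumption.

```python
import re

# Space, tab, LF, CR, form feed: the characters treated as whitespace here.
_WS_RUN = re.compile(r"[ \t\n\r\f]+")

# Assumption: taken from the list in these notes, not from a spec.
PRESERVING_TAGS = ("pre", "code", "textarea")


def collapse_text_whitespace(text):
    """Collapse each whitespace run (including newlines) to one space.

    Leading and trailing whitespace is collapsed but not stripped.
    """
    return _WS_RUN.sub(" ", text)


def render_text(text, tag=None):
    """Return text as it would be kept inside the given tag."""
    if tag in PRESERVING_TAGS:
        return text  # whitespace preserved exactly as written
    return collapse_text_whitespace(text)


print(repr(render_text("  a \n\n b  ")))
print(repr(render_text("a\n  b", tag="pre")))
```
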

## Related

- [stream-table-process.html](stream-table-process.html)
- [ysh-doc-processing.html](ysh-doc-processing.html)

## Notes

### RelaxNG, XSD, DTD

I didn't know there were these 3 schema types!

- DTD is older, associated with SGML created in 1986
- XML Schema and Relax NG created in 2001
- XML Schema uses XML syntax, which is bad!


### Algorithms?

- I looked at `jq`
- how do you do CSS selectors?
- how do you do JSONPath?

- XML Path
- holistic twig joins - bounded memory
- Hollandar Marx XPath Streaming


### Naming

- HTM8 doesn't use J8 strings
- but TSV8 does

- Technically we could add J8 strings with
- j''
- and even templated strings with $"" ?
- hm
- well then we would need $[ j'' ] and so forth

Is

- `<span x=j'foo'>` identical to `<span x="j'foo'">` in HTML5 ?
- it seems so
- ditto for `$""`
- then we could disallow those patterns in double quotes?
- they would have to be quoted like &sq; or something
29 changes: 15 additions & 14 deletions doctools/oils_doc.py
@@ -22,6 +22,11 @@
from doctools.util import log
from lazylex import html

try:
import pygments
except ImportError:
pygments = None


class _Abbrev(object):

@@ -300,22 +305,13 @@ def __init__(self, s, start_pos, end_pos, lang):
self.lang = lang

def PrintHighlighted(self, out):
try:
from pygments import lexers
from pygments import formatters
from pygments import highlight
except ImportError:
log("Warning: Couldn't import pygments, so skipping syntax highlighting"
)
return

# unescape before passing to pygments, which will escape
code = html.ToText(self.s, self.start_pos, self.end_pos)

lexer = lexers.get_lexer_by_name(self.lang)
formatter = formatters.HtmlFormatter()
lexer = pygments.lexers.get_lexer_by_name(self.lang)
formatter = pygments.formatters.HtmlFormatter()

highlighted = highlight(code, lexer, formatter)
highlighted = pygments.highlight(code, lexer, formatter)
out.Print(highlighted)


@@ -492,6 +488,11 @@ def HighlightCode(s, default_highlighter, debug_out=None):
out.SkipTo(slash_code_left)

else: # language-*: Use Pygments
if pygments is None:
log("Warning: Couldn't import pygments, so skipping syntax highlighting"
)
continue

# We REMOVE the original <pre><code> because
# Pygments gives you a <pre> already

@@ -503,8 +504,8 @@ def HighlightCode(s, default_highlighter, debug_out=None):
break
tag_lexer.Reset(slash_code_right, end_pos)
assert tok_id == html.EndTag, tok_id
assert tag_lexer.TagName(
) == 'pre', tag_lexer.TagName()
assert (tag_lexer.TagName() == 'pre'
), tag_lexer.TagName()
slash_pre_right = end_pos

out.PrintUntil(pre_start_pos)
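
The idea behind this change, keeping a code block when Pygments is unavailable instead of dropping it, can be sketched in isolation. This is a standalone illustration, not the code in `doctools/oils_doc.py`: the function `render_code_block` and its `use_pygments` flag are hypothetical, and the submodules are imported explicitly on the assumption that a bare `import pygments` may not make `pygments.lexers` and `pygments.formatters` available as attributes.

```python
import html

try:
    import pygments
    from pygments import formatters, lexers
except ImportError:
    pygments = None  # optional dependency: fall back to plain output


def render_code_block(code, lang, use_pygments=None):
    """Return HTML for a code block, highlighted only if Pygments is usable.

    use_pygments is a hypothetical override; by default it reflects
    whether the import above succeeded.
    """
    if use_pygments is None:
        use_pygments = pygments is not None

    if not use_pygments:
        # Keep the block, unhighlighted, rather than omitting it.
        return '<pre><code class="language-%s">%s</code></pre>' % (
            html.escape(lang), html.escape(code))

    lexer = lexers.get_lexer_by_name(lang)
    return pygments.highlight(code, lexer, formatters.HtmlFormatter())


print(render_code_block("x < 1", "python", use_pygments=False))
```

Checking availability before the original markup is discarded is the design choice that matters: the fallback path must still have something to print.
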