OSH Word Evaluation Algorithm
This page documents the word evaluation portion of the OSH implementation, which differs significantly from other shells.
- They tend to use a homogeneous tree with various flags (e.g. `nosplit`, `assignment`, etc.).
- OSH uses a typed, heterogeneous tree (now statically checked with MyPy).
For example, word_part = LiteralPart(...) | BracedVarSub(...) | CommandSub(...) | ...
https://github.com/oilshell/oil/blob/master/frontend/syntax.asdl#L107
(Specifying ML-like data structures with ASDL was an implementation style borrowed from CPython itself: see posts tagged #ASDL)
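As a rough illustration of what "typed, heterogeneous" means in practice, here is a minimal Python sketch. The class names come from the example above, but the fields (`text`, `name`, `command`) and the `describe` function are hypothetical simplifications; the real definitions are generated from `frontend/syntax.asdl`.

```python
from dataclasses import dataclass
from typing import List, Union

# Hypothetical, simplified stand-ins for the variants named above.
# The field names are assumptions made for this sketch only.

@dataclass
class LiteralPart:
    text: str

@dataclass
class BracedVarSub:
    name: str

@dataclass
class CommandSub:
    command: str

# The sum type: a word part is exactly one of these variants.
WordPart = Union[LiteralPart, BracedVarSub, CommandSub]

def describe(part: WordPart) -> str:
    """Dispatch on the variant's type instead of inspecting flags."""
    if isinstance(part, LiteralPart):
        return "literal %r" % part.text
    if isinstance(part, BracedVarSub):
        return "variable ${%s}" % part.name
    if isinstance(part, CommandSub):
        return "command $(%s)" % part.command
    raise AssertionError("unknown word part: %r" % (part,))

# Roughly the word dir=${HOME}$(pwd), as a list of typed parts.
word: List[WordPart] = [LiteralPart("dir="), BracedVarSub("HOME"), CommandSub("pwd")]
descriptions = [describe(p) for p in word]
```

Because each variant is its own class, a checker such as MyPy can verify that code handling a `CommandSub` never touches a field that only exists on `LiteralPart`.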
- As much parsing as possible is done in a single pass, with lexer modes.
- There are some subsequent tweaks for detecting assignments, tildes, etc.
- There is a "metaprogramming" pass for brace expansion:
i=0; {$((i++)),x,y}
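The point of calling brace expansion a "metaprogramming" pass is that it rewrites one word into several words before anything is evaluated. The sketch below shows that idea on plain strings. This is an assumption made for brevity: OSH works on the typed tree, not on strings, and `brace_expand` is a hypothetical name. Nested braces, `{1..3}` ranges, and quoting are deliberately not handled.

```python
import re
from typing import List

def brace_expand(word: str) -> List[str]:
    """Expand comma-separated brace groups, leftmost group first.

    Purely syntactic: substitutions such as $((i++)) are copied through
    unevaluated, so evaluation happens later, once per resulting word.
    """
    # Find the first {...} group that contains a comma and no nested braces.
    m = re.search(r"\{([^{}]*,[^{}]*)\}", word)
    if m is None:
        return [word]
    prefix, suffix = word[:m.start()], word[m.end():]
    out: List[str] = []
    for alt in m.group(1).split(","):
        # Recurse so that any later groups in the suffix are expanded too.
        out.extend(brace_expand(prefix + alt + suffix))
    return out

words = brace_expand("{$((i++)),x,y}")
```

Here `words` holds three separate words, the first of which is still the unevaluated text `$((i++))`.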
There are three stages (not four as in POSIX):
- Evaluation of the typed tree (using `osh/word_eval.py`).
  - There is a restricted variant of word evaluation for completion, e.g. so arbitrary processes aren't run when you hit TAB.
- Splitting with IFS. This is specified with a state machine in `osh/split.py`. (I think OSH is unique in this regard too.)
  - Splitting involves the concept of "frames", to handle things like `x='a b'; y='c d'; echo $x"${@}"$y`. The last part of `$x` has to be joined with `argv[0]`, and `argv[n-1]` has to be joined with `$y`.
- Globbing.
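The "frames" idea from the splitting stage can be sketched as follows. This is not the code in `osh/split.py`. It is a simplified model with hypothetical names (`make_frames`, `eval_frame`, `eval_word`), and it assumes IFS is whitespace only. A fragment is a `(text, quoted)` pair, and a Python list stands in for a quoted array such as `"${@}"`.

```python
from typing import List, Optional, Tuple, Union

Fragment = Tuple[str, bool]  # (text, quoted); unquoted text gets split

def make_frames(parts: List[Union[Fragment, List[str]]]) -> List[List[Fragment]]:
    """Group fragments into frames; each array element after the first starts a new frame."""
    frames: List[List[Fragment]] = [[]]
    for part in parts:
        if isinstance(part, list):  # stands in for a quoted array like "${@}"
            for i, elem in enumerate(part):
                if i > 0:
                    frames.append([])
                frames[-1].append((elem, True))
        else:
            frames[-1].append(part)
    return frames

def eval_frame(frame: List[Fragment]) -> List[str]:
    """Split unquoted fragments on whitespace and join pieces that touch."""
    args: List[str] = []
    current: Optional[str] = None  # the argument being built, if any
    for text, quoted in frame:
        if quoted:
            current = (current or "") + text
            continue
        # Leading whitespace ends the argument built so far.
        if text[:1].isspace() and current is not None:
            args.append(current)
            current = None
        for i, piece in enumerate(text.split()):
            if i > 0:
                args.append(current)
                current = None
            current = (current or "") + piece
        # Trailing whitespace ends the argument as well.
        if text[-1:].isspace() and current is not None:
            args.append(current)
            current = None
    if current is not None:
        args.append(current)
    return args

def eval_word(parts: List[Union[Fragment, List[str]]]) -> List[str]:
    args: List[str] = []
    for frame in make_frames(parts):
        args.extend(eval_frame(frame))
    return args

# x='a b'; y='c d'; set -- 1 2; echo $x"${@}"$y
result = eval_word([("a b", False), ["1", "2"], ("c d", False)])
```

In this model `result` is `['a', 'b1', '2c', 'd']`: the last piece of `$x` joins `argv[0]`, and `argv[n-1]` joins the first piece of `$y`, while the pieces in between stay separate arguments.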
There is no such thing as "quote removal" in OSH (e.g. any more than a Python or JavaScript interpreter has "quote removal"). It's just evaluation.
Bug: Internally, splitting and globbing both use `\` to inhibit expansion. That is, `\*` is an escaped glob, and `\ ` is an escaped space (IFS character).
This causes problems when `IFS='\'`. I think I could choose a different character for OSH, maybe even the `NUL` byte.
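To make the collision concrete, here is a toy model of in-band escaping. It is only an assumption about the general shape of the problem, not OSH's actual code: `escape` and `split` are hypothetical helpers, quoted text is protected by prefixing a marker character, and unquoted text is passed through as-is.

```python
from typing import List

def escape(text: str, ifs: str, marker: str) -> str:
    """Protect quoted text: prefix each IFS char, and the marker itself, with `marker`."""
    return "".join(marker + ch if (ch in ifs or ch == marker) else ch for ch in text)

def split(s: str, ifs: str, marker: str) -> List[str]:
    """Split on IFS characters, except those preceded by `marker`."""
    parts: List[str] = []
    cur = ""
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == marker and i + 1 < len(s):
            cur += s[i + 1]  # take the next char literally
            i += 2
        elif ch in ifs:
            parts.append(cur)
            cur = ""
            i += 1
        else:
            cur += ch
            i += 1
    parts.append(cur)
    return parts

# Normal case: IFS is a space, marker is a backslash. A quoted 'a b' survives.
ok = split(escape("a b", " ", "\\") + " c", " ", "\\")

# Collision: IFS is a backslash too, so the delimiter in the unquoted
# text a\b is misread as an escape and no split happens.
broken = split("a\\b", "\\", "\\")

# With NUL as the marker, the backslash is free to act as a delimiter.
fixed = split("a\\b", "\\", "\x00")
```

In this model `ok` is `['a b', 'c']`, `broken` is `['ab']` where `['a', 'b']` was expected, and `fixed` is `['a', 'b']`. A marker that cannot appear in shell strings, such as NUL, avoids the ambiguity.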