[osh] Fix ${a[@]:-}, ${a[*]:-}, etc. to depend on string value, not array (#2215)

This follows the behavioral change from Bash 4.4 to 5.0.

- Update doc/ref/chap-word-lang topics: op-bracket, and op-test
- [spec/var-op-test] Extend tests for `${@:-}`

---------

Co-authored-by: Andy C <[email protected]>
akinomyoga and Andy C authored Jan 3, 2025
1 parent cdee8ea commit c326e7c
Showing 5 changed files with 432 additions and 16 deletions.
116 changes: 112 additions & 4 deletions doc/ref/chap-word-lang.md
@@ -189,6 +189,82 @@ or second forms, it is interpreted as the third form of the variable name
surrounded by operators.


### op-bracket

The value within brackets is called an "index", and retrieves a value from an
array:

    ${A[i+1]}
    ${A['key']}

If `A` is an indexed array, the index is interpreted as an arithmetic
expression. Arithmetic evaluation is performed, and the value at that numeric
offset is retrieved.

If `A` is an associative array, the index is interpreted as a string. The
value associated with that string is retrieved.

If `A` is a string, it's treated as an indexed array with a single element, so
that `${A[0]}` is the same as `${A}`.

---

    ${A[*]}
    ${A[@]}

The index expressions `[*]` and `[@]` are special cases. Both generate a word
list of all elements in `A`.

When the variable substitution is **unquoted**, there's no difference between
`[*]` and `[@]`:

    $ A=(1 2 3)
    $ printf '<%s>\n' ${A[*]}
    <1>
    <2>
    <3>

    $ printf '<%s>\n' ${A[@]}
    <1>
    <2>
    <3>

When double-quoted, the `[*]` form joins the elements by the first character of
`IFS`:

    $ IFS=x
    $ printf '<%s>\n' "${A[*]}"
    <1x2x3>

When double-quoted, the `[@]` form generates a word list by splitting the word
at the boundary of every element in `A`:

    $ printf '<%s>\n' "-${A[@]}-"
    <-1>
    <2>
    <3->

If the container `A` has no elements, and the variable substitution has no
other parts, `[@]` evaluates to an empty word list:

    $ empty=()
    $ set -- "${empty[@]}"
    $ echo $#
    0
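
The double-quoted rules above can be modeled with a short Python sketch. This is only an illustration, not how OSH implements word evaluation: the function names `expand_star` and `expand_at` are hypothetical, and the model ignores the unquoted forms and the case of an unset `IFS`.

```python
from typing import List


def expand_star(elems: List[str], ifs: str = " \t\n") -> List[str]:
    """Model of double-quoted "${A[*]}": one word, joined by IFS[0]."""
    # Assumption: an empty IFS means the elements are simply concatenated.
    sep = ifs[0] if ifs else ""
    return [sep.join(elems)]


def expand_at(elems: List[str], prefix: str = "", suffix: str = "") -> List[str]:
    """Model of double-quoted "prefix${A[@]}suffix": one word per element."""
    if not elems:
        # With no surrounding text, an empty array yields no words at all.
        joined = prefix + suffix
        return [joined] if joined else []
    words = list(elems)
    words[0] = prefix + words[0]     # the prefix attaches to the first element
    words[-1] = words[-1] + suffix   # the suffix attaches to the last element
    return words
```

For example, `expand_at(["1", "2", "3"], "-", "-")` mirrors the `"-${A[@]}-"` example, and `expand_at([])` mirrors `set -- "${empty[@]}"` leaving `$#` at 0.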

---

These rules for `[*]` and `[@]` also apply to:

- `$*` and `$@`
- `${!name*}` and `${!name@}`
- `${!name[*]}` and `${!name[@]}`, etc.

<!--
Note: OSH currently joins the values by `IFS` even for unquoted `$*` and
performs word splitting afterward. This is different from the POSIX standard.
-->

### op-indirect

The indirection operator `!` is a prefix operator, and it interprets the
@@ -213,6 +289,42 @@ This idiom is also useful:

    : ${LIB_OSH=stdlib/osh}

---

There are test operators with colons, and without:

    ${x-default}
    ${x:-default}

    ${x=default}
    ${x:=default}

    ${x+other}
    ${x:+other}

    ${x?error}
    ${x:?error}

**Without** the colon, the shell checks whether a value is **defined**. In the
case of a word list, e.g. generated by `$*` or `$@`, it tests whether there is
at least one element.

**With** the colon, the shell checks whether the value is **non-empty** (is not
the empty string). In the case of a word list, the test is performed after
joining the elements by a space.

Elements are joined by the first character of `IFS` only with double-quoted
`"${*:-}"`.

In contrast, `${*:-}`, `${@:-}`, and `"${@:-}"` are joined by a space. This is
because the joining of `"$*"` by `IFS` is performed earlier than the joining by
space for the test.
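
As a rough model of the rules above, the test on a word list can be sketched in Python. The function name `passes_test` and its parameters are hypothetical and chosen only for this illustration; `None` stands in for an unset variable, and `sep` is a space unless the expansion is the double-quoted `"$*"` form, where it would be the first character of `IFS`.

```python
from typing import List, Optional


def passes_test(words: Optional[List[str]], colon: bool, sep: str = " ") -> bool:
    """Return True if the value counts as set for a test operator.

    words: None models an unset variable; otherwise the word list.
    colon: True for ${x:-d}, ${x:=d}, etc.; False for ${x-d}, ${x=d}, etc.
    sep:   separator used for the join (assumed to be a space by default).
    """
    if words is None or len(words) == 0:
        return False  # undefined, or a word list with no elements
    if not colon:
        return True  # without the colon, having an element is enough
    return sep.join(words) != ""  # with the colon, test the joined string
```

Under this model a list holding one empty string passes the test without the colon but fails it with the colon, while two empty strings joined by a space give `" "`, which is non-empty.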

<!--
Note: OSH currently joins the values by `IFS` even for unquoted `$*`. This is
different from Bash.
-->

### op-strip

Remove prefixes or suffixes from strings:
@@ -253,10 +365,6 @@ The pattern can also be a glob:
     $ echo ${x//[a-z]+/o} # replace multiple chars
     o

-### op-index
-
-    echo ${a[i+1]}
-
 ### op-slice

     echo ${a[@]:1:2}
4 changes: 2 additions & 2 deletions doc/ref/toc-osh.md
@@ -142,11 +142,11 @@ X [Unsupported] enable
                   arith-sub       $((1 + 2))
                   tilde-sub       ~/src
                   proc-sub        diff <(sort L.txt) <(sort R.txt)
-  [Var Ops]       op-indirect     ${!x}
+  [Var Ops]       op-bracket      ${a[i+1]}, ${a[*]}
+                  op-indirect     ${!x}
                   op-test         ${x:-default}
                   op-strip        ${x%%suffix} etc.
                   op-patsub       ${x//y/z}
-                  op-index        ${a[i+1}
                   op-slice        ${a[@]:0:1}
                   op-format       ${x@P} ${x@Q} etc.
```
49 changes: 41 additions & 8 deletions osh/word_eval.py
@@ -640,6 +640,7 @@ def _ApplyTestOp(
             part_vals, # type: Optional[List[part_value_t]]
             vtest_place, # type: VTestPlace
             blame_token, # type: Token
+            vsub_state, # type: VarSubState
     ):
         # type: (...) -> bool
         """
@@ -684,14 +685,45 @@ def _ApplyTestOp(
                 else:
                     is_falsey = False

-            elif case(value_e.BashArray):
-                val = cast(value.BashArray, UP_val)
-                # TODO: allow undefined
-                is_falsey = len(val.strs) == 0
+            elif case(value_e.BashArray, value_e.BashAssoc):
+                if val.tag() == value_e.BashArray:
+                    val = cast(value.BashArray, UP_val)
+                    strs = bash_impl.BashArray_GetValues(val)
+                elif val.tag() == value_e.BashAssoc:
+                    val = cast(value.BashAssoc, UP_val)
+                    strs = bash_impl.BashAssoc_GetValues(val)
+                else:
+                    raise AssertionError()

-            elif case(value_e.BashAssoc):
-                val = cast(value.BashAssoc, UP_val)
-                is_falsey = len(val.d) == 0
+                if tok.id in (Id.VTest_ColonHyphen, Id.VTest_ColonEquals,
+                              Id.VTest_ColonQMark, Id.VTest_ColonPlus):
+                    # The first character of IFS is used as a separator only
+                    # for the double-quoted "$*", or otherwise, a space " " is
+                    # used (for $*, $@, and "$@").
+                    # TODO: We current do not check whether the current $* is
+                    # double-quoted or not. We should use IFS only when $* is
+                    # double-quoted.
+                    if vsub_state.join_array:
+                        sep_width = len(self.splitter.GetJoinChar())
+                    else:
+                        sep_width = 1 # we use ' ' for a[@]
+
+                    # We test whether the joined string will be empty. When
+                    # the separator is empty, all the elements need to be
+                    # empty. When the separator is non-empty, one element is
+                    # allowed at most and needs to be an empty string if any.
+                    if sep_width == 0:
+                        is_falsey = True
+                        for s in strs:
+                            if len(s) != 0:
+                                is_falsey = False
+                                break
+                    else:
+                        is_falsey = len(strs) == 0 or (len(strs) == 1 and
+                                                       len(strs[0]) == 0)
+                else:
+                    # TODO: allow undefined
+                    is_falsey = len(strs) == 0

             else:
                 # value.Eggex, etc. are all false
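
The emptiness check in the hunk above decides whether the joined string would be empty without actually building it. A standalone restatement, for illustration only: the names `joined_is_empty`, `joined_is_empty_naive`, and `check_equivalence` are hypothetical and not part of OSH, and the separator is modeled as `sep_width` spaces.

```python
from itertools import product
from typing import List


def joined_is_empty(strs: List[str], sep_width: int) -> bool:
    """Restates the commit's check: would the joined string be empty?"""
    if sep_width == 0:
        # Empty separator: the result is empty only if every element is.
        return all(len(s) == 0 for s in strs)
    # Non-empty separator: two or more elements always produce output.
    return len(strs) == 0 or (len(strs) == 1 and len(strs[0]) == 0)


def joined_is_empty_naive(strs: List[str], sep_width: int) -> bool:
    """Reference version that actually builds the joined string."""
    sep = " " * sep_width
    return sep.join(strs) == ""


def check_equivalence(max_len: int = 3) -> bool:
    """Compare both versions on all short lists over a tiny alphabet."""
    for n in range(max_len + 1):
        for strs in product(["", "a"], repeat=n):
            for sep_width in (0, 1):
                fast = joined_is_empty(list(strs), sep_width)
                slow = joined_is_empty_naive(list(strs), sep_width)
                if fast != slow:
                    return False
    return True
```

In this sketch, an array holding a single empty string, like `single=('')` in the spec test, counts as empty for the operators with a colon, because joining one empty element yields the empty string.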
@@ -1542,7 +1574,8 @@ def _EvalBracedVarSub(self, part, part_vals, quoted):
                 # '') is not applied to the VTest operators such as
                 # ${a:-def}, ${a+set}, etc.
                 if self._ApplyTestOp(val, op, quoted, part_vals,
-                                     vtest_place, part.name_tok):
+                                     vtest_place, part.name_tok,
+                                     vsub_state):
                     # e.g. to evaluate ${undef:-'default'}, we already appended
                     # what we need
                     return
3 changes: 2 additions & 1 deletion spec/array.test.sh
@@ -1,5 +1,5 @@
 ## compare_shells: bash mksh
-## oils_failures_allowed: 3
+## oils_failures_allowed: 2

#### nounset / set -u with empty array (bug in bash 4.3, fixed in 4.4)

@@ -385,6 +385,7 @@ ls foo=(1 2)
 # 2024-06 - bash 5.2 and mksh now match, bash 4.4 differed.
 # Could change OSH
 # zsh agrees with OSH, but it fails most test cases
+# 2025-01 We changed OSH.

 single=('')
 argv.py ${single[@]:-none} x "${single[@]:-none}"