diff --git a/spec.emu b/spec.emu
index 1ae004e..276d1d4 100644
--- a/spec.emu
+++ b/spec.emu
@@ -27,8 +27,6 @@ contributors: Jordan Harband
1. Let _str_ be ? ToString(_S_).
1. Let _cpList_ be StringToCodePoints(_str_).
- 1. Let _punctuators_ be the following String, which consists of every ASCII punctuator except U+005F (LOW LINE): *"(){}[]|,.?\*+-^$=<>\/#&!%:;@~'"`"*.
- 1. Let _toEscape_ be StringToCodePoints(_punctuators_).
1. Let _escapedList_ be a new empty List.
1. For each code point _c_ in _cpList_, do
1. If _escapedList_ is empty and _c_ is matched by |DecimalDigit|, then
@@ -36,19 +34,8 @@ contributors: Jordan Harband
1. Append code unit U+0078 (LATIN SMALL LETTER X) to _escapedList_.
1. Append code unit U+0033 (DIGIT THREE) to _escapedList_.
1. Append _c_ to _escapedList_.
- 1. Else if _toEscape_ contains _c_ or _c_ is matched by |WhiteSpace|, then
- 1. Let _hex_ be Number::toString(𝔽(_c_), 16).
- 1. If the length of _hex_ is 1 or 2, then
- 1. Set _hex_ to StringPad(_hex_, 2, *"0"*, ~start~).
- 1. Append code unit U+0078 (LATIN SMALL LETTER X) to _escapedList_.
- 1. Append the code units in _hex_ to _escapedList_.
- 1. Else,
- 1. Assert: The length of _hex_ is at most 4.
- 1. Set _hex_ to StringPad(_hex_, 4, *"0"*, ~start~).
- 1. Append code unit U+0075 (LATIN SMALL LETTER U) to _escapedList_.
- 1. Append the code units in _hex_ to _escapedList_.
1. Else,
- 1. Append _c_ to _escapedList_.
+ 1. Append the code units in EncodeForRegExpEscape(_c_) to _escapedList_.
1. Return CodePointsToString(_escapedList_).
@@ -56,6 +43,38 @@ contributors: Jordan Harband
`escape` takes a string and escapes it so it can be literally represented as a pattern. In contrast EscapeRegExpPattern (as the name implies) takes a pattern and escapes it so that it can be represented as a string. While the two are related, they do not share the same character escape set or perform similar actions.
+
+
+
+ EncodeForRegExpEscape (
+ _c_: a code point,
+ ): a List of code units
+
+
+
+
+ 1. Let _codeUnits_ be a new empty List.
+ 1. Let _punctuators_ be the following String, which consists of every ASCII punctuator except U+005F (LOW LINE): *"(){}[]|,.?\*+-^$=<>\/#&!%:;@~'"`"*.
+ 1. Let _toEscape_ be StringToCodePoints(_punctuators_).
+ 1. If _toEscape_ contains _c_ or _c_ is matched by |WhiteSpace|, then
+ 1. Append code unit U+005C (REVERSE SOLIDUS) to _codeUnits_.
+ 1. Let _hex_ be Number::toString(𝔽(_c_), 16).
+ 1. If the length of _hex_ is 1 or 2, then
+ 1. Set _hex_ to StringPad(_hex_, 2, *"0"*, ~start~).
+ 1. Append code unit U+0078 (LATIN SMALL LETTER X) to _codeUnits_.
+ 1. Else,
+ 1. Assert: The length of _hex_ is at most 4.
+ 1. Set _hex_ to StringPad(_hex_, 4, *"0"*, ~start~).
+ 1. Append code unit U+0075 (LATIN SMALL LETTER U) to _codeUnits_.
+ 1. Append the code units in _hex_ to _codeUnits_.
+ 1. Else,
+ 1. Append _c_ to _codeUnits_.
+ 1. Return _codeUnits_.
+
+
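
For reviewers, the new EncodeForRegExpEscape steps can be sketched in TypeScript roughly as follows. This is an illustrative reading of the algorithm text in this diff, not the proposal's reference implementation. The function name `encodeForRegExpEscape`, the constants `PUNCTUATORS` and `WHITE_SPACE`, and the choice to pass the code point as a one-character string are all assumptions made for the sketch. The punctuator set is built from the prose description ("every ASCII punctuator except U+005F") and is therefore assumed to include U+005C (REVERSE SOLIDUS). The white space class is a hand-written approximation of |WhiteSpace| (TAB, VT, FF, ZWNBSP, and the Unicode Space_Separator code points).

```typescript
// Hypothetical sketch of EncodeForRegExpEscape as written in this diff.

// Every ASCII punctuator except U+005F (LOW LINE); assumed to include "\".
const PUNCTUATORS = "(){}[]|,.?*+-^$=<>\\/#&!%:;@~'\"`";

// Approximation of the |WhiteSpace| production: TAB, VT, FF, ZWNBSP, plus
// the Space_Separator (Zs) code points. Line terminators are not included.
const WHITE_SPACE =
  /^[\t\v\f\uFEFF \u00A0\u1680\u2000-\u200A\u202F\u205F\u3000]$/;

// `ch` holds exactly one code point. Code points outside the BMP occupy two
// UTF-16 code units and are never escaped by these steps.
function encodeForRegExpEscape(ch: string): string {
  const needsEscape =
    ch.length === 1 &&
    (PUNCTUATORS.indexOf(ch) !== -1 || WHITE_SPACE.test(ch));
  if (!needsEscape) {
    return ch; // "Else, append _c_ to _codeUnits_."
  }
  const hex = ch.charCodeAt(0).toString(16); // lowercase, like Number::toString
  if (hex.length <= 2) {
    // "\x" followed by exactly two hex digits.
    return "\\x" + ("00" + hex).slice(-2);
  }
  // The spec text asserts at most four digits here: "\u" plus four hex digits.
  return "\\u" + ("0000" + hex).slice(-4);
}

console.log(encodeForRegExpEscape("."));      // \x2e
console.log(encodeForRegExpEscape("\u2003")); // \u2003 (six characters)
console.log(encodeForRegExpEscape("a"));      // a
```

Under this reading, U+000A (LINE FEED) is returned unchanged, because |WhiteSpace| does not cover line terminators.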