Commit

parser: ensure let stmt compound assignment removal suggestion respects codepoint boundaries

Previously, for `let x <op>= 1` (a compound assignment within a `let`
binding), we would issue a suggestion to remove the `<op>`. The
suggestion code incorrectly assumed that `<op>` is always a single-byte
ASCII character. That assumption does not hold, because we also recover
Unicode confusables such as `➖=` as `-=`. The suggestion code used
`+ BytePos(1)` to compute the span of the `<op>` codepoint, so for a
multi-byte look-alike of `-` the suggested removal span ended inside a
multi-byte codepoint, triggering a codepoint boundary assertion.

Issue: <rust-lang#128845>
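
The failure mode described above can be reproduced outside the compiler with plain `str` APIs. The sketch below is only an illustration: `one_byte_prefix_is_valid` is a hypothetical helper, not a rustc function, and the byte offset `6` is simply where the operator starts in these sample strings.

```rust
/// Hypothetical helper (not part of rustc): reports whether the byte offset
/// `start + 1` lies on a UTF-8 codepoint boundary of `src`. This mirrors the
/// old assumption that the operator character is exactly one byte wide.
fn one_byte_prefix_is_valid(src: &str, start: usize) -> bool {
    src.is_char_boundary(start + 1)
}

/// Width in bytes of the codepoint starting at `start`, if `start` is itself
/// a valid boundary inside `src`.
fn width_at(src: &str, start: usize) -> Option<usize> {
    src.get(start..)?.chars().next().map(char::len_utf8)
}
```

For `"let x -= 1"` the one-byte assumption holds at offset 6, whereas for `"let x ➖= 1"` offset 7 falls inside the encoding of U+2796, which is the kind of position a codepoint boundary assertion rejects.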
jieyouxu committed Aug 9, 2024
1 parent 92520a9 commit f7b2ce5
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions compiler/rustc_parse/src/parser/stmt.rs
@@ -408,10 +408,14 @@ impl<'a> Parser<'a> {
     fn parse_initializer(&mut self, eq_optional: bool) -> PResult<'a, Option<P<Expr>>> {
         let eq_consumed = match self.token.kind {
             token::BinOpEq(..) => {
-                // Recover `let x <op>= 1` as `let x = 1`
+                // Recover `let x <op>= 1` as `let x = 1`. We must not use `+ BytePos(1)` here
+                // because `<op>` can be a multi-byte lookalike that was recovered, e.g. `➖=` (the
+                // `➖` is a U+2796 Heavy Minus Sign Unicode Character) that was recovered as a
+                // `-=`.
+                let extra_op_span = self.psess.source_map().start_point(self.token.span);
                 self.dcx().emit_err(errors::CompoundAssignmentExpressionInLet {
                     span: self.token.span,
-                    suggestion: self.token.span.with_hi(self.token.span.lo() + BytePos(1)),
+                    suggestion: extra_op_span,
                 });
                 self.bump();
                 true
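
As a rough standalone analogue of the fix, the sketch below derives the removal range from the width of the first codepoint instead of a fixed one-byte offset. It is a minimal illustration using only `std`: `first_char_range` and `remove_first_char` are hypothetical names, and this is not how `SourceMap::start_point` is implemented inside rustc.

```rust
use std::ops::Range;

/// Hypothetical helper: byte range covering exactly the first codepoint that
/// starts at `start`. Returns `None` if `start` is out of bounds, not on a
/// codepoint boundary, or at the end of the string.
fn first_char_range(src: &str, start: usize) -> Option<Range<usize>> {
    let ch = src.get(start..)?.chars().next()?;
    Some(start..start + ch.len_utf8())
}

/// Hypothetical helper: applies a "remove the operator" suggestion by deleting
/// the first codepoint at `start`, whatever its encoded width is.
fn remove_first_char(src: &str, start: usize) -> Option<String> {
    let range = first_char_range(src, start)?;
    let mut out = src.to_string();
    // `range` starts and ends on codepoint boundaries, so this cannot panic.
    out.replace_range(range, "");
    Some(out)
}
```

With these helpers, both `"let x -= 1"` and `"let x ➖= 1"` turn into `"let x = 1"` when the codepoint at offset 6 is removed, because the range is 1 byte wide in the first case and 3 bytes wide in the second.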
