utf8_hop_forward: Don't go over edge of buffer
even in the presence of malformed UTF-8.

This preserves the previous behavior: if you start one byte past the
edge of the buffer, the function returns that position.
khwilliamson committed Sep 30, 2024
1 parent cfe4427 commit 5613dc6
Showing 1 changed file with 13 additions and 4 deletions.
17 changes: 13 additions & 4 deletions inline.h
@@ -2659,7 +2659,7 @@ start of the next character.
 C<off> must be non-negative.
-C<s> must be before or equal to C<end>.
+C<s> must be before or equal to C<end>. If after, the function panics.
 When moving forward it will not move beyond C<end>.
@@ -2677,19 +2677,28 @@ Perl_utf8_hop_forward(const U8 *s, SSize_t off, const U8 *end)
      * the bitops (especially ~) can create illegal UTF-8.
      * In other words: in Perl UTF-8 is not just for Unicode. */

-    assert(s <= end);
     assert(off >= 0);
+
+    if (UNLIKELY(s >= end)) {
+        if (s == end) {
+            return (U8 *) end;
+        }
+
+        Perl_croak_nocontext("panic: Start of forward hop (0x%p) is %zd bytes"
+                             " beyond legal end position (0x%p)",
+                             s, 1 + s - end, end);
+    }

     if (off && UNLIKELY(UTF8_IS_CONTINUATION(*s))) {
         /* Get to next non-continuation byte */
         do {
             s++;
         }
-        while (UTF8_IS_CONTINUATION(*s));
+        while (s < end && UTF8_IS_CONTINUATION(*s));
         off--;
     }

-    while (off--) {
+    while (off-- && s < end) {
         STRLEN skip = UTF8SKIP(s);
         if ((STRLEN)(end - s) <= skip) {
             GCC_DIAG_IGNORE(-Wcast-qual)
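
To experiment with the bounded hop outside the Perl source tree, here is a minimal standalone sketch. It is not the code in inline.h: `is_continuation`, `utf8_skip`, and `hop_forward_sketch` are hypothetical stand-ins for `UTF8_IS_CONTINUATION`, `UTF8SKIP`, and `Perl_utf8_hop_forward`, the skip table covers only 1- to 4-byte sequences, the panic is modeled with `abort()`, and the loop tail and final return (which the diff above cuts off) are filled in as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef unsigned char U8;

/* Hypothetical stand-in for UTF8_IS_CONTINUATION: matches 10xxxxxx bytes. */
static int is_continuation(U8 b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical, simplified stand-in for UTF8SKIP: the sequence length
 * implied by the start byte. Only 1- to 4-byte forms are modeled here. */
static size_t utf8_skip(U8 b)
{
    if (b < 0x80)           return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;   /* stray continuation or invalid start byte: advance one */
}

/* Move forward by up to `off` characters without ever passing `end`,
 * even when the bytes are not well-formed UTF-8. */
static const U8 *hop_forward_sketch(const U8 *s, ptrdiff_t off, const U8 *end)
{
    assert(off >= 0);

    if (s >= end) {
        if (s == end) {
            return end;     /* starting exactly at the end is allowed */
        }
        /* Assumption: abort() models the panic raised by the real code. */
        fprintf(stderr, "panic: start of forward hop is beyond end\n");
        abort();
    }

    if (off && is_continuation(*s)) {
        /* Get to the next non-continuation byte, but stop at the end. */
        do {
            s++;
        } while (s < end && is_continuation(*s));
        off--;
    }

    while (off-- && s < end) {
        size_t skip = utf8_skip(*s);
        if ((size_t)(end - s) <= skip) {
            return end;     /* the character reaches or overruns the end */
        }
        s += skip;
    }

    return s;               /* assumed tail; not visible in the diff */
}
```

Calling `hop_forward_sketch(buf, 1, buf + 3)` on a three-byte buffer that holds only continuation bytes (`0x80 0x80 0x80`) returns `buf + 3`: the `s < end` test stops the scan at the buffer edge, where an unbounded loop would keep reading past it.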