-
Notifications
You must be signed in to change notification settings - Fork 26
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed few things, also code style for new code is different, it's written in .clang-tidy file
} | ||
|
||
auto find_next_delim() { | ||
return std::search(this->data_.begin(), this->data_.end(), bytes_.begin(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to use string_view::find for short single delimiter
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/string_view.tcc#L48
because std::search
is naive and doesn't use memchr/memcmp
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L2150
And https://en.cppreference.com/w/cpp/utility/functional/boyer_moore_horspool_searcher for long single delimiter
return std::find_if(data_.begin(), data_.end(), | ||
[&](auto c) { return bytes_[c]; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
input string can contains bytes which isn't ascii :(
I think better to check it via condition (instead of make bigger bytes_
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CHAR_MAX
commonly 127, so bytes_
handle only positive (ascii) chars in such case.
It's ok for delimiter (because valid utf-8 with single char contains only ascii),
but it's invalid for any byte in utf-8 input, because any multiple byte character have bytes like 0b1*******
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah I see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix codestyle
Add support for multi delimiter analyzer.
ToDo: