Normalize char before pattern lookup in FuzzyMatchV1 #4252
There is an edge case in FuzzyMatchV1 during the backward scan, related to normalization: if the string is initially denormalized (e.g. it contains a Unicode symbol), the backward scan does not match that char and proceeds further to the next char; however, when the score is computed, the string is normalized first and then scanned against the pattern. As a result, the pattern index is incremented past the end of the pattern, which leads to an out-of-bounds index access and a panic.

To illustrate the process, here's the sequence of operations when a search is performed:

1. during backward scan by "minim" pattern

```
xxxxx Minímal example
      ^^^^^^^^^^^^
      ||||||||||||
      miniiiiiiiim <- compute score for this substring
```

2. during compute score by "minim" pattern

```
Minímal exam
minimal exam <- normalize chars before computing the score
^^^^^^
||||||
minim  <- at this point the pattern is already fully scanned and the index is out of bounds
```

In this commit the char is normalized during the backward scan, so that the boundaries for the pattern are detected properly.
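The mismatch described above can be reproduced with a simplified model. The sketch below is not fzf's code: `normalizeRune` here is a one-entry stand-in for the real normalization table, `prep`, `tighten` and `overruns` are hypothetical helpers, and the scan direction is simplified to left-to-right. It only shows that a window chosen without normalization is too wide for a scoring pass that does normalize.

```go
package main

import (
	"fmt"
	"unicode"
)

// normalizeRune is a hypothetical stand-in for a normalization table;
// it only knows the single mapping this example needs.
func normalizeRune(r rune) rune {
	if r == 'í' {
		return 'i'
	}
	return r
}

// prep lowercases a rune and optionally normalizes it.
func prep(r rune, normalize bool) rune {
	r = unicode.ToLower(r)
	if normalize {
		r = normalizeRune(r)
	}
	return r
}

// tighten walks the candidate window and returns the index just past the
// rune that completes the pattern, or len(window) if it never completes.
func tighten(window, pattern []rune, normalize bool) int {
	pidx := 0
	for i, r := range window {
		if prep(r, normalize) == pattern[pidx] {
			pidx++
			if pidx == len(pattern) {
				return i + 1
			}
		}
	}
	return len(window)
}

// overruns reports whether a scoring pass that always normalizes would
// have to read pattern[len(pattern)] somewhere inside the window.
func overruns(window, pattern []rune) bool {
	pidx := 0
	for _, r := range window {
		if pidx == len(pattern) {
			return true // an unguarded pattern[pidx] lookup would panic here
		}
		if prep(r, true) == pattern[pidx] {
			pidx++
		}
	}
	return false
}

func main() {
	window := []rune("Minímal exam")
	pattern := []rune("minim")
	for _, normalize := range []bool{false, true} {
		end := tighten(window, pattern, normalize)
		fmt.Printf("normalize=%v window=%q overruns=%v\n",
			normalize, string(window[:end]), overruns(window[:end], pattern))
	}
}
```

Without normalization the boundary scan never matches "í" against "i", so the window stays at the full "Minímal exam" and the scoring pass runs out of pattern at the "a". With normalization the window shrinks to "Miním" and the pattern is consumed exactly once.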
Thanks, I can reproduce the problem.
Just out of curiosity, how did you find this problem?
Merged, thanks!
@junegunn It was quite a journey. 😄 After an upgrade of fzf (I guess it was an update from 0.49 to 0.59), the fuzzy search using Nvim, fzf and fzf.vim started to throw errors. Namely, if search was performed by term
Wow, great job! Thanks for the help!