Certain globs ending with "non-word" characters fail to match #18
Comments
Thank you for looking into this and figuring out the issue! I've been working on getting these matching libs updated for the past few days, and I'll get this fixed. Thanks! |
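Are you thinking about adding a "Unicode" option to `makeRe`, or will all of the generated RegExps have the Unicode `u` flag by default? Or are you doing something entirely different? I wanted to get a sense of your thoughts because, depending on the solution, all the packages in the dependency chain from Gulp down to nanomatch may need to be updated. |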
Honestly I’m not sure yet. I’m open to suggestions
On Nov 19, 2018, at 2:29 PM, Peter Safranek ***@***.***> wrote:
Are you thinking about adding a "Unicode" option to makeRe, or will all of the generated RegExps have the Unicode u flag by default? Or are you doing something entirely different?
|
Could you give me high-level explanation about why there are so many globbing packages: anymatch, micromatch, nanomatch, picomatch, etc? What's different about them? |
No, I don't have time to do that. But the projects each have descriptions that answer your question, and there are readme documents that took a long time to write and were created for that purpose. |
Fair enough. |
Using a Unicode-aware regex polyfill, modified to include the exceptions noted, seems like a feasible solution for this. Given that the fix will greatly expand the allowed characters, locking this behind a configuration switch or a semver major release would be a good idea (particularly given the widespread usage of nanomatch). A semver major release would prevent accumulation of technical debt; however, dependent libraries might be better off with the switch option (lots of semver major version bumps otherwise). |
Please describe the minimum necessary steps to reproduce this issue:
Run this Node.js script:
What is happening (but shouldn't):
Output is `false` because the `RegExp` test fails.
What should be happening instead?
Output is `true` because the `RegExp` test succeeds.
What's happening
Here is the `RegExp` produced by `nanomatch`:
The word boundary matcher (starred) is the culprit. This matcher requires that the end of the first part of the glob `é` is a word boundary. There are two problems with the matcher:
1. `é` gets rejected as a word boundary. One solution is to add the Unicode flag `u` to the end of the `RegExp`. This is only a partial solution because...
2. `#` for example. If you replace the `é` in this example with `#`, the test fails even with the Unicode flag.
Another odd behavior with this `RegExp` is that the first test here fails but the second test passes:
The Unicode flag would be a good addition to un-break certain consumers of this library (see gulpjs/gulp#2153), but given the above odd behavior and above problem (2), it seems there might be some other consideration necessary.
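For anyone who wants to see the word boundary behavior in isolation, here is a minimal sketch. It does not use the `RegExp` that `nanomatch` actually generates. The helpers `escapeLiteral`, `matchesWithWordBoundary`, and `matchesWithLookahead` are hypothetical names made up for this illustration, and the lookahead variant is only one possible direction, not the fix chosen by the maintainers. The first helper appends a plain `\b` (no flags) to a literal. The second replaces `\b` with a negative lookahead built from Unicode property escapes, which assumes a runtime that supports `\p{...}` with the `u` flag.

```javascript
// Hypothetical helper: escape regex syntax characters so the literal
// is matched verbatim.
function escapeLiteral(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Literal followed by a plain \b. Without flags, \b only sees
// [A-Za-z0-9_] as word characters, so a literal that ends in any
// other character has no word boundary after it at end of input.
function matchesWithWordBoundary(literal) {
  return new RegExp('^' + escapeLiteral(literal) + '\\b$').test(literal);
}

// Assumed alternative: require only that the next character is not a
// Unicode letter, a Unicode number, or an underscore.
function matchesWithLookahead(literal) {
  const source = '^' + escapeLiteral(literal) + '(?![\\p{L}\\p{N}_])$';
  return new RegExp(source, 'u').test(literal);
}

const results = {};
for (const literal of ['a', 'é', '#']) {
  results[literal] = {
    wordBoundary: matchesWithWordBoundary(literal),
    lookahead: matchesWithLookahead(literal),
  };
}
console.log(results);
```

With plain `\b`, only `a` matches, while `é` and `#` both fail. The lookahead variant accepts all three, but it is a looser check than `\b` because it no longer looks at the character before the boundary, so it would change matching behavior for existing consumers.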