-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perf: Optimize ignoreModule calls in no-cycle #2612
base: main
Are you sure you want to change the base?
Perf: Optimize ignoreModule calls in no-cycle #2612
Conversation
/** | ||
* Cache of resolved paths for faster subsequent lookups | ||
* @type {{ [importPath: string]: string }} | ||
*/ | ||
const resolvedPaths = {}; | ||
/** | ||
* Cache of external module evaluations for faster subsequent lookups | ||
* @type {{ [resolvedPath: string]: boolean }} | ||
*/ | ||
const externalModules = {}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these caches need to be cleared in the same place traversed
is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Program:exit
hook is called at the exit of each file, so clearing the caches there cut the number of cache hits in our repo by about 50% (although it only added ~45s to our repo test case)
The high level attempt at this update was to quick filter down to "maybe external" modules like 'react'
, '@reduxjs/toolkit'
or '@/some/alias/path'
- where those resolved paths can be safely cached across files 🤔
Is that possible to accomplish with any of the existing data structures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, hmm - you're right, it's less about clearing it after the file and more about clearing it after the "run" - since in editors, eslint can be long-running.
I think that in every file, there could be a different resolver, so the cache would have to be on the resolver and the resolver config and the fully normalized file path, i think?
// import paths are set in resolvedPaths, eg "./index.js", or "../file.js" | ||
// should not be stored because there could be duplicates that resolve to | ||
// different paths | ||
if (!options.ignoreExternal || name[0] === '.') return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couldn't a resolver choose to configure a relative path as external?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I suppose so, there's probably some monorepo setup out there like that...
I think removing this assumption means that we can't confidently cache the resolved module path we looked up because then there could be duplicate import paths that resolve to different files (eg import paths of './index.js'
).
Trying to approach this from a different angle: The current time to determine if a module is "external" is the performance bottleneck and this PR is attempting to short circuit that whenever possible.
Do you have suggestions on an approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the no-cycle rule option ignoreExternal
could be a boolean or a regex, and if it's a regex it's used for this short circuit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the best configuration fields are, but doing a quick test with a regex short circuit has about the same performance improvement as the cache implementation.
Alternative approach tested:
const ignoreModule = (name) => {
if (!options.ignoreExternal) return false;
if (options.ignoreExternalRegex) {
return options.ignoreExternalRegex.test(name);
}
const path = resolve(name, context);
let isExternal = externalModules[path];
if (isExternal === undefined) {
isExternal = isExternalModule(name, path, context);
externalModules[path] = isExternal;
}
return isExternal;
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the better question is, why is isExternalModule
slow?
Perhaps the node and webpack resolvers can add a cache there, rather than inside the main plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we doing a bunch of regex matching in isExternalModule
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally that's fast, but if it can be sped up, let's!
Codecov ReportBase: 95.29% // Head: 95.31% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #2612 +/- ##
==========================================
+ Coverage 95.29% 95.31% +0.01%
==========================================
Files 68 68
Lines 2955 2967 +12
Branches 1002 1004 +2
==========================================
+ Hits 2816 2828 +12
Misses 139 139
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Co-authored-by: Jordan Harband <[email protected]>
Hello 👋 - Thanks for taking the time to review this PR
Summary
This PR adds a few performance optimizations to the
no-cycle
rule to make calls toignoreModule
faster.Details
Lots of details because digging into this was really interesting
I was recently profiling our ESLint suite and the
no-cycle
rule was responsible for 79% of the total lint time. I found #2348 and attempted to speed up the rule by opting IN to theignoreExternal
option for the rule, but surprisingly that increased our total lint time from ~5min to ~9min which seemed odd 🤔This comment had a very helpful hint that lead to the issue: #2076 (comment)
When the
ignoreExternal
option is enabled theignoreModule
will callresolve
for each module to determine if it's an external module. I added some counters and found that our repo, which has ~2,000 modules, ended up calling this check ~215,000 times (that number wasn't surprising since each file does a breadth first search of the file's entire dependency tree) so attempting to ignore external modules can actually add a lot of work callingresolve
Optimization
This update optimizes this codepath by adding:
Results
For our medium sized codebase this update:
ignoreExternal
Note for these comparisons I disabled all the other rules that use
ExportMap
to try and ensure rule order and building the export map wasn't variable.