-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decrease input-lag for large queries #158
Conversation
packages/language-support/src/highlighting/syntaxValidation/semanticAnalysisWrapper.ts
Show resolved
Hide resolved
packages/language-support/src/highlighting/syntaxValidation/syntaxValidation.ts
Show resolved
Hide resolved
} | ||
|
||
const proxyWorker = (await pool.proxy()) as unknown as LintWorker; | ||
lastSemanticJob = proxyWorker.validateSemantics(query); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could pass positions here to avoid doing the the findposition in the main thread
"sideEffects": false, | ||
"scripts": { | ||
"build": "concurrently 'npm:build-types' 'npm:build-esm' 'npm:build-commonjs'", | ||
"build": "echo done", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, this I need to correct. I only got the workerpool working in codemirror by not bundling. But we still need to build for non-typescript consumers
page.on('worker', (worker) => { | ||
console.log('Worker created: ' + worker.url()); | ||
worker.on('close', (worker) => | ||
console.log('Worker destroyed: ' + worker.url()), | ||
); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't, I just used it to debug why the e2e tests were failing
if (document.length === 0) { | ||
this.config.setUseLightVersion?.(false); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are assuming someone will wipeout the whole editor always to revert back to the antlr parsing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I was originally considering a parse time threshold (if prism parses under Xms it's probably fine with antlr), but that risks some really odd behaviour for outliers. Some queries might be fast with antlr and slow with prism for unexpected reasons and we risk rapid flipping between parsers if the thresholds are poorly set.
Flipping back when the document is simple to understand, and in the worst case scenario (paste large -> empty all -> paste large again), we're still avoiding the main issue (delay between inputs) as we switch back to prism after the first parse
import { LinterTask, LintWorker } from './lint-worker'; | ||
|
||
const pool = workerpool.pool(join(__dirname, 'lint-worker.js'), { | ||
minWorkers: 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need 2 workers? I think you explained me this already.
I tried including a console.log
of the pool.stats()
at the beginning of the linting here and at the beginning of the semantic linting and I can see 1 occupied worker, 1 idle one at most.
{totalWorkers: 2, busyWorkers: 1, idleWorkers: 1, pendingTasks: 0, activeTasks: 1}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention is to always have an idle worker available, so that we can avoid the startup times of spawning a new thread. My reasoning is that when one thread is busy, we can use the idle one without startup cost and then the old busy thread will be terminated and respawned so it's ready for the next call
I've not measured the start up time/performance gains though
1f197f8
to
4062fce
Compare
@@ -47,6 +49,7 @@ | |||
"watch": "tsc -b -w" | |||
}, | |||
"devDependencies": { | |||
"@types/lodash.debounce": "^4.0.9", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the types needed here if we have the whole package in the dependencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lodash project doesn't have any types, they are provided by the "@types" package from microsoft who add types to untyped libraries, that's why it's imported separately. The reason it's in devDependencies is because the types are only needed at the build step
@@ -29,7 +31,27 @@ export function doSemanticAnalysis(query: string): SemanticAnalysisResult { | |||
const errors: SemanticAnalysisElement[] = semanticErrorsResult.$errors.data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the things coming back from the transpiled semantic analysis include the severity? So they should not be a SemanticAnalysisElement[]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, thanks! updated now 👍
}, | ||
"devDependencies": { | ||
"@neo4j-ndl/base": "^1.10.1", | ||
"@playwright/experimental-ct-react": "^1.39.0", | ||
"@playwright/test": "^1.36.2", | ||
"@types/lodash.debounce": "^4.0.9", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment, do we need the types package here if it's in the compile dependencies?
private debouncedOnChange = this.props.onChange | ||
? debounce(this.props.onChange, 200) | ||
: undefined; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the description of the pr is outdated judging by this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes - my bad
* proof of concept * merge main * works again * fix overuse of main channel * mellan * tests work * fix worker thread for node as well * self review * self review * self review * fix build * tests * fix e2e test * merge semantic analsysis and syntax errors again * review comments * self review * fix tests * worker pool-ish for vscode * smarter 'pool' mgmt * kindaworks * works * cleanup new parser adapter * restore pkg json * workerpool * worker pool for client as well * cleanup workers * fix errors * fix tests * fix todos * fix build * test-are-green * works for vite * fixlint * self review * re-add proper build * fix input lag even when parse is fast * add changeset * fix e2e test * pr comments
Overview
This ended up being quite a bit larger than I had anticipated/hoped for, but the main changes are:
workerpool
pkg to create a pool of semantic analysis workers in both codemirror & the server itself. It will cancel outdated and unfinished semanatic analsysis runs when new ones get triggered. I also debounced it for the language server so it fires when the user has stopped typing (chose 600ms somewhat arbitrarily, codemirror has a built in debounce of 750ms for comparison)tokenize
method. In my testing it is about ~20x our parser.A few future tasks:
[ ] Figure out a smarter way to bundle (see note on it below)
[ ] Reduce duplication of the worker code
[x] Improve react perf by debouncing the on change there as well
Outdated description below
Since the semantic analysis takes so much longer than the normal errors, I split it to it's own call.
I started by trying to get the same
runSemanticAnalysis
webworker implementation working everywhere with theweb-worker
node package, but it turned out to be really tricky to get it to work in all combinations of environment (jest/web/plain node) and targets (esm/cjs). I ended up leaving the function more or less as it was and then I do an esm only webworker in the react-codemirror package and then I use nodes built inworker_threads
to get it to work in the server!Future improvement - smarter debouncing
When the document is very big, it's easy to queue up a few semantic analysis calls in the webworker (even when debounced) since they can take so long. I think it'd make sense to make the calls into a queue and when one call is finished, all but the latest call are discarded. I carded this here: https://trello.com/c/EvujJSED/219-dont-start-new-semantic-analysis-while-the-old-one-is-still-running-debounce-while-running
Bundling concerns
We should consider how we can split the bundle like this:
Right now only the worker loads the semantic analysis but it's still in the main bundle. My attempt at splitting failed because it was tricky to configure the worker to do the dynamic import. The other option is if we don't bundle, the client application will use their dead code elimination to make sure it doesn't get in the final bundle - i think this would be easier. I've added a card: worker-import-meta-url