Moves syntax errors to the semantic analysis worker #317

ncordon · 2024-12-23T10:28:44Z

What

Until now we computed all of the syntax errors in a listener during parsing, but we want to move to a pattern where both syntax and semantic errors are computed in a separate thread (web worker).

Why

Because we want to be able to support both Cypher 5 and Cypher 25, plus minor versions of Cypher 25. This gives us the flexibility to just reload the web worker for the appropriate version, while auto-completing with the latest Cypher 25 grammar.

The semantic analysis packages both Cypher 5 and Cypher 25 knowledge. This means some auto-completions could be too modern for the selected version, but at least we would mark the errors appropriately for the user (right now for the latest Cypher 5 or LTS and the latest Cypher 25; in the future even for minor Cypher 25 versions if we need to).

changeset-bot · 2024-12-23T10:28:49Z

🦋 Changeset detected

Latest commit: 94d289a

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages

Name	Type
@neo4j-cypher/language-support	Patch
@neo4j-cypher/language-server	Patch
@neo4j-cypher/react-codemirror-playground	Patch
@neo4j-cypher/react-codemirror	Patch
@neo4j-cypher/schema-poller	Patch

ncordon · 2025-01-06T14:46:58Z

packages/language-support/src/parserWrapper.ts

+        collectedCommand.type !== 'cypher' && !isEmptyStatement
+          ? errorListener.errors
+          : [];


Only surface the syntax errors coming from console commands; let the web worker compute the errors for Cypher statements.
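
A minimal sketch of that rule, assuming a simplified command shape. `ParsedCommand`, `SyntaxIssue` and `locallySurfacedErrors` are hypothetical names used for illustration only, not the actual parserWrapper.ts API:

```typescript
// Hypothetical, simplified types for illustration only.
interface SyntaxIssue {
  message: string;
}

interface ParsedCommand {
  // 'cypher' for Cypher statements; any other value stands for a console command.
  type: string;
}

// Keep the listener errors only for non-empty console commands.
// Cypher statements get their errors from the web worker instead.
function locallySurfacedErrors(
  command: ParsedCommand,
  isEmptyStatement: boolean,
  listenerErrors: SyntaxIssue[],
): SyntaxIssue[] {
  return command.type !== 'cypher' && !isEmptyStatement ? listenerErrors : [];
}
```

With this shape, a Cypher statement always yields an empty list on the main thread, so its errors are never reported twice.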

ncordon · 2025-01-06T14:48:19Z

packages/language-support/src/syntaxValidation/completionCoreErrors.ts

-    // options length is 0 should only happen when RULE_consoleCommand is hit and there are no other options
-    if (
-      ruleCandidates.find(
-        (ruleNumber) => ruleNumber === CypherParser.RULE_consoleCommand,
-      )
-    ) {
-      return 'Console commands are unsupported in this environment.';
-    }


We don't need this anymore

ncordon · 2025-01-06T15:16:24Z

packages/react-codemirror/src/lang-cypher/lintWorker.ts

+  if (featureFlags.consoleCommands !== undefined) {
+    _internalFeatureFlags.consoleCommands = featureFlags.consoleCommands;
+  }


Before, the internal feature flags (which are a global variable) only affected things that ran in the same thread as the editor (namely the syntax errors). Now we are moving the syntax errors down to the semantic analysis worker, so this change requires us to set that global variable there as well.

There have been talks about reshaping the way we package the language support into a class, but until then, this is the way forward.

ncordon · 2025-01-06T15:17:14Z

packages/react-codemirror/src/lang-cypher/langCypher.ts

@@ -46,7 +37,6 @@ export function cypher(config: CypherConfig) {
      optionClass: completionStyles,
    }),
    cypherLinter(config),
-    semanticAnalysisLinter(config),


We only need a single linter because all of the errors live in the semantic analysis worker now.

ncordon · 2025-01-06T15:25:03Z

packages/language-support/src/tests/syntaxValidation/syntacticValidation.test.ts

+      const query = `RETURN {\`something
+    foo
+
+    bar: "hello"}`;


The error for multiline unfinished backticked constructs has worsened, and fixing it would require some work in the mono-repo.

ncordon · 2025-01-06T15:26:28Z

packages/language-support/src/tests/syntaxValidation/syntacticValidation.test.ts

+  // TODO FIX ME
+  // Problem here is we were getting a better error before
+  test.fails('Syntax validation errors on an expected procedure name', () => {


The error for this case has also worsened, so this would have to be addressed from the mono repo.

ncordon · 2025-01-06T15:29:21Z

packages/language-support/src/tests/syntaxValidation/syntacticValidation.test.ts

+        message: `Invalid input '"foo"': expected a graph pattern, a parameter, ')', ':', 'IS', 'WHERE' or '{'`,
        offsets: {
-          end: 14,
+          end: 15,
          start: 9,
        },
        range: {
          end: {
-            character: 14,
+            character: 15,


Because of the way we compute where positions finish in the mono repo, some errors now span extra characters.

That is, when we receive a syntax error we only know where the token starts, not where it finishes. The only approach we have in that case is to consider that the token finishes at the next non-space position.
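
One reading of that heuristic, as a minimal sketch: scan forward from the known start offset until the first whitespace character (or the end of the input) and treat that as the end. `guessEndOffset` is a hypothetical helper, not the code that lives in the mono repo, and the sample query in the note below is only an illustration:

```typescript
// Hypothetical helper: guesses the end offset of an error span when only
// the start offset is known, by extending it to the next whitespace.
function guessEndOffset(query: string, startOffset: number): number {
  let end = startOffset;
  // Advance while we are inside the input and on a non-whitespace character.
  while (end < query.length && !/\s/.test(query.charAt(end))) {
    end++;
  }
  return end;
}
```

For an input such as `MATCH (n "foo")` with the error starting at offset 9, this sketch extends the span over the closing parenthesis as well, which would explain an end offset moving from 14 to 15.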

ncordon · 2025-01-06T15:29:40Z

packages/react-codemirror/src/e2e_tests/e2eUtils.ts

@@ -60,7 +60,7 @@ export class CypherEditorPage {
  }

  async checkNoNotificationMessage(type: 'error' | 'warning') {
-    await this.page.waitForTimeout(1000);
+    await this.page.waitForTimeout(3000);


This was causing flakiness.

ncordon · 2025-01-06T15:30:19Z

packages/react-codemirror/src/e2e_tests/performanceTest.spec.tsx

@@ -100,7 +100,7 @@ test('benchmarking & performance test session', async ({

  await editorPage.checkErrorMessage(
    'RETRN',
-    'Unexpected token. Did you mean RETURN?',
+    `Invalid input 'RETRN': expected a graph pattern, 'FOREACH', ',', 'ORDER BY', 'CALL', 'CREATE', 'LOAD CSV', 'DELETE', 'DETACH', 'FINISH', 'INSERT', 'LIMIT', 'MATCH', 'MERGE', 'NODETACH', 'OFFSET', 'OPTIONAL', 'REMOVE', 'RETURN', 'SET', 'SKIP', 'UNION', 'UNWIND', 'USE', 'USING', 'WHERE', 'WITH' or <EOF>`,


Some errors like this one have significantly worsened. I kind of liked the old one better, but it's not easy to fix this without resorting to parsing the error, getting the tokens, etc., which doesn't seem very robust.

ncordon · 2025-01-06T15:32:16Z

packages/react-codemirror/src/e2e_tests/syntaxValidation.spec.tsx

+  await expect(
+    editorPage.page.locator('.cm-deprecated-element').last(),
+  ).toBeVisible({ timeout: 3000 });
+  await editorPage.checkWarningMessage('id', 'Function id is deprecated.');


Formatting changes 🤷

ncordon · 2025-01-06T15:32:38Z

packages/react-codemirror/src/e2e_tests/syntaxValidation.spec.tsx

+  await expect(
+    editorPage.page.locator('.cm-deprecated-element').last(),
+  ).toBeVisible({ timeout: 3000 });

  await editorPage.checkWarningMessage(
    'apoc.create.uuids',
-    "Procedure apoc.create.uuids is deprecated.",
+    'Procedure apoc.create.uuids is deprecated.',
  );


Formatting changes 🤷. Probably because this file was modified and got reformatted.

anderson4j

Had some spots where I got confused but didn't find any errors. Here might be a good place to confirm I understood things correctly. We now get syntax errors from the transpiled semanticAnalysis.js, right? (So the change is: remove our own syntax error creation and hook in the syntax errors from semanticAnalysis.)

anderson4j · 2025-01-08T08:47:17Z

packages/language-support/src/syntaxValidation/syntaxValidation.ts

-    let token: Token | undefined = undefined;
-
-    const start = Position.create(
-      e.position.line - 1 + cmd.start.line - 1,


The new method looks at e.range.start instead of e.position, but otherwise looks similar. The adjustment here is (cmd.start.line - 2) though, while there it is only (cmd.start.line - 1). Is this an intended change from switching to semanticDiagnostic over semanticElement?

Tried switching to ...line - 2 in the new code too, and got line = -1 in some tests, so I guess it is intended. Still a bit confused about why we needed to subtract 1 twice before and only once now.

Because lines and columns start at 1 for antlr (and I think for the semantic errors in the database), so we were returning those. I adjusted the ones in the transpiled semantic analysis so we don't have to worry about that on our side.

anderson4j · 2025-01-08T08:53:57Z

packages/language-support/src/syntaxValidation/syntaxValidation.ts

-
-    const start = Position.create(
-      e.position.line - 1 + cmd.start.line - 1,
-      e.position.column - 1 + (e.position.line === 1 ? cmd.start.column : 0),


Similar thing here: we check e.position.line === 1 here, while the new code checks range.start.line === 0.
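
A minimal sketch of the adjustment discussed in these two comments, assuming the worker already returns 0-based lines and characters relative to the command, and that the command start carries a 1-based line. `EditorPosition`, `CommandStart` and `toDocumentPosition` are hypothetical names, not the actual syntaxValidation.ts code:

```typescript
// Hypothetical, simplified shapes for illustration only.
interface EditorPosition {
  line: number; // 0-based
  character: number; // 0-based
}

interface CommandStart {
  line: number; // assumed 1-based
  column: number; // assumed 0-based
}

// Translates a position relative to the command into a document position.
function toDocumentPosition(
  errorStart: EditorPosition,
  cmdStart: CommandStart,
): EditorPosition {
  return {
    // Only the command start still needs the -1, since the error line is already 0-based.
    line: errorStart.line + cmdStart.line - 1,
    // The column offset of the command only applies on the command's first line.
    character:
      errorStart.character + (errorStart.line === 0 ? cmdStart.column : 0),
  };
}
```

Under these assumptions, subtracting 1 a second time would push an error on the first line of the first command to line -1, which matches what was seen in the tests.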

anderson4j · 2025-01-08T09:13:32Z

packages/language-support/src/syntaxValidation/syntaxValidation.ts

-    const startOffset = e.position.offset + cmd.start.start;
-    const toExplore: ParseTree[] = [parseResult.ctx];
-
-    while (toExplore.length > 0) {


Comparing this to the new, it seems like we're now getting the end position from the semantic analysis, and thus don't need to search the parseTree[] for it?

Correct, I've moved that down to the semantic analysis, because we have proper parsing there (Cypher 5 and Cypher 25), whereas on the JavaScript side we'll only maintain the Cypher 25 parser.

anderson4j · 2025-01-08T09:51:43Z

packages/react-codemirror/src/e2e_tests/performanceTest.spec.tsx

@@ -100,7 +100,7 @@ test('benchmarking & performance test session', async ({

  await editorPage.checkErrorMessage(
    'RETRN',
-    'Unexpected token. Did you mean RETURN?',
+    `Invalid input 'RETRN': expected a graph pattern, 'FOREACH', ',', 'ORDER BY', 'CALL', 'CREATE', 'LOAD CSV', 'DELETE', 'DETACH', 'FINISH', 'INSERT', 'LIMIT', 'MATCH', 'MERGE', 'NODETACH', 'OFFSET', 'OPTIONAL', 'REMOVE', 'RETURN', 'SET', 'SKIP', 'UNION', 'UNWIND', 'USE', 'USING', 'WHERE', 'WITH' or <EOF>`,


Yeah, like we mentioned in the call yesterday, it would probably be nicest to move some of the syntax validation logic that produced the better messages before, so that it is now executed in the semantic analysis.

anderson4j · 2025-01-08T10:49:23Z

packages/language-support/src/tests/syntaxValidation/syntacticValidation.test.ts

        offsets: {
-          end: 114,


Think we mentioned these changes in offsets yesterday, but I can't quite remember - why do they happen again?

I've managed to fix a few of these. They were happening because syntax errors in the database only carry the start position.

Adding the end position for all of those is a really big task, so I've managed to rescue the tokens we lexed and find the token that corresponds to this position. It works better in most cases, I'd say.
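
A minimal sketch of that token lookup, assuming each lexed token exposes inclusive `start` and `stop` offsets (as ANTLR tokens do). `LexedToken` and `endOffsetFromTokens` are hypothetical names, not the code in the transpiled semantic analysis:

```typescript
// Hypothetical, simplified token shape: inclusive start/stop character offsets.
interface LexedToken {
  start: number;
  stop: number;
}

// Finds the token covering the error's start offset and returns the
// exclusive end offset of that token, or undefined if no token covers it.
function endOffsetFromTokens(
  tokens: LexedToken[],
  startOffset: number,
): number | undefined {
  const token = tokens.find(
    (t) => t.start <= startOffset && startOffset <= t.stop,
  );
  return token === undefined ? undefined : token.stop + 1;
}
```

When no token covers the offset, a caller would still need a fallback, for example the whitespace-based guess described in an earlier comment.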

ncordon · 2025-01-08T23:37:01Z

packages/language-support/src/tests/syntaxValidation/syntacticValidation.test.ts

-  test('Misspelt keyword in the middle of the statement', () => {
-    const query = "MATCH (n:Person) WERE n.name = 'foo'";


This test was repeated

Moves errors to the semantic analysis worker

069a006

ncordon added 4 commits December 31, 2024 10:18

Fixes failing e2e react codemirror tests

24cea9b

Fixes more syntactic and semantic errors

cf1409e

Fixes rest of the unit tests

31a3366

Merge remote-tracking branch 'origin/main' into errors-semantic-analy…

10478da

…sis-worker

Fixes e2e test

09cb9b6

ncordon force-pushed the errors-semantic-analysis-worker branch from f1700c6 to 09cb9b6 Compare January 6, 2025 15:03

ncordon changed the title ~~Moves errors to the semantic analysis worker~~ Moves syntax errors to the semantic analysis worker Jan 6, 2025

ncordon requested a review from anderson4j January 7, 2025 09:40

ncordon assigned anderson4j Jan 7, 2025

Cleans up

1558eea

anderson4j approved these changes Jan 8, 2025

Fixes some more positions

94d289a

ncordon merged commit 043d766 into main Jan 9, 2025
4 checks passed

ncordon deleted the errors-semantic-analysis-worker branch January 9, 2025 10:35
