refactor(chevrotain): set noImplicitAny: true and fix errors #1688

NaridaL · 2021-10-27T16:26:33Z

No description provided.

packages/chevrotain/src/parse/grammar/checks.ts

packages/chevrotain/src/scan/lexer_public.ts

packages/chevrotain/src/parse/parser/traits/parser_traits.ts

packages/chevrotain/tsconfig.json

packages/chevrotain/src/parse/parser/traits/recognizer_engine.ts

packages/chevrotain/src/parse/parser/traits/recoverable.ts

packages/chevrotain/src/scan/lexer_public.ts

packages/chevrotain/test/parse/cst_spec.ts

packages/chevrotain/test/scan/lexer_spec.ts

bd82 · 2021-10-28T20:56:58Z

Okay @NaridaL there are a couple of comments that needs to be resolved and this can be merged.

NaridaL · 2021-10-29T16:14:24Z

@bd82 This work for you?

…

On Fri, Oct 29, 2021, 01:09 Shahar Soel ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In packages/chevrotain/src/parse/parser/traits/parser_traits.ts <#1688 (comment)> : > @@ -35,6 +35,11 @@ export type MixedInParser = ParserConstructorImpel & GastRecorder & PerformanceTracer +export type ParserMethod<T> = ((idx: number, args?: any[]) => T) & { It is very much needed. It is used at *runtime* to understand the "grammar location" inside the specific rule and thus use the correct lookahead logic to use. Basically, to distinguish between OPTION1 vs OPTION2 vs OPTION-N instead the same rule. Here is where the unique ID for the "grammar location" is built: - https://github.com/Chevrotain/chevrotain/blob/master/packages/chevrotain/src/parse/grammar/keys.ts - Note the bit operations as they ended up much faster than concating strings. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1688 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ACSTYSQWXA7N5TX675PZUVDUJHQ2PANCNFSM5G2YYATA> . Triage notifications on the go with GitHub Mobile for iOS <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.

bd82 · 2021-10-30T12:57:57Z

packages/types/api.d.ts

@@ -875,21 +880,21 @@ export declare class CstParser extends BaseParser {
  /**
   * Creates a Grammar Rule
   */
-  protected RULE(
+  protected RULE<ARGS extends unknown[]>(


What is the purpose of "extends unknown" here?

<T extends unknown> does not narrow microsoft/TypeScript#38456

Would the generic argument gets resolved / inferred correctly according to the provided implementation?

Meaning would a SUBRULE call that provides ARGS have type checks on the ARGS depending on which rule is invoked?

extends "unknown array". This is needed to do ...args: ARGS. ... which I wasn't doing. Should be fixed now.

The inference should work as expected yes.

packages/types/api.d.ts

examples/parser/parametrized_rules/parametrized.js

packages/chevrotain/src/parse/parser/traits/recognizer_engine.ts

bd82 · 2021-10-30T13:31:13Z

packages/chevrotain/src/parse/parser/traits/recognizer_engine.ts

      try {
+        this.ruleInvocationStateUpdate(shortName, ruleName, this.subruleIdx)


now that you simplified this logic, we may be able to to get rid of the conditional related to outputCst

But not in this PR its getting large anyhow

packages/chevrotain/src/parse/parser/traits/recognizer_engine.ts

bd82 · 2021-10-30T13:35:01Z

packages/chevrotain/src/parse/parser/traits/recognizer_engine.ts

      try {
+        this.ruleInvocationStateUpdate(shortName, ruleName, this.subruleIdx)
+        this.subruleIdx = 0


why do we need to reset this.subruleIdx = 0' ? won't the next SUBRULE` invocation take care of it?

invokeRuleWithTry is a hotspot, so we want to avoid any none mandatory statements.

I believe there is a reset method on this class/trait where it should be reset to 0 instead.

I've got it in init, it's not strictly required in reset, as top-level rule calls should work with any idx right?

Yeah, good point, it may not be strictly required.
I may add it later though to be explicit and be more consistent between different invocations (inputs) on the same parser instance.

packages/chevrotain/src/parse/parser/parser.ts

bd82 · 2021-10-30T13:52:17Z

packages/chevrotain/src/parse/parser/traits/recognizer_engine.ts

    let ruleResult
    try {
      const args = options !== undefined ? options.ARGS : undefined
-      ruleResult = ruleToCall.call(this, idx, args)
+      this.subruleIdx = idx
+      ruleResult = ruleToCall.apply(this, args)


I wonder if call vs apply would have different performance here

In invokeWithTry we are using the spread operator and then calling apply again.
So it seems we are "packing" and "unpacking" the ARGS too many times.

invokeRuleWithTry(this: MixedInParser, ...args: ARGS[])

If we remove the spread operator there, we could then keep this line with the possibly faster call and avoid un-needed packs/unpacks ?

Yes, but then the result of RULE wouldn't have the same signature as the implementation definition, which is the most intuitive. As there are going to be no arguments in most cases, I think a minor performance impact would be worth it.

I will investigate the performance impact (if any) after we merge.

bd82 · 2021-10-30T14:03:11Z

@bd82 This work for you?

Hi @NaridaL

I really like the simplification of the Idx parameter 👍
I left some comments related to performance as there are changes in a hotspot.
I will attempt to benchmark some variations by applying only the recognizer-engine changes on-top of v9.1.0 tag.
Although some differences may be too small to measure, so we will have to use common sense 😄

I will also try the api.d.ts changes on a Parser implemented in TypeScript, so I can understand the limits of the type inference capabilities and the usage of extends unknown

NaridaL · 2021-10-30T16:26:12Z

I've worked it the comments. I also changed the typings on RULE slightly again, so that all parameters must be optional. I.e. must be assignable to () => any.

packages/chevrotain/test/parse/recognizer_spec.ts

bd82 · 2021-10-30T18:05:24Z

I've worked it the comments. I also changed the typings on RULE slightly again, so that all parameters must be optional. I.e. must be assignable to () => any.

The new signatures look much better, and the use of Parameters mapped type is also very clear and understandable.

One question,
Is forcing the parameters to be optional is due a limitation of TypeScript or a personal preference?

bd82 · 2021-10-30T18:09:42Z

Does this PR now replace #1672 ?

NaridaL · 2021-10-30T18:10:09Z

One question, Is forcing the parameters to be optional is due a limitation of TypeScript or a personal preference?

It's by design, mainly because args will be undefined during grammar recording.

NaridaL · 2021-10-30T18:10:37Z

Does this PR now replace #1672 ?

Mostly, I will have a look at it once this is merged.

bd82 · 2021-10-30T18:12:55Z

I'm merging this.
I may make micro changes in a different PR and ping you for review:

performance evaluations.
reset of idx
Can the params be none-optional

bd82 · 2021-10-30T19:02:53Z

Looks like this PR provided a nice performance boost.
v9.1.0 vs current master version is: (I think the difference is mostly from the removal of the default value for idx in the rule wrapper)

NaridaL · 2021-10-30T19:08:50Z

Hmm I suspect the improvement is from the fact there is now only one wrapper lambda instead of one...

I tried to profile the benchmark in chrome but didn't get it to work as the parser itself runs in an iframe...

bd82 · 2021-10-30T19:16:39Z

Hmm I suspect the improvement is from the fact there is now only one wrapper lambda instead of one...

Could definitively also affect this.
I also have some dim recollection of that default value cause some slight performance regression.

I tried to profile the benchmark in chrome but didn't get it to work as the parser itself runs in an iframe...

I recall I kept trying more and more ways to isolate the benchmarks, due to strange behavior and results.
This latest vs Dev benchmark not only runs in an iframe but even uses WebWorkers... 😄

bd82 · 2021-10-30T19:24:18Z

I am now inspecting the cost of the spread operator in the invokeWithTry
It gets compiled to :

function invokeRuleWithTry() {
            var args = [];
            for (var _i = 0; _i < arguments.length; _i++) {
                args[_i] = arguments[_i];
            }
// ...
}

Which is obviously ineffective, but 99% of the time there are no arguments, so only a single additional statement would be executed.

On the other hand I recall some versions of V8 had problems optimizing functions which uses the arguments special variable, but perhaps this is no longer an issue in modern versions of V8.

bd82 · 2021-10-30T19:40:15Z

I compare master version, versus one that changes the generated code to use the spread operator:

 function invokeRuleWithTry(...args) {
// ...
}

I get some very interesting results:

nice boost for JSON / CSS scenarios
A regression (back to v9.1.0 range) in the ECMAScript scenario.

I am guessing there is some magic combination of factors that can enable V8 to optimize the ECMAScript grammar better.
So not much I can do about that. But the performance boost in the other scenarios is small but none negligible.
I wonder if I should try to change to compilation target to ES2015 again.

bd82 · 2021-10-30T20:00:56Z

Removing the spread operator and replacing the apply in subruleInternal with call produces a
similar (but slightly smaller) performance boost (3-5% instead of 4-6%).

I wonder what should be done:

Keep it as is and "give up" on ~5% parser (without lexer) performance boost, yet gain a large performance boost when rule parameters are used (ECMA5 grammar).
Try to change target to ES2015, but that resulted in issues and even a slight performance degradation last time I attempted it:
- Evaluate targeting more modern runtimes. #1272 (comment)
Perform some regexp text replacements hacks to use ES2015 spread operator instead of the TSC transpiled version.
- May break source maps
- May break the bundle which has no automated tests.

At the moment I think we should just enjoy the small boost we already gained.
and perhaps the large boost in complex grammars that use the parametrized rules is worth
it over a small boost in simpler grammars.

For reference I am using Chrome 97 for testing.

NaridaL · 2021-10-30T21:02:47Z

Definitely not 3., that seems very difficult to maintain and prone to break if TypeScript or other tools change things.

I've tested it, as far as I can see there are only 3-4 problems to fix to get ES2015 to work. As for performance degradation, profiling might show where the problem is with the ES2015 version.

bd82 · 2021-10-30T21:42:36Z

The main problem with target ES2015 is that it seems to break prototype inheritance from Chevrotain classes.
Maybe its not a huge issue if we are targeting ES2015 though...
But it does mean more examples and such would have to be fixed.

NaridaL commented Oct 27, 2021

View reviewed changes

packages/chevrotain/src/parse/grammar/checks.ts Outdated Show resolved Hide resolved

NaridaL commented Oct 27, 2021

View reviewed changes

packages/chevrotain/src/parse/grammar/checks.ts Show resolved Hide resolved

NaridaL force-pushed the dev/chevrotain-noImplicitAny branch 2 times, most recently from b31a5a7 to c0c6ab2 Compare October 27, 2021 16:34

refactor(chevrotain): set noImplicitAny: true and fix errors

76141af

NaridaL force-pushed the dev/chevrotain-noImplicitAny branch from c0c6ab2 to 76141af Compare October 27, 2021 16:35

NaridaL commented Oct 27, 2021

View reviewed changes

packages/chevrotain/src/scan/lexer_public.ts Show resolved Hide resolved

packages/chevrotain/src/parse/parser/traits/parser_traits.ts Outdated Show resolved Hide resolved

bd82 reviewed Oct 27, 2021

View reviewed changes

packages/chevrotain/tsconfig.json Show resolved Hide resolved

bd82 reviewed Oct 27, 2021

View reviewed changes

packages/chevrotain/src/parse/parser/traits/recognizer_engine.ts Show resolved Hide resolved

bd82 reviewed Oct 27, 2021

View reviewed changes

packages/chevrotain/src/parse/parser/traits/recoverable.ts Show resolved Hide resolved

bd82 reviewed Oct 27, 2021

View reviewed changes

packages/chevrotain/src/scan/lexer_public.ts Show resolved Hide resolved

bd82 reviewed Oct 27, 2021

View reviewed changes

packages/chevrotain/test/parse/cst_spec.ts Show resolved Hide resolved

bd82 reviewed Oct 27, 2021

View reviewed changes

packages/chevrotain/test/scan/lexer_spec.ts Show resolved Hide resolved

NaridaL added 2 commits October 28, 2021 23:15

refactor: move flatMap to utils

074875f

refactor: remove explicit idx parameter to ParserMethod

0af7457

NaridaL force-pushed the dev/chevrotain-noImplicitAny branch from 81f9638 to 0af7457 Compare October 29, 2021 15:51