chore: Add new style-value-parser package #743

Draft
wants to merge 31 commits into
base: main

Conversation

nmn
Contributor

@nmn nmn commented Oct 11, 2024

Background

We're currently using postcss-value-parser for a few purposes but it is quite rudimentary.

lightningcss was considered but its API only parses entire CSS files and not individual values. It would also be a very heavy tool for what we need as it's an entire CSS post-processor. It would also allow all valid CSS and it would be more difficult to manually add our own constraints to it.

New CSS Value Parser

This PR introduces a new parser which has been tailored for StyleX:

  • It will parse all CSS types
  • We can define our own constraints for the values we will accept for various properties.
    • We can explicitly disallow multi-value shorthands
    • We can enforce a certain order for values within properties that can be written in arbitrary order, such as box-shadow.
  • It is pure-JS
  • Its .toString() methods will output normalised strings for various values
    • Over time, we can even convert certain types. (we are starting by being conservative)
      • e.g. we could convert:
        • named_colors to hashes
        • rgb to hashes
        • units such as cm and inch to px.
        • etc...
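
The conversions listed above can be illustrated with a minimal sketch. Everything here is hypothetical: `normalizeColor`, `normalizeLength`, and the tiny named-color table are invented for illustration and are not the package's API. The sketch assumes the legacy comma-separated `rgb()` form and the CSS absolute-length ratios (1in = 96px, 1in = 2.54cm).

```javascript
// Hypothetical helpers for illustration only; not the style-value-parser API.

// A tiny subset of the CSS named colors (assumption: the real table is much larger).
const NAMED_COLORS = new Map([
  ['red', '#ff0000'],
  ['white', '#ffffff'],
  ['black', '#000000'],
]);

// Convert three 0-255 channels to a six-digit hex ("hash") color.
function toHex(r, g, b) {
  return (
    '#' +
    [r, g, b]
      .map((c) => Math.min(255, c).toString(16).padStart(2, '0'))
      .join('')
  );
}

// Normalise named colors and legacy `rgb(r, g, b)` to hex; leave anything else untouched.
function normalizeColor(value) {
  const v = value.trim().toLowerCase();
  if (NAMED_COLORS.has(v)) {
    return NAMED_COLORS.get(v);
  }
  const m = v.match(/^rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)$/);
  if (m) {
    return toHex(Number(m[1]), Number(m[2]), Number(m[3]));
  }
  return value;
}

// Convert `in` and `cm` lengths to `px`; leave other units untouched.
function normalizeLength(value) {
  const m = value.trim().match(/^(-?\d*\.?\d+)(px|in|cm)$/i);
  if (!m) {
    return value;
  }
  const n = Number(m[1]);
  const unit = m[2].toLowerCase();
  let px = n;
  if (unit === 'in') px = n * 96;
  if (unit === 'cm') px = (n * 96) / 2.54;
  // Round to avoid floating-point noise in the output string.
  return `${Number(px.toFixed(4))}px`;
}
```

Normalising this way means that `red`, `rgb(255, 0, 0)`, and `#ff0000` could all collapse to a single output string, which is what reduces duplicate styles.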

How it will be used

This parser will be used to:

  • validate styles in the ESLint plugin (where our checks are overly permissive)
  • validate styles within the Babel parser
  • normalise values within the Babel parser to reduce variance and have fewer duplicate styles
  • parse strings such as @media queries in order to add additional guarantees for them.
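
As a rough illustration of the validation use case, here is a minimal sketch of one possible constraint: rejecting multi-value shorthands. The function names and the choice of `margin` and `padding` as restricted properties are assumptions made for this example, not the rules the package actually enforces.

```javascript
// Hypothetical validation sketch; names and rules are assumptions.

// Split a value on whitespace that is not nested inside parentheses,
// so that `calc(1px + 2px)` counts as a single value.
function splitTopLevel(value) {
  const parts = [];
  let depth = 0;
  let current = '';
  for (const ch of value.trim()) {
    if (ch === '(') depth++;
    if (ch === ')') depth--;
    if (/\s/.test(ch) && depth === 0) {
      if (current !== '') parts.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current !== '') parts.push(current);
  return parts;
}

// Assumption: these properties only accept a single value.
const SINGLE_VALUE_ONLY = new Set(['margin', 'padding']);

function validate(property, value) {
  const parts = splitTopLevel(value);
  if (SINGLE_VALUE_ONLY.has(property) && parts.length > 1) {
    return {
      valid: false,
      message: `${property} accepts a single value, got ${parts.length}`,
    };
  }
  return { valid: true };
}
```

The same check could run in both the ESLint rule and the Babel step, so that the two report the same errors.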

How complete is it?

There should already be a parser for every CSS type.

There are a few style-property-specific parsers that compose the "type" parsers together. These are there as examples to get us started. They need to be updated to allow CSS variables to be used in various positions.

There are many unit tests, but test coverage could be improved, especially for the toString methods.

@facebook-github-bot added the "CLA Signed" label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Oct 11, 2024

github-actions bot commented Oct 11, 2024

workflow: benchmarks/size

Comparison of minified (terser) and compressed (brotli) size results, measured in bytes. Smaller is better.

@stylexjs/[email protected] size:compare
./size-compare.js /tmp/tmp.k9eZhCOmn0 /tmp/tmp.EnpeTS7RRX

Results                              Base          Patch         Ratio
stylex/lib/stylex.js
· compressed                         985           985           1.00
· minified                           3,154         3,154         1.00
stylex/lib/StyleXSheet.js
· compressed                         1,266         1,266         1.00
· minified                           3,776         3,776         1.00
rollup-example/.build/bundle.js
· compressed                         567,170       567,170       1.00
· minified                           10,232,457    10,232,457    1.00
rollup-example/.build/stylex.css
· compressed                         100,609       100,609       1.00
· minified                           755,721       755,721       1.00

@nmn
Contributor Author

nmn commented Feb 10, 2025

I have rebased and squashed this branch to two commits. The first commit is everything that was in this PR so far.

The second commit and future commits will be used as an experiment to try to use a tokenizer to speed up the parser. So far, the build performance has been the blocker to shipping this parser.

@nmn
Contributor Author

nmn commented Feb 11, 2025

I feel fairly confident that the new parser implementation is fast enough:

> @stylexjs/[email protected] benchmark
> babel-node src/__benchmarks__/alpha-value.bench.js

Original Parser x 692 ops/sec ±4.18% (43 runs sampled)
Token Parser x 18,246 ops/sec ±0.44% (86 runs sampled)
Fastest is Token Parser

@nmn
Contributor Author

nmn commented Feb 13, 2025

Why Parser

  1. We use postcss-value-parser today to pre-process and format style values.
    • This parser kinda sucks, we want to do better
    • e.g. changing rgb colors to hash colors
    • changing named colors to hash colors
    • changing modern CSS syntax like
      • rgb(200 200 200 / 50%)
      • to better supported syntax like rgb(200,200,200,0.5)
  2. Better/Stricter validation in the Babel plugin
  3. Better validation in the valid-styles rule
  4. Parse Media Queries, to support two features
    • Automatically negating media queries that come later
    • Sharable constants (like media queries)
      • Media Query simplification
How

  1. Old parser consumed one character at a time
  2. New parser builds on @csstools/css-tokenizer and consumes one token at a time.
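
To make the difference concrete, here is a minimal sketch of token-at-a-time parsing. The `tokenize` function below is a simplified stand-in written for this example; the actual PR builds on @csstools/css-tokenizer, whose token format differs. `parseLength` and the unit list are likewise hypothetical.

```javascript
// Simplified stand-in tokenizer (assumption: NOT @csstools/css-tokenizer).
// It recognises whitespace, numbers, percentages, dimensions, idents and
// single-character delimiters, which is enough for this illustration.
function tokenize(css) {
  const pattern =
    /(\s+)|(-?\d*\.?\d+)(%|[a-z]+)?|([a-z_-][a-z0-9_-]*)|(.)/gi;
  const tokens = [];
  for (const m of css.matchAll(pattern)) {
    if (m[1] != null) {
      tokens.push({ type: 'whitespace' });
    } else if (m[2] != null) {
      const value = Number(m[2]);
      if (m[3] == null) {
        tokens.push({ type: 'number', value });
      } else if (m[3] === '%') {
        tokens.push({ type: 'percentage', value });
      } else {
        tokens.push({ type: 'dimension', value, unit: m[3].toLowerCase() });
      }
    } else if (m[4] != null) {
      tokens.push({ type: 'ident', value: m[4] });
    } else {
      tokens.push({ type: 'delim', value: m[5] });
    }
  }
  return tokens;
}

// Assumption: a small subset of CSS length units.
const LENGTH_UNITS = new Set(['px', 'em', 'rem', 'vh', 'vw', 'cm', 'in']);

// A token-based <length> parser inspects one token instead of
// re-reading digits and unit letters character by character.
function parseLength(tokens, index = 0) {
  const t = tokens[index];
  if (t != null && t.type === 'dimension' && LENGTH_UNITS.has(t.unit)) {
    return { node: { value: t.value, unit: t.unit }, next: index + 1 };
  }
  return null;
}
```

Because the number and unit arrive pre-grouped in a single token, each type parser does a constant amount of work per token rather than per character.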

@nmn
Contributor Author

nmn commented Feb 15, 2025

Some interesting data from adding more benchmarks:

<length>

Legacy Parser x 326 ops/sec ±5.40% (48 runs sampled)
Token Parser x 1,918,503 ops/sec ±2.40% (95 runs sampled)
Fastest is Token Parser, 5892.96x faster than Legacy Parser

<frequency>

Legacy Parser x 11,223 ops/sec ±3.46% (46 runs sampled)
Token Parser x 2,391,566 ops/sec ±0.23% (95 runs sampled)
Fastest is Token Parser, 212.09x faster than Legacy Parser

<flex>

Legacy Parser x 10,284 ops/sec ±3.87% (41 runs sampled)
Token Parser x 3,328,397 ops/sec ±1.55% (96 runs sampled)
Fastest is Token Parser, 322.65x faster than Legacy Parser

<dimension>

Legacy Parser x 307 ops/sec ±5.45% (36 runs sampled)
Token Parser x 3,395,021 ops/sec ±0.26% (99 runs sampled)
Fastest is Token Parser, 11066.03x faster than Legacy Parser

<color>

Legacy Parser x 1,232 ops/sec ±7.48% (36 runs sampled)
Token Parser x 103,229 ops/sec ±0.26% (94 runs sampled)
Fastest is Token Parser, 82.79x faster than Legacy Parser

<filter-function>

Legacy Parser x 242 ops/sec ±3.57% (46 runs sampled)
Token Parser x 39,571 ops/sec ±1.49% (94 runs sampled)
Fastest is Token Parser, 162.73x faster than Legacy Parser

<blend-mode>

Legacy Parser x 42,550,779 ops/sec ±1.26% (94 runs sampled)
Token Parser x 4,519,588 ops/sec ±0.23% (94 runs sampled)
Fastest is Legacy Parser, 8.41x faster than Token Parser

<calc-constant>

Legacy Parser x 59,225 ops/sec ±6.11% (43 runs sampled)
Token Parser x 48,576 ops/sec ±0.39% (87 runs sampled)
Fastest is Legacy Parser, 0.22x faster than Token Parser

<percentage>

Legacy Parser x 12,401 ops/sec ±4.24% (43 runs sampled)
Token Parser x 8,330,382 ops/sec ±1.12% (97 runs sampled)
Fastest is Token Parser, 670.76x faster than Legacy Parser

<easing-function>

Legacy Parser x 1,097 ops/sec ±3.59% (43 runs sampled)
Token Parser x 6,847 ops/sec ±0.54% (84 runs sampled)
Fastest is Token Parser, 5.24x faster than Legacy Parser

<angle>

Legacy Parser x 11,766 ops/sec ±3.86% (45 runs sampled)
Token Parser x 2,546,599 ops/sec ±0.29% (93 runs sampled)
Fastest is Token Parser, 215.43x faster than Legacy Parser

<angle-percentage>

Legacy Parser x 10,466 ops/sec ±3.78% (42 runs sampled)
Token Parser x 3,757,365 ops/sec ±0.23% (96 runs sampled)
Fastest is Token Parser, 358.01x faster than Legacy Parser

<custom-ident>

Legacy Parser x 2,844 ops/sec ±4.76% (40 runs sampled)
Token Parser x 2,127,567 ops/sec ±0.18% (99 runs sampled)
Fastest is Token Parser, 747.21x faster than Legacy Parser

<length-percentage>

Legacy Parser x 301 ops/sec ±3.88% (46 runs sampled)
Token Parser x 135,377 ops/sec ±0.99% (94 runs sampled)
Fastest is Token Parser, 448.11x faster than Legacy Parser

<dashed-ident>

Legacy Parser x 2,460 ops/sec ±3.98% (41 runs sampled)
Token Parser x 2,034,030 ops/sec ±0.27% (93 runs sampled)
Fastest is Token Parser, 825.97x faster than Legacy Parser

<basic-shape>

Legacy Parser x 226 ops/sec ±4.37% (44 runs sampled)
Token Parser x 6,526 ops/sec ±0.29% (78 runs sampled)
Fastest is Token Parser, 27.93x faster than Legacy Parser

<alpha-value>

Legacy Parser x 13,578 ops/sec ±2.99% (50 runs sampled)
Token Parser x 148,060 ops/sec ±0.31% (89 runs sampled)
Fastest is Token Parser, 9.90x faster than Legacy Parser


It looks like the legacy parser is faster at parsing exact strings, but the new parser is a LOT faster at everything else.

I'll investigate if there is a simple optimization to be made here.
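
A note on reading the "Nx faster" figures: they appear to be consistent with (fast ops/sec ÷ slow ops/sec) − 1 rather than the plain ratio. This is an inference from the logged numbers, not something stated in the benchmark script, and `relativeSpeedup` below is a hypothetical helper. Under that reading, "0.22x faster" for <calc-constant> would mean the legacy parser is about 22% faster, not slower.

```javascript
// Hypothetical helper: assumes "Nx faster" means (fast / slow) - 1.
function relativeSpeedup(fastOpsPerSec, slowOpsPerSec) {
  return fastOpsPerSec / slowOpsPerSec - 1;
}

// <calc-constant>: Legacy 59,225 ops/sec vs Token 48,576 ops/sec
console.log(relativeSpeedup(59225, 48576).toFixed(2)); // "0.22"

// <blend-mode>: Legacy 42,550,779 ops/sec vs Token 4,519,588 ops/sec
console.log(relativeSpeedup(42550779, 4519588).toFixed(2)); // "8.41"
```

Small differences from the logged figures are expected for other rows, since the printed ops/sec values are rounded.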

3 participants