Fix splitCssText again #1640
…n with `isAttachIframe`' test - it was working for me when the test was run in isolation (`-t` option), but when the entire cross-origin-iframes test was run, the change of iframe contents didn't seem to happen in time
… and we end up not finding a unique one - we should just go with the first one (note: this is still not binary search so could exhibit pathological behaviour)
… (Posthog) ... see comment from MartinWorkfully: PostHog/posthog-js#1668
🦋 Changeset detected. Latest commit: 5743c7e. The changes in this PR will be included in the next version bump. This PR includes changesets to release 19 packages.
const prevTextContent = childNodes[i - 1].textContent;
if (prevTextContent && typeof prevTextContent === 'string') {
  // pick the first matching point which respects the previous chunk's approx size
  const prevMinLength = normalizeCssString(prevTextContent).length;
i guess it's somewhere here that means if you run a test twice (or some multiple times) then you don't get the same output
(at least in my experience of testing whether this was deterministic when trying to figure out what was happening)
I don't quite understand this comment, but I would say the algorithm is indeed deterministic, but maybe you mean it will behave differently based on different sized inputs because of the `jLimit` bit?
oh, i wrote a test that ran the split multiple times and compared the output and it didn't match
was generally whitespace ending up on different sides of a split
Ah okay, yes this is possible ... basically it stops when the normalised versions match (although I still think it would be deterministic given the same input twice). I can't imagine there being a problem if one side has more whitespace than it should have.
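To make the point about whitespace concrete, here is a minimal sketch, assuming a normalizer built from the regexes quoted in this review. `normalizeCss` is a hypothetical stand-in, not the function exported by the library. It shows two split points that differ only in which side receives the whitespace, and checks that the pieces still compare equal once normalized.

```typescript
// Hypothetical helper mirroring the regexes quoted in this review; not the library's export.
const normalizeCss = (s: string): string =>
  s.replace(/(\/\*[^*]*\*\/)|[\s;]/g, '').replace(/0px/g, '0');

const css = 'a { color: red; }  b { color: blue; }';
const endOfFirstRule = css.indexOf('}') + 1; // split right after the first rule
const startOfSecondRule = css.indexOf('b {'); // split right before the second rule

// Same text, two split points: the two spaces land on different sides.
const earlySplit = [css.slice(0, endOfFirstRule), css.slice(endOfFirstRule)];
const lateSplit = [css.slice(0, startOfSecondRule), css.slice(startOfSecondRule)];

// Piece by piece, the normalized forms are identical.
const sameAfterNormalizing = earlySplit.every(
  (piece, i) => normalizeCss(piece) === normalizeCss(lateSplit[i]),
);
```

Both splits concatenate back to the original text, so under this sketch the only observable difference is where the whitespace sits.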
if (_testNoPxNorm) {
  return cssText.replace(/(\/\*[^*]*\*\/)|[\s;]/g, '');
} else {
  return cssText.replace(/(\/\*[^*]*\*\/)|[\s;]/g, '').replace(/0px/g, '0');
i briefly tested a very naive loop parser and it was slower than regex replace - i guess because browsers/v8 are doing some magic to optimise this already
but I didn't test it over a range of inputs
this is (from my testing) at best O(n) for whitespace - and for clarity since i'm not much of a comp sci person. if you insert whitespace into the input then this gets slower the more whitespace is present
The performance of the normalization function could be improved. I've moreso tried to ensure it's not called repeatedly on the same piece of css (with the 'binary search' style changes in #1615 ).
i am really not sure how i could roll this out in prod
the last version basically doubled our support load and i'm ending this week exhausted as a result
i have no idea of how to test if this fixes for all cases or just for a specific case
we seem to be writing a css parser, and i'm wondering if adopting a css parser would be safer
We're not attempting to write a CSS parser at record time, we are using In most cases, the The mutation issues that this splitting solves are also mostly theoretical, so it's possible to patch/short-circuit the
I appreciate that the algorithm implemented here is not simple; I've thought about abstracting it out to a third party library "split a string according to an array of related substrings which can be matched via a normalization function" ... I haven't looked into whether such a thing already exists. This PR is definitely an improvement and I believe catches the last pathological case, particularly as there is now an additional There is another large PR to move CSS parsing off the main thread at record time, however that would likely have hidden this problem rather than bringing it to the fore so painfully. I've also another plan to ditch the whole
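The "split a string according to an array of related substrings which can be matched via a normalization function" idea can be sketched roughly as follows. This is a deliberately naive linear scan written for illustration, not the algorithm in this PR (which bounds its search and avoids re-normalizing the same text). The names `splitByNormalizedChunks` and `normalizeChunk` are hypothetical.

```typescript
type Normalize = (s: string) => string;

// Hypothetical normalizer mirroring the regexes quoted in this review.
const normalizeChunk: Normalize = (s) =>
  s.replace(/(\/\*[^*]*\*\/)|[\s;]/g, '').replace(/0px/g, '0');

// Split `full` into pieces that line up with `chunks`, where each chunk may be
// a differently serialized form of the corresponding part of `full`.
function splitByNormalizedChunks(
  full: string,
  chunks: string[],
  normalize: Normalize,
): string[] {
  const pieces: string[] = [];
  let rest = full;
  // The last chunk needs no search: it takes whatever remains.
  for (let i = 0; i < chunks.length - 1; i++) {
    const target = normalize(chunks[i]);
    let cut = -1;
    // Naive scan: take the first prefix whose normalized form equals the target.
    for (let j = 0; j <= rest.length; j++) {
      if (normalize(rest.slice(0, j)) === target) {
        cut = j;
        break;
      }
    }
    if (cut === -1) break; // no match: stop and return the remainder unsplit
    pieces.push(rest.slice(0, cut));
    rest = rest.slice(cut);
  }
  pieces.push(rest);
  return pieces;
}

// Authored text versus the serialized form of each child node.
const fullText = 'a { margin-top: 0; } b { color: red; }';
const nodeTexts = ['a { margin-top: 0px; }', 'b { color: red; }'];
const pieces = splitByNormalizedChunks(fullText, nodeTexts, normalizeChunk);
```

Because the scan stops at the first matching prefix, the space between the two rules ends up at the start of the second piece. The sketch calls `normalize` once per candidate prefix, which is exactly the repeated work that a real implementation would want to avoid.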
@@ -463,19 +470,24 @@ export function normalizeCssString(cssText: string): string {
export function splitCssText(
  cssText: string,
  style: HTMLStyleElement,
  _testNoPxNorm = false,
Would be great to get some tsdoc documentation as to what this does, especially since `_variableName` normally means an unused variable in JS/TS land.
See also PostHog/posthog-js#1668 for the downstream bug report, and further discussion in Slack (Thanks Paul D'Ambra for report and pointers)
This is a further improvement after performance fixes in #1615
This covers new scenarios as outlined in the tests. Test cases were recreated from a very large inline style node in https://hiring.workfully.com/signin which looks like it was in a shadow root `:host(.productfruits--container)`, although I can't quite find it there now. I pulled the example files from a breakpoint and have them locally, but the test cases here incorporate the important bit, including the split in the middle of a statement.

Ultimately the problem with the content which triggered this case was that `margin-top: 0;` as authored gets serialized to `margin-top: 0px;`, which was preventing us from finding the right point to split between normalized/unnormalized.
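As a rough illustration of the `0` versus `0px` mismatch, the sketch below reuses the two regexes from the diff in a standalone function. The name `normalizeForMatch` and its `noPxNorm` flag are hypothetical stand-ins for `normalizeCssString` and `_testNoPxNorm`, chosen so as not to suggest this is the library's exact code.

```typescript
// Hypothetical stand-in for the normalization step discussed in this PR.
function normalizeForMatch(cssText: string, noPxNorm = false): string {
  // Drop simple /* ... */ comments, all whitespace, and semicolons.
  const stripped = cssText.replace(/(\/\*[^*]*\*\/)|[\s;]/g, '');
  // Unless disabled, rewrite every `0px` to `0` so both spellings compare equal.
  return noPxNorm ? stripped : stripped.replace(/0px/g, '0');
}

const authored = 'a { margin-top: 0; }';
const serialized = 'a { margin-top: 0px; }';

// With px normalization the two forms match; without it they do not.
const sameWithPxNorm =
  normalizeForMatch(authored) === normalizeForMatch(serialized);
const sameWithoutPxNorm =
  normalizeForMatch(authored, true) === normalizeForMatch(serialized, true);
```

Note that the `0px` replacement is purely textual, so `10px` also becomes `10`. That is harmless for matching as long as the same normalization is applied to both sides of the comparison.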