feat: add TAR parser #33975

Skn0tt · 2024-12-12T11:19:55Z

Adds a TAR parser + basic test based on Max's draft. We'll use this in the Brotli-downloader, so we don't have to add dependencies.

Copilot

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 suggestion.

Comments skipped due to low confidence (1)

packages/playwright-core/src/utils/tar.ts:78

The chunk method does not handle the case where the fileStream is null properly. Add a check to ensure fileStream is not null before writing to it.

assert(this.fileStream);

packages/playwright-core/src/utils/tar.ts

Skn0tt · 2024-12-13T12:13:42Z

@mxschmitt i've added another test that uses one of our chromium builds. Looks like the extractor is working well!

mxschmitt

I had to make the following 3 changes in order to make it work with chrome-mac-arm64.tar.br. Not a huge fan of downloading the entire browser during the test-run - imo its fine to not have a unit test-coverage on it - since when we use it the bots will give us enough confidence / coverage. Lets maybe squeeze it into the actual PR when we implement br/tar file format? Also thinking if we should verify the checksums.

packages/playwright-core/src/utils/tar.ts

mxschmitt · 2024-12-14T19:40:29Z

packages/playwright-core/src/utils/tar.ts

+  }
+
+  async writeToDisk(outputPath: (path: string) => string) {
+    const fullPath = outputPath(this.name);


why is outputPath a callback? Ideally its a string and we should add a check so that users can't write 'outside' of the outputPath.

I made this a callback instead of a raw string, so that the calling side can implement logic like this:

https://gist.github.com/mxschmitt/0807f729ba68358e58bf5210b4607fdc#file-har-extract-mjs-L71-L76

It seemed important that we can do that, but at the same time it also didn't seem like a property of the TAR extraction, so it can live outside of it now.

Skn0tt · 2024-12-16T08:09:41Z

Good catch on the "prefix" field. Reading through the spec, this seemed like a legacy field to me so I omitted it initially. Added it back in 2c9bbf7.

Skn0tt · 2024-12-16T08:16:59Z

Lets maybe squeeze it into the actual PR when we implement br/tar file format?

I'd prefer merging it individually, so the TAR PR doesn't become insanely big.

Not a huge fan of downloading the entire browser during the test-run - imo its fine to not have a unit test-coverage on it - since when we use it the bots will give us enough confidence / coverage.

Makes sense! I'll remove the unit tests before merging then.

Also thinking if we should verify the checksums.

The TAR spec says The chksum field represents the simple sum of all bytes in the header block.. So it wouldn't verify the body correctness, just the headers. I'm torn wether it's worth it to implement that.

github-actions · 2024-12-16T08:52:32Z

Test results for "tests 1"

10 flaky

⚠️ [firefox-page] › page/page-evaluate.spec.ts:403:3 › should throw for too deep reference chain @firefox-ubuntu-22.04-node18
⚠️ [webkit-library] › library/inspector/cli-codegen-1.spec.ts:147:7 › cli codegen › should make a positioned click on a canvas @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/proxy.spec.ts:93:11 › should proxy local network requests › by default › localhost @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/selector-generator.spec.ts:313:7 › selector generator › should prioritize attributes correctly › role @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/trace-viewer.spec.ts:278:1 › should have correct snapshot size @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/trace-viewer.spec.ts:1350:1 › should remove noscript when javaScriptEnabled is set to true @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/trace-viewer.spec.ts:1474:1 › should not leak recorders @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/trace-viewer.spec.ts:1554:1 › should show only one pointer with multilevel iframes @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/tracing.spec.ts:335:5 › should collect sources @webkit-ubuntu-22.04-node18
⚠️ [webkit-page] › page/page-set-input-files.spec.ts:147:3 › should upload large file @webkit-ubuntu-22.04-node18

37298 passed, 662 skipped
✔️✔️✔️

Merge workflow run.

Skn0tt · 2024-12-17T12:50:03Z

we discovered some problems of this approach around tars with long filenames, and chose to go with a 3rd-party implementation for the time being.

Skn0tt and others added 12 commits December 10, 2024 20:40

commit example folder

193902a

add tar file

3fe81dd

Add TAR parser

33cdb9c

update and test

a526cb9

serialize async operations

c6e8d20

simplify

0974981

simplify more

ca345d9

simplify even more

1b12147

align parser with spec

ba69bc5

add comment about queueing

df3f1b9

buffer up earlier so that written chunks become larger

001c917

remove demo folder

f3e651a

Skn0tt requested review from mxschmitt and Copilot December 12, 2024 11:19

Skn0tt self-assigned this Dec 12, 2024

Copilot AI reviewed Dec 12, 2024

View reviewed changes

packages/playwright-core/src/utils/tar.ts Show resolved Hide resolved

This comment has been minimized.

Sign in to view

add another test

a538214

This comment has been minimized.

Sign in to view

mxschmitt reviewed Dec 14, 2024

View reviewed changes

packages/playwright-core/src/utils/tar.ts Show resolved Hide resolved

packages/playwright-core/src/utils/tar.ts Show resolved Hide resolved

packages/playwright-core/src/utils/tar.ts Outdated Show resolved Hide resolved

mxschmitt reviewed Dec 14, 2024

View reviewed changes

respect "prefix" field

2c9bbf7

Skn0tt closed this Dec 17, 2024

Skn0tt mentioned this pull request Dec 19, 2024

feat: download browsers as TAR #34033

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add TAR parser #33975

feat: add TAR parser #33975

Skn0tt commented Dec 12, 2024

Copilot AI left a comment

This comment has been minimized.

Skn0tt commented Dec 13, 2024

This comment has been minimized.

mxschmitt left a comment •

edited

Loading

mxschmitt Dec 14, 2024

Skn0tt Dec 16, 2024

Skn0tt commented Dec 16, 2024

Skn0tt commented Dec 16, 2024 •

edited

Loading

github-actions bot commented Dec 16, 2024

Skn0tt commented Dec 17, 2024

feat: add TAR parser #33975

feat: add TAR parser #33975

Conversation

Skn0tt commented Dec 12, 2024

Copilot AI left a comment

Choose a reason for hiding this comment

This comment has been minimized.

Skn0tt commented Dec 13, 2024

This comment has been minimized.

mxschmitt left a comment • edited Loading

Choose a reason for hiding this comment

mxschmitt Dec 14, 2024

Choose a reason for hiding this comment

Skn0tt Dec 16, 2024

Choose a reason for hiding this comment

Skn0tt commented Dec 16, 2024

Skn0tt commented Dec 16, 2024 • edited Loading

github-actions bot commented Dec 16, 2024

Test results for "tests 1"

Skn0tt commented Dec 17, 2024

mxschmitt left a comment •

edited

Loading

Skn0tt commented Dec 16, 2024 •

edited

Loading