Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add TAR parser #33975

Closed
wants to merge 14 commits into from
Closed

feat: add TAR parser #33975

wants to merge 14 commits into from

Conversation

Skn0tt
Copy link
Member

@Skn0tt Skn0tt commented Dec 12, 2024

Adds a TAR parser + basic test based on Max's draft. We'll use this in the Brotli-downloader, so we don't have to add dependencies.

@Skn0tt Skn0tt requested review from mxschmitt and Copilot December 12, 2024 11:19
@Skn0tt Skn0tt self-assigned this Dec 12, 2024
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 suggestion.

Comments skipped due to low confidence (1)

packages/playwright-core/src/utils/tar.ts:78

  • The chunk method does not handle the case where the fileStream is null properly. Add a check to ensure fileStream is not null before writing to it.
assert(this.fileStream);

packages/playwright-core/src/utils/tar.ts Show resolved Hide resolved

This comment has been minimized.

@Skn0tt
Copy link
Member Author

Skn0tt commented Dec 13, 2024

@mxschmitt i've added another test that uses one of our chromium builds. Looks like the extractor is working well!

This comment has been minimized.

Copy link
Member

@mxschmitt mxschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to make the following 3 changes in order to make it work with chrome-mac-arm64.tar.br. Not a huge fan of downloading the entire browser during the test-run - imo its fine to not have a unit test-coverage on it - since when we use it the bots will give us enough confidence / coverage. Lets maybe squeeze it into the actual PR when we implement br/tar file format? Also thinking if we should verify the checksums.

packages/playwright-core/src/utils/tar.ts Show resolved Hide resolved
packages/playwright-core/src/utils/tar.ts Show resolved Hide resolved
packages/playwright-core/src/utils/tar.ts Outdated Show resolved Hide resolved
}

async writeToDisk(outputPath: (path: string) => string) {
const fullPath = outputPath(this.name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is outputPath a callback? Ideally its a string and we should add a check so that users can't write 'outside' of the outputPath.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this a callback instead of a raw string, so that the calling side can implement logic like this:

https://gist.github.com/mxschmitt/0807f729ba68358e58bf5210b4607fdc#file-har-extract-mjs-L71-L76

It seemed important that we can do that, but at the same time it also didn't seem like a property of the TAR extraction, so it can live outside of it now.

@Skn0tt
Copy link
Member Author

Skn0tt commented Dec 16, 2024

Good catch on the "prefix" field. Reading through the spec, this seemed like a legacy field to me so I omitted it initially. Added it back in 2c9bbf7.

@Skn0tt
Copy link
Member Author

Skn0tt commented Dec 16, 2024

Lets maybe squeeze it into the actual PR when we implement br/tar file format?

I'd prefer merging it individually, so the TAR PR doesn't become insanely big.

Not a huge fan of downloading the entire browser during the test-run - imo its fine to not have a unit test-coverage on it - since when we use it the bots will give us enough confidence / coverage.

Makes sense! I'll remove the unit tests before merging then.

Also thinking if we should verify the checksums.

The TAR spec says The chksum field represents the simple sum of all bytes in the header block.. So it wouldn't verify the body correctness, just the headers. I'm torn wether it's worth it to implement that.

Copy link
Contributor

Test results for "tests 1"

10 flaky ⚠️ [firefox-page] › page/page-evaluate.spec.ts:403:3 › should throw for too deep reference chain @firefox-ubuntu-22.04-node18
⚠️ [webkit-library] › library/inspector/cli-codegen-1.spec.ts:147:7 › cli codegen › should make a positioned click on a canvas @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/proxy.spec.ts:93:11 › should proxy local network requests › by default › localhost @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/selector-generator.spec.ts:313:7 › selector generator › should prioritize attributes correctly › role @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/trace-viewer.spec.ts:278:1 › should have correct snapshot size @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/trace-viewer.spec.ts:1350:1 › should remove noscript when javaScriptEnabled is set to true @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/trace-viewer.spec.ts:1474:1 › should not leak recorders @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/trace-viewer.spec.ts:1554:1 › should show only one pointer with multilevel iframes @webkit-ubuntu-22.04-node18
⚠️ [webkit-library] › library/tracing.spec.ts:335:5 › should collect sources @webkit-ubuntu-22.04-node18
⚠️ [webkit-page] › page/page-set-input-files.spec.ts:147:3 › should upload large file @webkit-ubuntu-22.04-node18

37298 passed, 662 skipped
✔️✔️✔️

Merge workflow run.

@Skn0tt
Copy link
Member Author

Skn0tt commented Dec 17, 2024

we discovered some problems of this approach around tars with long filenames, and chose to go with a 3rd-party implementation for the time being.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants