Ensure that the response-origin of range requests match the full request (issue 12744) #19028

Merged
merged 1 commit into from
Nov 24, 2024

Conversation

Snuffleupagus
Collaborator

@Snuffleupagus Snuffleupagus commented Nov 12, 2024

The following cases are excluded in the patch:

  • The Firefox PDF Viewer, since it has been fixed on the platform side already; please see https://bugzilla.mozilla.org/show_bug.cgi?id=1683940

  • The PDFNodeStream-implementation, used in Node.js environments, since after recent changes that code only supports file://-URLs.

Also updates the read-methods, on both PDFNetworkStreamFullRequestReader and PDFNetworkStreamRangeRequestReader, to await the headers before returning any data, similar to the implementation in src/display/fetch_stream.js.
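As a rough illustration of "await the headers before returning any data", here is a minimal sketch. The class `SketchFullRequestReader`, its `on…` callbacks, and `createCapability` are hypothetical names invented for this example; this is not the code from src/display/network.js, and it simplifies the end-of-stream handling.

```javascript
// Hypothetical helper: a promise plus its resolve/reject functions.
function createCapability() {
  const capability = {};
  capability.promise = new Promise((resolve, reject) => {
    capability.resolve = resolve;
    capability.reject = reject;
  });
  return capability;
}

// Hypothetical stand-in for a full-request reader (not the pdf.js class).
class SketchFullRequestReader {
  constructor() {
    this._headers = createCapability();
    this._chunks = [];
    // Avoid unhandled-rejection noise when nobody awaits the headers.
    this._headers.promise.catch(() => {});
  }

  get headersReady() {
    return this._headers.promise;
  }

  // Called by the (assumed) transport once the headers were validated.
  onHeadersReceived() {
    this._headers.resolve();
  }

  // Called when header validation fails, e.g. an unexpected response origin.
  onHeadersRejected(reason) {
    this._headers.reject(reason);
  }

  // Called for every chunk of response data.
  onChunk(chunk) {
    this._chunks.push(chunk);
  }

  async read() {
    // Wait for the headers first, so that a header failure rejects here
    // even when data has already been buffered.
    await this._headers.promise;
    if (this._chunks.length > 0) {
      return { value: this._chunks.shift(), done: false };
    }
    // Simplification: an empty buffer is treated as the end of the data.
    return { value: undefined, done: true };
  }
}
```

With this shape, a `read()` call that is issued before the headers arrive stays pending, and it rejects rather than handing out buffered data if the headers turn out to be unacceptable.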

Note: The relevant unit-tests are updated to await the headersReady Promise before dispatching range requests, since that's consistent with the actual usage in the src/-folder.

Fixes #12744
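For readers unfamiliar with the issue, the check named in the PR title can be sketched as below. This assumes the final response URL of each request is available (for instance `response.url` from the Fetch API); `originOf` and `assertSameResponseOrigin` are hypothetical helpers written for this example, not the functions added by the patch.

```javascript
// Hypothetical helper: the origin of a URL, or null if it cannot be parsed.
function originOf(url) {
  try {
    return new URL(url).origin;
  } catch {
    return null; // `new URL()` throws on invalid input.
  }
}

// Hypothetical check: throw unless the range response came from the same
// origin as the initial full response.
function assertSameResponseOrigin(rangeResponseUrl, fullResponseUrl) {
  const rangeOrigin = originOf(rangeResponseUrl);
  const fullOrigin = originOf(fullResponseUrl);
  // Assumption of this sketch: an undeterminable origin counts as a mismatch.
  if (rangeOrigin === null || rangeOrigin !== fullOrigin) {
    throw new Error(
      `Range response origin "${rangeOrigin}" does not match "${fullOrigin}".`
    );
  }
}
```

The comparison uses the response URL rather than the requested URL, since a redirect can move a range request to a different origin than the one that served the full request.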


@Snuffleupagus Snuffleupagus marked this pull request as draft November 12, 2024 13:53

@Snuffleupagus Snuffleupagus marked this pull request as ready for review November 12, 2024 14:57

@Snuffleupagus Snuffleupagus requested review from Rob--W and removed request for calixteman November 17, 2024 12:03
@mozilla mozilla deleted a comment from moz-tools-bot Nov 17, 2024
@Snuffleupagus Snuffleupagus force-pushed the issue-12744 branch 2 times, most recently from 2703c6d to a04f057 Compare November 20, 2024 11:48
@Snuffleupagus
Collaborator Author

/botio test

@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/b74c2a0c1d65210/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/a250d0ad1d48219/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/a250d0ad1d48219/output.txt

Total script time: 30.73 mins

  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 17
  different first/second rendering: 2

Image differences available at: http://54.241.84.105:8877/a250d0ad1d48219/reftest-analyzer.html#web=eq.log

@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/b74c2a0c1d65210/output.txt

Total script time: 46.16 mins

  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: Passed

@Snuffleupagus
Collaborator Author

> I created a PR with unit tests at #19074.

Thank you!

> Your patch currently stores _storedError, which is used at the start of _read(). But if _read() is called immediately, then initially there is no _storedError and it will just wait for the response and pass that through. Please fix that.

Hopefully that's fixed now, by using a similar pattern of waiting for the headers as in src/display/fetch_stream.js.

@Snuffleupagus Snuffleupagus requested a review from Rob--W November 20, 2024 12:44
@Snuffleupagus Snuffleupagus marked this pull request as ready for review November 20, 2024 12:45
Member

@Rob--W Rob--W left a comment

The functional behavior of the patch looks good to me. I added a comment below with a suggestion to simplify, basically getting rid of the new _readCapability promise in favor of rejecting existing request promises. Consider this patch approved and ready to merge with that suggestion applied.
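The simplification suggested here could look roughly like the following sketch. It assumes a reader that queues pending read() promises in a `_requests` array and rejects them all when an error arrives; `SketchRangeReader`, `onChunk`, and `onError` are illustrative names, and this is not the merged pdf.js code.

```javascript
// Hypothetical reader that rejects already-pending reads on error,
// instead of tracking a separate "read capability" promise.
class SketchRangeReader {
  constructor() {
    this._requests = []; // Pending read() calls waiting for data.
    this._cachedChunks = []; // Data that arrived before anyone asked.
    this._storedError = null;
  }

  read() {
    if (this._storedError) {
      return Promise.reject(this._storedError);
    }
    if (this._cachedChunks.length > 0) {
      return Promise.resolve({ value: this._cachedChunks.shift(), done: false });
    }
    // Nothing available yet: remember the request until data or an error arrives.
    return new Promise((resolve, reject) => {
      this._requests.push({ resolve, reject });
    });
  }

  onChunk(chunk) {
    const pending = this._requests.shift();
    if (pending) {
      pending.resolve({ value: chunk, done: false });
    } else {
      this._cachedChunks.push(chunk);
    }
  }

  onError(error) {
    this._storedError = error;
    // Reject every read() that was issued before the error was known.
    for (const pending of this._requests) {
      pending.reject(error);
    }
    this._requests.length = 0;
    this._cachedChunks.length = 0;
  }
}
```

In this arrangement a read() that was called "too early" cannot slip through: it is either resolved with data or rejected once the error is stored.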

A note to a comment in the PR text / commit message:

> Note: The relevant unit-tests are updated to await the headersReady Promise before dispatching range requests, since that's consistent with the actual usage in the src/-folder.

Note: this test fixup was needed because this patch introduces a new requirement: getRangeReader can only be invoked on streams that were preceded by a full request, at least until the point that the headers were received. Independently of whether the current code satisfies that requirement, we should document this expectation.
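The calling order this requirement implies can be sketched as follows. `getFullReader`, `getRangeReader`, and `headersReady` are the names used in this discussion; the `SketchStream` class, the `openRange` helper, and the decision to throw on an early call are assumptions made for the example, not behavior taken from the patch.

```javascript
// Hypothetical stream that enforces "full request headers first".
class SketchStream {
  constructor() {
    this._fullReader = null;
  }

  getFullReader() {
    let resolveHeaders;
    const headersReady = new Promise(resolve => {
      resolveHeaders = resolve;
    });
    this._fullReader = { headersReady, headersReceived: false };
    // Simulate the response headers arriving asynchronously.
    setTimeout(() => {
      this._fullReader.headersReceived = true;
      resolveHeaders();
    }, 0);
    return this._fullReader;
  }

  getRangeReader(begin, end) {
    // Assumption of this sketch: calling too early is reported as an error.
    if (!this._fullReader || !this._fullReader.headersReceived) {
      throw new Error(
        "getRangeReader called before the full request's headers were received."
      );
    }
    return { begin, end };
  }
}

// The order callers are expected to follow.
async function openRange(stream, begin, end) {
  const fullReader = stream.getFullReader();
  await fullReader.headersReady;
  return stream.getRangeReader(begin, end);
}
```

Only after `headersReady` settles does the stream know which origin the full response came from, which is what a later range response would be compared against.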

I confirmed your claim that the current code in src/ awaits headersReady before dispatching range requests; the entry point is getPdfManager in worker.js. Our implementation does this because, in order to decide whether to switch to range requests, we first need to know whether the server supports them. Currently the only supported way to find that out is to send another request.

As an example of a differing implementation: the Chrome extension has already received the headers at the time that the viewer is opened, so in theory it could forward the headers and response origin to the viewer, which could then try to switch to range requests sooner (without doing the initial full request). I currently have no plans to implement that in the Chrome extension, though, due to the added complexity.

Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Nov 23, 2024
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Nov 23, 2024
…est (issue 12744)

The following cases are excluded in the patch:
 - The Firefox PDF Viewer, since it has been fixed on the platform side already; please see https://bugzilla.mozilla.org/show_bug.cgi?id=1683940

 - The `PDFNodeStream`-implementation, used in Node.js environments, since after recent changes that code only supports `file://`-URLs.

Also updates the `PDFNetworkStreamFullRequestReader.read`-method to await the headers before returning any data, similar to the implementation in `src/display/fetch_stream.js`.

*Note:* The relevant unit-tests are updated to await the `headersReady` Promise before dispatching range requests, since that's consistent with the actual usage in the `src/`-folder.
@Snuffleupagus
Collaborator Author

> Consider this patch approved and ready to merge with that suggestion applied.

Fixed, thank you.

> getRangeReader can only be invoked on streams that were preceded by a full request, at least until the point that the headers were received. Independently of whether the current code satisfies that requirement, we should document this expectation.

I've added a comment in src/interfaces.js, since that's what all stream implementations are based upon.

> As an example of a differing implementation: the Chrome extension has already received the headers at the time that the viewer is opened, so in theory it could forward the headers and response origin to the viewer, which could then try to switch to range requests sooner (without doing the initial full request). I currently have no plans to implement that in the Chrome extension, though, due to the added complexity.

So basically, the Chrome extension could instead be changed to use PDFDataRangeTransport (similar to the Firefox PDF Viewer). That said, avoiding the additional complexity makes sense.

@Snuffleupagus
Collaborator Author

/botio unittest

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/401c70deab6d163/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/06175bf9956bd84/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/401c70deab6d163/output.txt

Total script time: 2.59 mins

  • Unit Tests: Passed

@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Success

Full output at http://54.193.163.58:8877/06175bf9956bd84/output.txt

Total script time: 6.81 mins

  • Unit Tests: Passed

@Snuffleupagus Snuffleupagus merged commit f911635 into mozilla:master Nov 24, 2024
9 checks passed
@Snuffleupagus Snuffleupagus deleted the issue-12744 branch November 24, 2024 09:37
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Nov 24, 2024
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Nov 24, 2024
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Nov 24, 2024
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Nov 25, 2024
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Nov 27, 2024
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Nov 28, 2024
Rob--W added a commit to Rob--W/pdf.js that referenced this pull request Dec 2, 2024
Successfully merging this pull request may close these issues.

Prevent cross-origin information leakage via range requests
3 participants