Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: return a 400 for malformed URIs #64

Merged
merged 1 commit into from
Nov 15, 2023
Merged

Conversation

flakey5
Copy link
Member

@flakey5 flakey5 commented Nov 12, 2023

@flakey5 flakey5 requested a review from a team as a code owner November 12, 2023 20:34
@ovflowd
Copy link
Member

ovflowd commented Nov 12, 2023

Quick question: Why does it crash there and not here? https://github.com/nodejs/release-cloudflare-worker/blob/main/src/handlers/get.ts#L29

src/handlers/get.ts Outdated Show resolved Hide resolved
@flakey5
Copy link
Member Author

flakey5 commented Nov 13, 2023

Quick question: Why does it crash there and not here? https://github.com/nodejs/release-cloudflare-worker/blob/main/src/handlers/get.ts#L29

Honestly I don't know, my best guess is that calling new URL doesn't validate the path and instead focuses just on the structure of the url itself?

@ovflowd
Copy link
Member

ovflowd commented Nov 13, 2023

Quick question: Why does it crash there and not here? https://github.com/nodejs/release-cloudflare-worker/blob/main/src/handlers/get.ts#L29

Honestly I don't know, my best guess is that calling new URL doesn't validate the path and instead focuses just on the structure of the url itself?

Could also be lazy construction/validation

@targos
Copy link
Member

targos commented Nov 13, 2023

I'm not sure you're supposed to call decodeURIComponent on the pathname. The URL API already decodes automatically what should be decoded.

@ovflowd
Copy link
Member

ovflowd commented Nov 13, 2023

I'm not sure you're supposed to call decodeURIComponent on the pathname. The URL API already decodes automatically what should be decoded.

Right, I think it is indeed unnecessary as URL already does the sanitizations that are needed on the pathname property.

Off-topic: @flakey5 I believe

const urlToBucketPathMap: Record<string, string> = {
dist: DIST_PATH_PREFIX + (url.pathname.substring('/dist'.length) || '/'),
download:
DOWNLOAD_PATH_PREFIX +
(url.pathname.substring('/download'.length) || '/'),
docs: DOCS_PATH_PREFIX + (url.pathname.substring('/docs'.length) || '/'),
api: API_PATH_PREFIX + (url.pathname.substring('/api'.length) || '/'),
metrics: url.pathname.substring(1), // substring to cut off the /
};
things like these, should be in the constants file. We should have all sorts of mappings like these in dedicated constants and explain them. The code is already becoming complex enough that often it's hard to find our way in some places.

Like I didn'te ven realise mapUrlPathToBucketPath was doing so much.

@flakey5
Copy link
Member Author

flakey5 commented Nov 13, 2023

The call to decodeURIComponent isn't necessarily to validate the path, it's to decode the path from say /a%20b%20c to /a b c. From my testing in both workers and node the URL object seemingly doesn't decode the path for this automatically.

We should have all sorts of mappings like these in dedicated constants and explain them

Yeah that sounds like a good idea as well

@targos
Copy link
Member

targos commented Nov 13, 2023

The call to decodeURIComponent isn't necessarily to validate the path, it's to decode the path from say /a%20b%20c to /a b c. From my testing in both workers and node the URL object seemingly doesn't decode the path for this automatically.

Yes, but why do we need it? There are no legitimate URLs that need percent encoding in the pathname.

@targos
Copy link
Member

targos commented Nov 13, 2023

Another approach could be to call decodeURI on the entire URL before passing it to the URL constructor?

@ovflowd
Copy link
Member

ovflowd commented Nov 13, 2023

Another approach could be to call decodeURI on the entire URL before passing it to the URL constructor?

That sounds the best approach here, and to catch the broken URL directly when calling `new UR. (inside the parseURL) (with try/catch) so we return undefined -> BAD_REQUEST directly from there

@flakey5
Copy link
Member Author

flakey5 commented Nov 15, 2023

If we don't need to validate the uri I think we can just go without it, if it becomes a need later on (e.g. file names with spaces) we can add it back

@flakey5 flakey5 force-pushed the flakey5/sentry/635db3eb branch 2 times, most recently from bd503e3 to 80d53d6 Compare November 15, 2023 18:09
@flakey5 flakey5 requested a review from ovflowd November 15, 2023 18:09
@ovflowd
Copy link
Member

ovflowd commented Nov 15, 2023

If we don't need to validate the uri I think we can just go without it, if it becomes a need later on (e.g. file names with spaces) we can add it back

I think the idea was to validate the url within parseUrl when we create new URL and have try/catch there to then return undefined.

But I guess we can also simply do it like this for now, and be reactive if errors arise.

@ovflowd ovflowd merged commit d75c442 into main Nov 15, 2023
3 checks passed
@ovflowd ovflowd deleted the flakey5/sentry/635db3eb branch November 15, 2023 18:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants