Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(node): Ensure request bodies are reliably captured for http requests #13746

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

mydea
Copy link
Member

@mydea mydea commented Sep 23, 2024

This PR started out as trying to fix capturing request bodies for Koa.

Some investigation later, we found out that the fundamental problem was that we relied on the request body being on request.body, which is non-standard and thus does not necessarily works. It seems that in express this works because it under the hood writes the body there, but this is non-standard and rather undefined behavior. For other frameworks (e.g. Koa and probably more) this did not work, the body was not on the request and thus never captured. We also had no test coverage for this overall.

This PR ended up doing a few things:

  • Add tests for this for express and koa
  • Streamline types for sdkProcessingMetadata - this used to be any, which lead to any usage of this not really being typed at all. I added proper types for this now.
  • Generic extraction of the http request body in the http instrumentation - this should now work for any node framework

Most importantly, I opted to not force this into the existing, rather complicated and hard to follow request data integration flow. This used to take an IsomorphicRequest and then did a bunch of conversion etc.

Since now in Node, we always have the same, proper http request (for any framework, because this always goes through http instrumentation), we can actually streamline this and normalize this properly at the time where we set this.

So with this PR, we set a normalizedRequest which already has the url, headers etc. set in a way that we need it/it makes sense.
Additionally, the parsed & stringified request body will be set on this too.

If this normalized request is set in sdkProcessingMetadata, we will use it as source of truth instead of the plain request. (Note that we still need the plain request for some auxiliary data that is non-standard, e.g. request.user).

For the body parsing itself, we monkey-patch req.on('data'). this way, we ensure to not add more handlers than a user has, and we only extract the body if the user is extracting it anyhow, ensuring we do not alter behavior.

Closes #13722

@mydea mydea self-assigned this Sep 23, 2024
Copy link
Contributor

github-actions bot commented Sep 23, 2024

size-limit report 📦

⚠️ Warning: Base artifact is not the latest one, because the latest workflow run is not done yet. This may lead to incorrect results. Try to re-run all tests to get up to date results.

Path Size % Change Change
@sentry/browser 22.73 KB - -
@sentry/browser - with treeshaking flags 21.53 KB - -
@sentry/browser (incl. Tracing) 34.97 KB - -
@sentry/browser (incl. Tracing, Replay) 71.5 KB - -
@sentry/browser (incl. Tracing, Replay) - with treeshaking flags 61.91 KB - -
@sentry/browser (incl. Tracing, Replay with Canvas) 75.82 KB - -
@sentry/browser (incl. Tracing, Replay, Feedback) 88.59 KB - -
@sentry/browser (incl. Tracing, Replay, Feedback, metrics) 90.46 KB - -
@sentry/browser (incl. metrics) 27 KB - -
@sentry/browser (incl. Feedback) 39.87 KB - -
@sentry/browser (incl. sendFeedback) 27.38 KB - -
@sentry/browser (incl. FeedbackAsync) 32.17 KB - -
@sentry/react 25.49 KB - -
@sentry/react (incl. Tracing) 37.94 KB - -
@sentry/vue 26.9 KB - -
@sentry/vue (incl. Tracing) 36.86 KB - -
@sentry/svelte 22.87 KB - -
CDN Bundle 24.05 KB - -
CDN Bundle (incl. Tracing) 36.76 KB - -
CDN Bundle (incl. Tracing, Replay) 71.25 KB - -
CDN Bundle (incl. Tracing, Replay, Feedback) 76.57 KB - -
CDN Bundle - uncompressed 70.53 KB - -
CDN Bundle (incl. Tracing) - uncompressed 109.04 KB - -
CDN Bundle (incl. Tracing, Replay) - uncompressed 221.01 KB - -
CDN Bundle (incl. Tracing, Replay, Feedback) - uncompressed 234.23 KB - -
@sentry/nextjs (client) 37.91 KB - -
@sentry/sveltekit (client) 35.56 KB - -
@sentry/node 125.23 KB +0.6% +753 B 🔺
@sentry/node - without tracing 94.38 KB +0.8% +767 B 🔺
@sentry/aws-serverless 104 KB +0.68% +718 B 🔺

View base workflow run

Copy link

codecov bot commented Sep 23, 2024

❌ 6 Tests Failed:

Tests completed Failed Passed Skipped
624 6 618 33
View the top 3 failed tests by shortest run time
errors.test.ts Sends graphql exception to Sentry
Stack Traces | 0.056s run time
errors.test.ts:73:5 Sends graphql exception to Sentry
errors.test.ts Sends unexpected exception to Sentry if thrown in module that was registered before Sentry
Stack Traces | 0.072s run time
errors.test.ts:60:5 Sends unexpected exception to Sentry if thrown in module that was registered before Sentry
errors.test.ts Sends unexpected exception to Sentry if thrown in module with local filter
Stack Traces | 0.109s run time
errors.test.ts:32:5 Sends unexpected exception to Sentry if thrown in module with local filter

To view individual test run time comparison to the main branch, go to the Test Analytics Dashboard

Copy link
Member

@lforst lforst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really cool change. I think we can clean up the flow a bit and I also hope that the patching doesn't blow up in any unexpected ways.

// This is non-standard, but may be set on e.g. Next.js or Express requests
const cookies = (req as PolymorphicRequest).cookies;

const normalizedRequest: Request = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l/m: I have been burned in the past because Request is a native type. I suggest we rename this to smth like NormalizedRequest or ProcessingMetadataRequest.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this type already exists, it is not new 😬 but I do agree, but we probably need to alias it then and keep the old name around (deprecated)...?

@@ -148,7 +149,27 @@ export const instrumentHttp = Object.assign(
const isolationScope = (scopes.isolationScope || getIsolationScope()).clone();
const scope = scopes.scope || getCurrentScope();

const headers = req.headers;
const host = headers.host || '<no host>';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l: I haven't put too much thought into this but is <no host> a good default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah no idea, just kept this around ^^

packages/node/src/integrations/http.ts Outdated Show resolved Hide resolved
@@ -101,7 +101,7 @@ export type ProfileItem = BaseEnvelopeItem<ProfileItemHeaders, Profile>;
export type ProfileChunkItem = BaseEnvelopeItem<ProfileChunkItemHeaders, ProfileChunk>;
export type SpanItem = BaseEnvelopeItem<SpanItemHeaders, Partial<SpanJSON>>;

export type EventEnvelopeHeaders = { event_id: string; sent_at: string; trace?: DynamicSamplingContext };
export type EventEnvelopeHeaders = { event_id: string; sent_at: string; trace?: Partial<DynamicSamplingContext> };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

h: What is this about?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, forgot to mention this here: this was just not correctly annotated before, really. Since we did not type dynamicSamplingContext on the sdkProcessingMetadata before, this worked by accident. but because this is really Partial<DSC> (based on what we set on it, which is also partial), it makes this necessary to be correct.

We could also choose to leave this and cast it in the place where we set this, but really that means lying to ourselves because we don't know it is complete. 🤷 no strong feelings though, if we feel this is a bad type change I can also cast it.

packages/utils/src/requestdata.ts Outdated Show resolved Hide resolved
Comment on lines 76 to 95
const { sdkProcessingMetadata = {} } = event;
const req = sdkProcessingMetadata.request;
const { request, normalizedRequest } = sdkProcessingMetadata;

if (!req) {
return event;
const addRequestDataOptions = convertReqDataIntegrationOptsToAddReqDataOpts(_options);

// If this is set, it takes precedence over the plain request object
if (normalizedRequest) {
// Some other data is not available in standard HTTP requests, but can sometimes be augmented by e.g. Express or Next.js
const ipAddress = request ? request.ip || (request.socket && request.socket.remoteAddress) : undefined;
const user = request ? request.user : undefined;

return addNormalizedRequestDataToEvent(event, normalizedRequest, { ipAddress, user }, addRequestDataOptions);
}

const addRequestDataOptions = convertReqDataIntegrationOptsToAddReqDataOpts(_options);
if (!request) {
return event;
}

return addRequestDataToEvent(event, req, addRequestDataOptions);
return addRequestDataToEvent(event, request, addRequestDataOptions);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would mayyyybee do it like this?:

Suggested change
const { sdkProcessingMetadata = {} } = event;
const req = sdkProcessingMetadata.request;
const { request, normalizedRequest } = sdkProcessingMetadata;
if (!req) {
return event;
const addRequestDataOptions = convertReqDataIntegrationOptsToAddReqDataOpts(_options);
// If this is set, it takes precedence over the plain request object
if (normalizedRequest) {
// Some other data is not available in standard HTTP requests, but can sometimes be augmented by e.g. Express or Next.js
const ipAddress = request ? request.ip || (request.socket && request.socket.remoteAddress) : undefined;
const user = request ? request.user : undefined;
return addNormalizedRequestDataToEvent(event, normalizedRequest, { ipAddress, user }, addRequestDataOptions);
}
const addRequestDataOptions = convertReqDataIntegrationOptsToAddReqDataOpts(_options);
if (!request) {
return event;
}
return addRequestDataToEvent(event, req, addRequestDataOptions);
return addRequestDataToEvent(event, request, addRequestDataOptions);
const { sdkProcessingMetadata = {} } = event;
const { request, normalizedRequest } = sdkProcessingMetadata;
const addRequestDataOptions = convertReqDataIntegrationOptsToAddReqDataOpts(_options);
if (request) {
addRequestDataToEvent(event, request, addRequestDataOptions);
}
// If this is set, it takes precedence over the plain request object
if (normalizedRequest) {
addNormalizedRequestDataToEvent(event, normalizedRequest, addRequestDataOptions);
}
return event;

So that

  • We always use all of the data.
  • For now we can just use the old logic from addRequestDataToEvent to extract the "non-standard" fields on the request. Removing the need for the additionalData arg which I admittedly find a bit offputting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry, took some time to come back to this 😅 )

I guess my main problem with this is that we do all the work of parsing the "old" request etc. even though we don't need it? I do feel like it would be easier to understand to have a single function be responsible for it - otherwise, we end up in the state of "why is this field set here? let's look through all of the old code to find if it sets it somewhere, then look over the new code, ...".

By not calling the old method anymore it makes it easier to follow the "new, happy" path imho 🤔 though I do agree that the additionalData part is not great...

packages/node/src/integrations/http.ts Outdated Show resolved Hide resolved
@maximedupre
Copy link

🥹

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Koa request body is not captured for events
3 participants