-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: expose error fingerprints as an HTTP header #796
Conversation
This sets a new header in the error rendering middleware which exposes the fingerprint of an error if it has one. This allows us to better debug in production environments - our logs also include this fingerprint so that you can map our generic error pages to the type of error that occurred and where in the code it was thrown. Some considerations: * If an app is behind a CDN and error pages are cached, then we will need to vary on the `x-error-fingerprint` header in order for this to be useful - otherwise the first fingerprint will get cached * Not all apps will get this functionality at the same time. They'll need to either start using the error rendering middleware or bump the major version, because this is landing as part of a breaking change
ooh - sweet idea. One question - why the first 2 lines? How confident are you that this is sufficient information to generate unique enough identifiers. Also, does it need to be a hash; could it not just be a randomly generated unique identifier? |
Maybe missing a bit of context: this is not a unique identifier for the specific instance of an error that occurred, it's also designed to help us group errors of the same type so that engineers can identify the parts of their codebase where the most errors occur. So:
|
suggestion: Would you be happy to do this without the |
Yeah I can do that. I'd rather go unprefixed then as Using |
This way of defining headers is deprecated: https://datatracker.ietf.org/doc/html/rfc6648 Co-Authored-By: Alex Muller <[email protected]> Co-Authored-By: Cynthia Mbulu <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one Rowan! Exciting times 😸
298622c
to
fbcfc3a
Compare
This sets a new header in the error rendering middleware which exposes the fingerprint of an error if it has one. This allows us to better debug in production environments - our logs also include this fingerprint so that you can map our generic error pages to the type of error that occurred and where in the code it was thrown.
How it works
Complete: the
serializeError
method, used in our logging and error rendering, now adds afingerprint
property to an error object. This fingerprint is a hash of the first two lines of the error stack trace, meaning that the fingerprint can locate an error in our codebase without giving away error information to end-users or attackersComplete: the latest version Reliability Kit error logging middleware already logs this alongside the rest of the error. This allows us to easily group errors that occur in the same part of the source code
The change in this PR: we now expose this fingerprint as an
error-fingerprint
HTTP header in the error rendering middleware. This is a safe way of exposing error information to the end user and we can match it to the error in our logs to extract the full error detailsFuture: once this header is present on our error pages, we'll be able to use it to aid debugging production issues. We have a couple of ideas:
Expose the error fingerprint in the UI: we already do this with "customer care codes", giving customers a code to quote to Customer Care which then speeds up their diagnosis of which team to contact. If we expose the error fingerprint then the engineer who picks up the bug report will be able to find the exact error being thrown via our logs
Read the error fingerprint in a browser extension: if the header is present in the browser then we could use it in an internal-facing browser extension. We could allow Operations and Customer Care to navigate directly from an error page to a Splunk search for the error fingerprint, reducing the time required to identify the responsible system
Some considerations
If an app is behind a CDN and error pages are cached, then we will need to vary on the
error-fingerprint
header in order for this to be useful - otherwise the first fingerprint will get cachedNot all apps will get this functionality at the same time. They'll need to either start using the error rendering middleware or bump the major version, because this is landing as part of a breaking change