recently got some 400 errors today related to a user somehow accessing the plugins page using the template variable [name]:
this would mean they accessed /plugins/[name] somehow. overall this isn't necessarily an error we have to worry about since it's a client error, so we can filter these out by reducing the log level to warning. I've added the task Assign lower log level to 4xx errors to capture this 🫡
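The idea of logging client errors at a lower level could look roughly like the sketch below. It is only an illustration: the function name `logLevelForStatus` and the level names are assumptions and are not taken from the napari hub codebase.

```typescript
// Hypothetical log levels; the real logger may use different names.
type StatusLogLevel = 'info' | 'warn' | 'error';

// Map an HTTP status code to a log level so that client errors (4xx)
// are logged as warnings and only server errors (5xx) are logged as errors.
function logLevelForStatus(status: number): StatusLogLevel {
  if (status >= 500) return 'error';
  if (status >= 400) return 'warn';
  return 'info';
}
```

With a mapping like this, a request for /plugins/[name] that returns a 400 would be logged as a warning and would no longer count toward an alarm that only tracks error-level logs.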
Caught Error Alarms
These are errors that are caught and logged in the frontend. The logs for these errors can be surfaced from CloudWatch Logs.
Error fetching spdx license data
This log can be found in CloudWatch when filtering using the following query:
This relates to the error that is logged when fetching the SPDX license data in the browser fails:
napari-hub/frontend/src/components/MetadataList/MetadataListMetadataItem.tsx
Lines 126 to 132 in 41ae700
    …
    'Error fetching spdx license data for MetadataListMetadataItem',
    error: getErrorMessage(err),
    });
    },
Fetching this data in the browser is probably inefficient and more error-prone because it depends on the user's environment. We can move this fetch to the server side to improve the reliability of this API call. If this doesn't reduce the number of errors, we can look into reducing the log level of this message.
Error loading route
This log can be found in CloudWatch when filtering using the following query:
This is related to some code for logging when an error occurs while a page is transitioning:
napari-hub/frontend/src/hooks/usePageTransitions.ts
Lines 66 to 69 in 41ae700
According to the docs, this error occurs if the route transition is cancelled or if an error is thrown, but the code above doesn't distinguish between the two cases when logging the error message. We can refactor the code to use a different log level depending on whether the user cancelled the transition.
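One way this refactor could look is sketched below. It is not the actual change: it assumes the hook listens to the Next.js `routeChangeError` router event, whose error object carries a `cancelled` flag when the navigation was cancelled. The names `RouteChangeError`, `RouteLogLevel`, and `routeErrorLogLevel` are hypothetical.

```typescript
// Minimal shape of the error passed to a `routeChangeError` handler.
// Next.js sets `cancelled` to true when the navigation was cancelled.
interface RouteChangeError {
  cancelled?: boolean;
  message?: string;
}

// Hypothetical log levels; the real logger may use different names.
type RouteLogLevel = 'info' | 'error';

// Cancelled transitions are expected user behavior, so log them at a
// lower level; anything else is treated as a real error.
function routeErrorLogLevel(err: RouteChangeError): RouteLogLevel {
  return err.cancelled ? 'info' : 'error';
}
```

The handler would then pick the logger method based on the returned level, so cancelled transitions stop contributing to the error count.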
Ideally this should reduce the number of actual errors we encounter, but if not, we can look into filtering out this error from the logs metric filter if it's something we can't easily fix.
Uncaught Error Alarms
These are errors that are not handled within a try / catch block. Currently, RUM has reported the following errors:
CWR: Failed to retrieve credentials from STS: TypeError: Failed to fetch
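This error is raised when a network failure interrupts fetching credentials from AWS STS. The stack trace for this message looks like:
Error: CWR: Failed to retrieve credentials from STS: TypeError: Failed to fetch
at nS.<anonymous> (www.napari-hub.org/_next/static/chunks/pages/_app-ab7d999ffe90ca01.js:165:375226)
at www.napari-hub.org/_next/static/chunks/pages/_app-ab7d999ffe90ca01.js:165:373992
at Object.throw (www.napari-hub.org/_next/static/chunks/pages/_app-ab7d999ffe90ca01.js:165:374097)
at s (www.napari-hub.org/_next/static/chunks/pages/_app-ab7d999ffe90ca01.js:165:375420)
Unfortunately, we can't really fix this error since we can't control user network conditions. Instead, we can try excluding this event from the alarm.
To do this, we will need to refactor the alarm infrastructure to:
Export RUM events to a log stream
Create a logs metric filter that filters out STS fetch errors
Update the frontend alarm to use data from the logs metric filter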
Error details: CWR: Failed to retrieve Cognito OpenId token: TypeError: Failed to fetch
Similar to the above error, this is out of our control due to user network conditions. We can remove this from the frontend alarm by ignoring this specific error message.
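The matching logic for ignoring these two messages might be sketched as follows. In practice the filtering would live in the CloudWatch logs metric filter rather than in application code, so this is only an illustration; `IGNORED_ERROR_PATTERNS` and `shouldIgnoreError` are hypothetical names.

```typescript
// Error messages reported by RUM that depend on user network conditions.
// More patterns can be appended as other unfixable errors are identified.
const IGNORED_ERROR_PATTERNS: RegExp[] = [
  /CWR: Failed to retrieve credentials from STS: TypeError: Failed to fetch/,
  /CWR: Failed to retrieve Cognito OpenId token: TypeError: Failed to fetch/,
];

// Returns true when the message matches one of the ignored patterns and
// should therefore not count toward the frontend alarm.
function shouldIgnoreError(message: string): boolean {
  return IGNORED_ERROR_PATTERNS.some((pattern) => pattern.test(message));
}
```

Matching on the full message text keeps the filter narrow, so unrelated "Failed to fetch" errors from application code would still be counted.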
The provided href (/plugins/[name]) value is missing query values
According to the docs, this error occurs when the UI links or navigates to a dynamic route such as /plugins/[name] without supplying a value for every variable in the pathname.
This error is a bit complex to debug because it happens intermittently and is not easy to reproduce. The frequency appears to be 1-2 instances per week.
The plugin page also does not have links to itself or to other plugin pages, so in principle this error should not be able to occur there.
One thing we can try is updating all references to /plugins/[name] to check that name is defined before creating a link or navigating to a route.
If this does not reduce the errors, we could reduce the log level since this type of error doesn't have a huge impact on the functionality of the page. It's possible this error could be a result of an intermittent loading state since some of the errors happen in the loading state for the plugin page.
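The name check suggested above could be sketched as a small helper that every link or navigation goes through. This is an assumption about how the check might be centralized, not existing code; `getPluginHref` is a hypothetical name.

```typescript
// Build the URL for a plugin page, or return null when the plugin name
// is missing (for example while the page is still in a loading state).
function getPluginHref(name: string | null | undefined): string | null {
  const trimmed = name?.trim();
  if (!trimmed) return null;
  // Encode the name so unusual characters cannot break the path.
  return `/plugins/${encodeURIComponent(trimmed)}`;
}
```

Callers would skip rendering the link, or skip the navigation, whenever the helper returns null, instead of passing the raw /plugins/[name] template to the router.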
Script error
These are unidentified errors raised during JavaScript execution, seemingly only on desktop Safari.
This error may occur when the frontend tries to load JavaScript from another domain. Based on this article, we can possibly fix this by updating references to external JavaScript to include the crossorigin attribute in the <script> tag.
The only reference to this is the script we use for HubSpot:
napari-hub/frontend/src/pages/_app.tsx
Lines 88 to 93 in 41ae700
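As a rough illustration of the crossorigin idea, the helper below decides which attributes a script tag should get based on whether its source is on another origin. The helper name, the `ScriptAttributes` type, and the example URLs are assumptions and do not reflect the actual HubSpot snippet. Note that the third-party server also needs to send appropriate CORS headers, otherwise a script loaded with crossorigin may fail to load.

```typescript
// Attributes to spread onto a script element; `crossOrigin` is only set
// for scripts served from a different origin than the page.
interface ScriptAttributes {
  src: string;
  crossOrigin?: 'anonymous';
}

// `pageOrigin` is expected to be an origin such as "https://www.napari-hub.org".
// Relative sources resolve against it and are treated as same-origin.
function scriptAttributesFor(src: string, pageOrigin: string): ScriptAttributes {
  const scriptOrigin = new URL(src, pageOrigin).origin;
  return scriptOrigin === pageOrigin
    ? { src }
    : { src, crossOrigin: 'anonymous' };
}
```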
If this does not reduce the errors, we can look into filtering out this message for this specific error.
Request aborted
This error occurs when a request is cancelled, which may happen if the user navigates away from a page with an in-progress request, so it should be safe to filter out.
ResizeObserver loop completed with undelivered notifications.
This error occurs when ResizeObserver is trying to notify subscribers of a recent resize. It may occur if the user's page resizes during a notification. Unfortunately, we can't control this because users' environments differ in ways like viewport and browser, so this is something we can look into filtering out.
Action Items
Check that name is defined for all references to /plugins/[name]