[BUG] Notebook creation sometimes doesn't show notebook after indicating successful creation and redirect #1822

Open
Swiddis opened this issue May 6, 2024 · 3 comments
Labels
bug Something isn't working cannot-repro Unable to reproduce the issue. help wanted Extra attention is needed

Comments

@Swiddis
Collaborator

Swiddis commented May 6, 2024

What is the bug?
This has been a longstanding issue in the notebook-reporting integration tests in FTR. Sometimes a newly created notebook shows no title because the request that queries the notebook ID returns a 500, which has been making the notebook integration tests flaky. We've tried to fix the integration tests repeatedly, but it seems that notebook creation itself is flaky.

How can one reproduce the bug?
Steps to reproduce the behavior are uncertain, but roughly:

  1. Start a cluster with at least 2 OpenSearch nodes that has Observability and Reporting installed, and possibly a low index refresh rate.
  2. Create a new notebook.
  3. Sometimes, the notebook will be shown without a title. See e.g. the recording here.

What is the expected behavior?
If notebooks indicates a successful notebook creation, the redirected page should always show a valid notebook, regardless of refresh state or other race conditions.

What is your host/environment?

  • OS: An RPM-based x64 Linux distro (such as Fedora), as seen in this comment with a by-platform failure breakdown
  • Version: Seen so far on OSD 2.10, 2.12, 2.13, and 2.15.
  • Plugins: Observability, Reporting

Do you have any screenshots?
Functional Test Repository failure screenshot

Do you have any additional context?
See also: opensearch-project/opensearch-dashboards-functional-test#1270.

@Swiddis
Collaborator Author

Swiddis commented Jun 5, 2024

This was resolved by revising the tests, but the underlying cause is still unknown.

@Swiddis Swiddis closed this as completed Jun 5, 2024
@Swiddis
Collaborator Author

Swiddis commented Jun 14, 2024

Came back in #1905

@Swiddis Swiddis reopened this Jun 14, 2024
@Swiddis Swiddis added the cannot-repro and help wanted labels and removed the untriaged label Jun 14, 2024
@Swiddis
Collaborator Author

Swiddis commented Jun 14, 2024

Looks like this behavior comes from a lack of 404 handling: the 500 and the empty page come up if you put any unrecognized ID in the URL. The creation logic probably has a race condition that causes the error in testing. The fix needs 2 steps:

  • Add an actual 404 page so this error is easier to debug in the future
  • Ensure notebook creation never hits this situation anyway
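
To illustrate the first step, here is a minimal sketch of mapping a "not found" failure to a 404 instead of letting it surface as a 500. It is not the plugin's actual code: `BackendError`, `HttpResponse`, and `toNotebookErrorResponse` are hypothetical names, and it assumes the backend reports a missing notebook with a `statusCode` of 404.

```typescript
// Hypothetical shapes for illustration; not the plugin's real types.
interface BackendError {
  statusCode?: number;
  message: string;
}

interface HttpResponse {
  status: number;
  body: { message: string };
}

// Translate a backend failure into the response for a notebook lookup.
// Assumption: a missing notebook is reported with statusCode 404.
function toNotebookErrorResponse(notebookId: string, err: BackendError): HttpResponse {
  if (err.statusCode === 404) {
    // An unknown ID becomes an explicit 404 that the UI can render as "not found".
    return { status: 404, body: { message: `Notebook ${notebookId} not found` } };
  }
  // Anything else is still reported as a server error.
  return { status: 500, body: { message: err.message } };
}
```

With a distinct 404, a test failure would show whether the notebook was missing at read time or whether something else broke.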

But I've looked through the relevant source and it isn't that simple: the create logic waits for the notebook API to return successfully before it sends a response and redirects. That suggests the indices may be out of sync on the back end. However, one of the previous attempts to fix the test flakiness forced a manual index refresh and page refresh after notebook creation, and it still failed. I also can't provoke the issue by introducing delays at different steps in the creation process. Historically the issue got worse, not better, when I added more refresh logic; the best results came from never refreshing.
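
One possible mitigation, not tried in this thread, is to poll the read path after creation and redirect only once the new ID is readable. The sketch below does no index refresh, since extra refresh logic made things worse. `waitForNotebook` and `FetchNotebook` are hypothetical names, and the fetcher is assumed to resolve to `null` while the ID is not yet found.

```typescript
// Hypothetical fetcher: resolves to the notebook, or null while the ID is not found.
type FetchNotebook<T> = (id: string) => Promise<T | null>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll the read path until the notebook is visible or the attempts run out.
async function waitForNotebook<T>(
  id: string,
  fetchNotebook: FetchNotebook<T>,
  maxAttempts = 5,
  delayMs = 200
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const notebook = await fetchNotebook(id);
    if (notebook !== null) {
      return notebook;
    }
    // Wait between attempts, but not after the final one.
    if (attempt < maxAttempts) {
      await sleep(delayMs);
    }
  }
  throw new Error(`Notebook ${id} not readable after ${maxAttempts} attempts`);
}
```

If the redirect happens only after `waitForNotebook` resolves, a timeout error would also give a clear signal that the notebook never became readable, rather than an empty page.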
