Fix CI test workflow errors in RecoverySigner unit tests #5398

Merged
urvisavla merged 2 commits into stellar:master from the fix-race-condition branch on Jul 22, 2024

Conversation

urvisavla
Contributor

@urvisavla urvisavla commented Jul 19, 2024

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md files, etc.) affected by this change.
    Take a look in the docs folder for a given service, like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

Recently, the test workflow has been failing in the RecoverySigner unit tests with the following error: "Received unexpected error: pq: duplicate key value violates unique constraint 'pg_authid_rolname_index'". The first occurrence of this error in recent history was here. After the PR that moves Galexie out of experimental was committed, the issue got worse and the workflow started failing on every run. The failure does not always occur in the same RecoverySigner test, but the error is always the same.

This issue arises because multiple concurrent tests are trying to create the same role, resulting in a race condition. It is unclear why we consistently see this issue now; it may be due to the order in which the tests are executed changing after the relocation of Galexie out of experimental.

The solution is to check for the specific error. If it is a duplicate key violation, the role has already been created by another concurrent transaction, so the error is ignored. Any other type of error during role creation still causes the tests to fail.

Why

Resolve CI test workflow errors.

Known limitations

@urvisavla urvisavla force-pushed the fix-race-condition branch 4 times, most recently from 58bd633 to 82eef65 Compare July 21, 2024 06:30
@urvisavla urvisavla changed the title Fix race condition in unit tests Fix CI test workflow errors in RecoverySigner unit tests Jul 22, 2024
@urvisavla urvisavla marked this pull request as ready for review July 22, 2024 18:26
@urvisavla urvisavla merged commit 2674e20 into stellar:master Jul 22, 2024
23 checks passed
@urvisavla urvisavla deleted the fix-race-condition branch July 22, 2024 21:20