Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ISSUE] TEST-Check-Connectors-webhook Integration Test Is Flaky #2349

Open
slolatte opened this issue Sep 13, 2024 · 3 comments
Open

[ISSUE] TEST-Check-Connectors-webhook Integration Test Is Flaky #2349

slolatte opened this issue Sep 13, 2024 · 3 comments
Labels
kind/issue Unidentified issue, it could be a bug, misconfig, or anything in between platform/aws Issues related to AWS platform/gcp Issues related to GCP

Comments

@slolatte
Copy link

Describe the issue:

I've noticed that the integration test 'TEST-Check-Connectors-webhook' is being quite flaky, often causing the helm chart setup to fail when triggered via GHA. A workflow rerun usually fixes this issue.

Actual behavior:

An example of this can be seen in this test run here.

Expected behavior:

I would expect the integration to be more robust and suggest it be refactored.

How to reproduce:

Context - please see Slack thread.

Logs:

Environment:

Please note: Without the following info, it's hard to resolve the issue and probably it will be closed.

  • Platform:
  • Helm CLI version:
  • Chart version:
  • Values file:
@slolatte slolatte added the kind/issue Unidentified issue, it could be a bug, misconfig, or anything in between label Sep 13, 2024
@github-actions github-actions bot added platform/aws Issues related to AWS platform/gcp Issues related to GCP labels Sep 13, 2024
@hamza-m-masood
Copy link
Contributor

I think this is because the connector pod sometimes fails to start. I normally see this type of error when the connector pod hangs:

2024-09-16T09:45:51.506Z  WARN 1 --- [pool-2-thread-9] i.c.z.client.impl.ZeebeCallCredentials   : The request's security level does not guarantee that the credentials will be confidential.

@aabouzaid
Copy link
Member

The integration test is not flaky, it's a reported issue in the Connectors where this issue shows in the logs:

2024-09-27T08:23:09.840Z  WARN 1 --- [lt-executor-141] io.camunda.zeebe.client.job.poller       : Failed to activate jobs for worker HTTP REST and job type io.camunda:http-json:1

io.grpc.StatusRuntimeException: CANCELLED
	at io.grpc.Status.asRuntimeException(Status.java:533)
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:481)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:564)
	at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:72)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:729)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:710)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Failed while requesting access token with status code 401 and message Unauthorized.
	at io.camunda.zeebe.client.impl.oauth.OAuthCredentialsProvider.fetchCredentials(OAuthCredentialsProvider.java:157)
	at io.camunda.zeebe.client.impl.oauth.OAuthCredentialsCache.computeIfMissingOrInvalid(OAuthCredentialsCache.java:100)
	at io.camunda.zeebe.client.impl.oauth.OAuthCredentialsProvider.applyCredentials(OAuthCredentialsProvider.java:79)
	at io.camunda.zeebe.client.impl.ZeebeCallCredentials.lambda$applyRequestMetadata$0(ZeebeCallCredentials.java:49)
	... 3 common frames omitted

The issue is not just for that Connectors worker; there are many others in the logs with the same error.
Restarting the Pod fixes the issue, which makes it more likely related to the app retry logic.

@aabouzaid aabouzaid changed the title [ISSUE] <TEST-Check-Connectors-webhook Integration Test Is Flaky> [ISSUE] TEST-Check-Connectors-webhook Integration Test Is Flaky Sep 27, 2024
@aabouzaid
Copy link
Member

I've disabled the Connectors test for 8.6 chart until the bug is fixed:
5784bc5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/issue Unidentified issue, it could be a bug, misconfig, or anything in between platform/aws Issues related to AWS platform/gcp Issues related to GCP
Projects
None yet
Development

No branches or pull requests

3 participants