[v2] Use health check extension in e2e test instead of /metrics #5859
**Which problem is this PR solving?**
Part of #5633, part of #5859

**Description of the changes**
* Integrate health check extension to monitor and report Jaeger V2 component's health
* Enhance all-in-one CI test to ping the new health port

**How was this change tested?**
The changes were tested by running the following command:
```bash
make test
```
```bash
CI actions and new Unit Tests
```

**Checklist**
- [x] I have read [CONTRIBUTING_GUIDELINES.md](https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md)
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - `for jaeger: make lint test`
  - `for jaeger-ui: yarn lint` and `yarn test`

---------

Signed-off-by: Wise-Wizard <[email protected]>
Signed-off-by: Yuri Shkuro <[email protected]>
Co-authored-by: Yuri Shkuro <[email protected]>
Co-authored-by: Yuri Shkuro <[email protected]>
Current state: in #5861 all configs were added, but the grpc-storage test failed because additional traces from the WriteSpan endpoint are being generated (a big no-no). I am going to merge #5861 with a temporary override in grpc_test to continue using the original query service port for the health check, but we need to identify the problem and remove the override for the healthcheck endpoint. Surprisingly, just switching to the query port fixes the issue, while querying 13133 results in extra spans. My hypothesis is that the healthcheck endpoint is registered with tracing enabled, so when we hit it from the test it generates a trace from within the collector, and writing that trace generates the other traces for the WriteSpan endpoint that we're seeing. We need to make sure that we do not let the OTEL framework instantiate a tracer that indiscriminately traces everything. cc @Wise-Wizard
Reproducer in #5861 (comment)
It looks like most of the work here is done, but there is still one location that queries a different endpoint
@yurishkuro how should we go about not generating the extra traces from the healthcheckextension?
are we generating them now?
@yurishkuro yeah I ran your reproducer code from the original PR and it looks to have the same failure
Reproducer and fix #6113
We don't generate extra trace from health extension, it's coming from grpc storage (issue #5971).
## Which problem is this PR solving?
- Fixes #5971
- Towards #6113 and #5859

## Description of the changes
- This PR fixes an issue where the gRPC remote storage client was provided a tracer, which resulted in an infinite loop of trace generation. The loop would happen when we tried to write a trace to storage: the write would generate a new trace that needed to be written, and so on. This PR fixes this by using a noop tracer for the writer clients so that we do not generate traces on the write paths but still do so when reading.
- This is likely just a temporary fix and we'll want to monitor open-telemetry/opentelemetry-collector#10663 for a better long-term fix.

## How was this change tested?
- Added the healthcheck endpoint which was previously failing in #6113.

## Checklist
- [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

## Co-Authors
This PR is a continuation of #5979

Co-authored-by: cx <[email protected]>

---------

Signed-off-by: Mahad Zaryab <[email protected]>
In the current e2e tests we are using the metrics endpoint to check that the v2 binary is up and ready for tests:
Since we already introduced a health check extension (#5831), we should be using that instead of /metrics.
Changes required: