Standardize traffic metrics across traffic sources #354

tomkerkhove · 2021-12-16T09:09:18Z

As per #350, our goal is to support various traffic flavors in a neutral way.

The tricky part is that there is no standard for metrics on these traffic flows, so we have to provide our own, use metrics from ingress controllers, or rely on the Service Mesh Interface's Traffic Metrics API.

We should move to a standardized approach so that:

1. We define and use a standard for traffic metrics so that we have a unified approach, regardless of the traffic type/source
   - In a later stage, we can propose this metric to TAG Network as an open standard beyond the scope of KEDA
2. We unify all these metrics in a central place and use that as input for our external scaler
3. We change our interceptor so that it publishes the metrics it has today into that central place
   - There is a chance that we could remove this from the interceptor, but most likely we will still need it for our service-to-service support

Proposal

OpenTelemetry Metrics is about to go stable and is an open standard for metrics in systems.

Standardizing on OpenTelemetry & its Collector

Our interceptor should be changed so that it can publish its metrics to an OpenTelemetry Collector. That way we can bring the metrics where we need them, and end-users can re-use them for their own purposes.

These metrics should comply with the defined HTTP semantics as per this doc.

Once the metrics are available, we can choose one of the existing exporters (full overview) to push them to our external scaler directly (HTTP-based or gRPC-based; the preferred approach) or route them through an external system such as Prometheus (less preferred).

When end-users install the HTTP add-on, we should automatically install a collector, unless they opt out and configure a different endpoint. However, ideally, we fully manage and configure the collector with all the bells and whistles that we need.

Bringing existing traffic metrics into our standardized metrics approach

In order to bring existing traffic metrics into our way of working we will need two components:

1. An adapter per traffic source to pull the metrics and make them available in the collector
   - Some traffic sources might already be supported through an existing receiver
2. A custom processor to transform the source metrics format to our standardized metrics format (learn more)

For example, it would make sense to have an SMI-receiver that we can rely on instead of rolling our own. (servicemeshinterface/smi-spec#199)

Traffic Metrics Spec

SMI has its Traffic Metrics spec and OpenTelemetry is defining semantics for HTTP metrics.

We should aim to use those before rolling our own standard.

arschles · 2022-01-10T23:23:34Z

@tomkerkhove I really like the idea of having a KEDA-wide standard for HTTP metrics based on the OTEL semantics, so +1 to that. One thing that we should consider having, though, is that the interceptors should be able to push some (primitive) metrics down to external scalers. Doing so would allow for a completely push-based notification system from interceptor -> external scaler -> KEDA itself, and gives us the capability of scaling from zero more quickly.

tomkerkhove · 2022-01-11T10:56:55Z

> One thing that we should consider having, though, is that the interceptors should be able to push some (primitive) metrics down to external scalers. Doing so would allow for a completely push-based notification system from interceptor -> external scaler -> KEDA itself, and gives us the capability of scaling from zero more quickly.

Can you elaborate a bit more on what you want to achieve here? I presume you mean the external scaler of the HTTP add-on?

arschles · 2022-01-27T20:04:19Z

@tomkerkhove my basic ask is to reduce the latency between when a request comes into the cluster and when the external scaler (and thus, KEDA) knows about it. I'd love to see whether we can design something to push appropriate metrics from the "edge" (ingress controllers and/or service meshes) to the external scaler.

stale · 2022-03-28T21:23:14Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

JorTurFer · 2024-02-03T12:39:03Z

Could this be related? #910

tomkerkhove · 2024-02-05T06:52:37Z

That one feels more about operating the HTTP add-on than about a source of scaling, no?

abebars · 2024-07-06T16:25:26Z

@JorTurFer Are there any plans for this work to be done soon? I think it would add a lot to the HTTP add-on's traffic handling by avoiding another HTTP proxy (the interceptor) in the mix.

JorTurFer · 2024-09-02T21:07:24Z

> @JorTurFer Are there any plans for this work to be done soon? I think it would add a lot to the HTTP add-on's traffic handling by avoiding another HTTP proxy (the interceptor) in the mix.

WDYT? Improving the metrics won't get rid of the interceptor requirement, as the interceptor is the keystone that enables scaling from/to 0.

tomkerkhove added the epic and traffic-sources labels on Dec 16, 2021

tomkerkhove mentioned this issue on Dec 16, 2021: SMI Metrics and OpenTelemetry servicemeshinterface/smi-spec#199 (closed)

stale bot added the stale label on Mar 28, 2022

arschles added the stale-bot-ignore label on Mar 29, 2022

stale bot removed the stale label on Mar 29, 2022

tomkerkhove added this to Roadmap - KEDA HTTP Add-On on May 11, 2022

tomkerkhove moved this to To Do in Roadmap - KEDA HTTP Add-On on May 11, 2022

JorTurFer mentioned this issue on Apr 4, 2024: Observability for all the components #965 (closed, 5 tasks)

JorTurFer added the help wanted label on Apr 4, 2024
