Load-balancing exporter k8s resolver continuously invokes the OnUpdate() command in the handler #35658

Open
Tarmander opened this issue Oct 7, 2024 · 0 comments
Labels
bug Something isn't working needs triage New item requiring triage

Comments


Tarmander commented Oct 7, 2024

Component(s)

exporter/loadbalancingexporter

What happened?

Description

When configuring our load-balancing collector to target our backend collectors via the k8s resolver, we noticed that although DNS resolution worked fine and the collectors received evenly distributed traffic, the load balancer recycled the endpoints at a fixed cadence (roughly every 3 minutes). The endpoints themselves were unchanged.

We added some log statements to the k8s resolver/handler, and they revealed that the OnUpdate() function in the handler was being invoked. This would imply that some event was triggering the update, but k get endpoints opentelemetry-global-gateway-collector --watch --output-watch-events=true returned no events for several hours when run manually.

The net result was no actual changes to the service endpoints, but the exporter would consistently dispose and construct new exporters.
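To make the expected behavior concrete, here is a minimal sketch of a guard that skips the rebuild when the address set has not changed. It is not the exporter's actual handler code: `sameEndpoints`, `onUpdate`, and the `rebuild` callback are hypothetical names chosen for illustration, and the IPs are the two addresses from the log output in this report.

```go
package main

import (
	"fmt"
	"sort"
)

// sameEndpoints reports whether two address lists hold the same addresses,
// ignoring order. Hypothetical helper, not part of the exporter.
func sameEndpoints(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	// Sort copies so the caller's slices are left untouched.
	as := append([]string(nil), a...)
	bs := append([]string(nil), b...)
	sort.Strings(as)
	sort.Strings(bs)
	for i := range as {
		if as[i] != bs[i] {
			return false
		}
	}
	return true
}

// onUpdate calls rebuild only when the address set actually changed.
// It returns true if rebuild was called.
func onUpdate(oldAddrs, newAddrs []string, rebuild func([]string)) bool {
	if sameEndpoints(oldAddrs, newAddrs) {
		return false // no-op update: keep the existing exporters
	}
	rebuild(newAddrs)
	return true
}

func main() {
	old := []string{"10.100.148.213", "10.100.181.244"}
	reordered := []string{"10.100.181.244", "10.100.148.213"}

	rebuilds := 0
	rebuild := func([]string) { rebuilds++ }

	onUpdate(old, reordered, rebuild)                  // same set, skipped
	onUpdate(old, []string{"10.100.148.213"}, rebuild) // one address gone, rebuilt

	fmt.Println("rebuilds:", rebuilds) // prints "rebuilds: 1"
}
```

With a guard of this kind, an update that carries the same two addresses would leave the existing exporters in place instead of disposing and reconstructing them.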

Steps to Reproduce

Configure the k8s resolver to point to a service representing

Expected Result

The OnUpdate() call in k8s handler only runs when updates occur in the service endpoints pointed to by the k8s resolver.

Actual Result

OnUpdate() is invoked at a recurring frequency of around every 3 minutes, regardless of changes to the service it points to.

Collector version

v0.105.0

Environment information

Environment

OS: Ubuntu 22.04
Compiler: go1.22.6

OpenTelemetry Collector configuration

receivers:
  otlp:
    protocols:
      grpc: {}
      http: {}
processors:
  batch:
    timeout: 1s
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 20
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
        sending_queue:
          queue_size: 100000
          num_consumers: 25
    resolver:
      k8s:
        service: opentelemetry-global-gateway-collector-headless.opentelemetry-global-collector
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: 0.0.0.0:55679
  pprof:
    endpoint: localhost:1777
service:
  extensions: [health_check, zpages, pprof]
  telemetry:
    logs:
      level: info
      encoding: json
    metrics:
      address: 0.0.0.0:8888
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [loadbalancing]

Log output

Sample Log Output:
 {"stream":"stderr","timestamp":1727987229349,"log":{"name":"loadbalancing","ts":1.7279872293494482E9,"data_type":"traces","oldEps":"&Endpoints{ObjectMeta:{opentelemetry-global-gateway-collector-headless  opentelemetry-global-collector  6382211d-bb57-4141-8bed-165f8f002e94 2635273389 0 2024-07-18 16:38:27 +0000 UTC <nil> <nil> map[app.kubernetes.io/component:opentelemetry-collector app.kubernetes.io/instance:opentelemetry-global-collector.opentelemetry-global-gateway app.kubernetes.io/managed-by:opentelemetry-operator app.kubernetes.io/name:opentelemetry-global-gateway-collector app.kubernetes.io/part-of:opentelemetry app.kubernetes.io/version:0.105.0 operator.opentelemetry.io/collector-headless-service:Exists operator.opentelemetry.io/collector-service-type:headless service.kubernetes.io/headless:] map[endpoints.kubernetes.io/last-change-trigger-time:2024-10-02T13:28:28Z] [] [] [{kube-controller-manager Update v1 2024-10-02 13:28:28 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:endpoints.kubernetes.io/last-change-trigger-time\":{}},\"f:labels\":{\".\":{},\"f:app.kubernetes.io/component\":{},\"f:app.kubernetes.io/instance\":{},\"f:app.kubernetes.io/managed-by\":{},\"f:app.kubernetes.io/name\":{},\"f:app.kubernetes.io/part-of\":{},\"f:app.kubernetes.io/version\":{},\"f:operator.opentelemetry.io/collector-headless-service\":{},\"f:operator.opentelemetry.io/collector-service-type\":{},\"f:service.kubernetes.io/headless\":{}}},\"f:subsets\":{}} 
}]},Subsets:[]EndpointSubset{EndpointSubset{Addresses:[]EndpointAddress{EndpointAddress{IP:10.100.148.213,TargetRef:&ObjectReference{Kind:Pod,Namespace:opentelemetry-global-collector,Name:opentelemetry-global-gateway-collector-55695567c-rgz8b,UID:61cdd493-8900-408a-a0fd-8df916f790d7,APIVersion:,ResourceVersion:,FieldPath:,},Hostname:,NodeName:*ip-10-100-157-74.ec2.internal,},EndpointAddress{IP:10.100.181.244,TargetRef:&ObjectReference{Kind:Pod,Namespace:opentelemetry-global-collector,Name:opentelemetry-global-gateway-collector-55695567c-lk86p,UID:31c74954-6e14-4fe7-8a33-9ce1e48013e2,APIVersion:,ResourceVersion:,FieldPath:,},Hostname:,NodeName:*ip-10-100-187-149.ec2.internal,},},NotReadyAddresses:[]EndpointAddress{},Ports:[]EndpointPort{EndpointPort{Name:otlp-grpc,Port:4317,Protocol:TCP,AppProtocol:*grpc,},EndpointPort{Name:otlp-http,Port:4318,Protocol:TCP,AppProtocol:*http,},},},},}","resolver":"k8s service","msg":"OnUpDate: Old endpoints > 0, deleting them from endpoints. First callback to 'resolve' invoked.","kind":"exporter","caller":"loadbalancingexporter/resolver_k8s_handler.go:60","epRemove":["10.100.148.213","10.100.181.244"],"level":"info"}}
{"stream":"stderr","timestamp":1727987229349,"log":{"name":"loadbalancing","ts":1.7279872293496487E9,"epAdd":["10.100.148.213","10.100.181.244"],"data_type":"traces","newEps":"&Endpoints{ObjectMeta:{opentelemetry-global-gateway-collector-headless  opentelemetry-global-collector  6382211d-bb57-4141-8bed-165f8f002e94 2635273389 0 2024-07-18 16:38:27 +0000 UTC <nil> <nil> map[app.kubernetes.io/component:opentelemetry-collector app.kubernetes.io/instance:opentelemetry-global-collector.opentelemetry-global-gateway app.kubernetes.io/managed-by:opentelemetry-operator app.kubernetes.io/name:opentelemetry-global-gateway-collector app.kubernetes.io/part-of:opentelemetry app.kubernetes.io/version:0.105.0 operator.opentelemetry.io/collector-headless-service:Exists operator.opentelemetry.io/collector-service-type:headless service.kubernetes.io/headless:] map[endpoints.kubernetes.io/last-change-trigger-time:2024-10-02T13:28:28Z] [] [] [{kube-controller-manager Update v1 2024-10-02 13:28:28 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:endpoints.kubernetes.io/last-change-trigger-time\":{}},\"f:labels\":{\".\":{},\"f:app.kubernetes.io/component\":{},\"f:app.kubernetes.io/instance\":{},\"f:app.kubernetes.io/managed-by\":{},\"f:app.kubernetes.io/name\":{},\"f:app.kubernetes.io/part-of\":{},\"f:app.kubernetes.io/version\":{},\"f:operator.opentelemetry.io/collector-headless-service\":{},\"f:operator.opentelemetry.io/collector-service-type\":{},\"f:service.kubernetes.io/headless\":{}}},\"f:subsets\":{}} 
}]},Subsets:[]EndpointSubset{EndpointSubset{Addresses:[]EndpointAddress{EndpointAddress{IP:10.100.148.213,TargetRef:&ObjectReference{Kind:Pod,Namespace:opentelemetry-global-collector,Name:opentelemetry-global-gateway-collector-55695567c-rgz8b,UID:61cdd493-8900-408a-a0fd-8df916f790d7,APIVersion:,ResourceVersion:,FieldPath:,},Hostname:,NodeName:*ip-10-100-157-74.ec2.internal,},EndpointAddress{IP:10.100.181.244,TargetRef:&ObjectReference{Kind:Pod,Namespace:opentelemetry-global-collector,Name:opentelemetry-global-gateway-collector-55695567c-lk86p,UID:31c74954-6e14-4fe7-8a33-9ce1e48013e2,APIVersion:,ResourceVersion:,FieldPath:,},Hostname:,NodeName:*ip-10-100-187-149.ec2.internal,},},NotReadyAddresses:[]EndpointAddress{},Ports:[]EndpointPort{EndpointPort{Name:otlp-grpc,Port:4317,Protocol:TCP,AppProtocol:*grpc,},EndpointPort{Name:otlp-http,Port:4318,Protocol:TCP,AppProtocol:*http,},},},},}","resolver":"k8s service","msg":"OnUpDate: endpoint changes detected, second callback to 'resolve' invoked.","kind":"exporter","caller":"loadbalancingexporter/resolver_k8s_handler.go:77","level":"info"}}
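
In the two log entries above, oldEps and newEps list the same two IPs, and the value that appears to be the resource version (2635273389) is identical in both. This is consistent with, though not proof of, a periodic informer resync, in which client-go re-delivers cached objects to update handlers even when nothing has changed. A minimal sketch of a resource-version guard follows. The `endpointsSnapshot` type and `isResync` function are hypothetical stand-ins, not the exporter's code or the Kubernetes API types.

```go
package main

import "fmt"

// endpointsSnapshot is a hypothetical stand-in for the one field of a
// Kubernetes Endpoints object that this check needs.
type endpointsSnapshot struct {
	ResourceVersion string
}

// isResync reports whether an update carries no new state, judged by an
// unchanged, non-empty resource version. Resource versions are opaque,
// so they are compared only for equality, never for order.
func isResync(oldEps, newEps endpointsSnapshot) bool {
	return oldEps.ResourceVersion != "" &&
		oldEps.ResourceVersion == newEps.ResourceVersion
}

func main() {
	// Value taken from the log output in this report.
	logged := endpointsSnapshot{ResourceVersion: "2635273389"}
	// Made-up value standing in for a real change to the object.
	changed := endpointsSnapshot{ResourceVersion: "2635273390"}

	fmt.Println(isResync(logged, logged))  // prints "true"
	fmt.Println(isResync(logged, changed)) // prints "false"
}
```

A handler that returns early when such a check is true would ignore re-delivered objects and still react to real changes, since any write to the Endpoints object produces a new resource version.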

Additional context

No response

@Tarmander Tarmander added bug Something isn't working needs triage New item requiring triage labels Oct 7, 2024