Agent Environment
Datadog Operator Version: 2.5.1 (upgraded from 2.5.0)
Datadog Agent Version: 7.61.0 (upgraded from 7.60.1)
Datadog Cluster Agent Version: 7.61.0 (upgraded from 7.60.1)
Platform: Kubernetes (deployed via Helm)
Error logs that might be relevant:
Cluster Agent
2025-01-15 09:00:21 UTC | CLUSTER | ERROR | (pkg/orchestrator/redact/pod.go:109 in func1) | Failed to parse cmd from pod, obscuring whole command
2025-01-15 09:00:21 UTC | CLUSTER | ERROR | (pkg/orchestrator/redact/pod.go:109 in func1) | Failed to parse cmd from pod, obscuring whole command
2025-01-15 09:00:21 UTC | CLUSTER | ERROR | (pkg/orchestrator/redact/pod.go:109 in func1) | Failed to parse cmd from pod, obscuring whole command
2025-01-15 09:00:22 UTC | CLUSTER | ERROR | (pkg/orchestrator/redact/pod.go:109 in func1) | Failed to parse cmd from pod, obscuring whole command
2025-01-15 09:00:22 UTC | CLUSTER | ERROR | (pkg/orchestrator/redact/pod.go:109 in func1) | Failed to parse cmd from pod, obscuring whole command
2025-01-15 09:00:22 UTC | CLUSTER | ERROR | (pkg/orchestrator/redact/pod.go:109 in func1) | Failed to parse cmd from pod, obscuring whole command
2025-01-15 08:58:27 UTC | CLUSTER | ERROR | (pkg/collector/corechecks/loader.go:73 in Load) | core.loader: could not configure check kubernetes_state_core: resource customresourcedefinitions does not exist. Available resources: apiextensions.k8s.io/v1, Resource=customresourcedefinitions,clusterroles,configmaps,deployments,leases,replicationcontrollers,volumeattachments,cronjobs,ingresses,ingressclasses,persistentvolumes,poddisruptionbudgets,validatingwebhookconfigurations,core/v1, Resource=pods_extended,autoscaling.k8s.io/v1beta2, Resource=verticalpodautoscalers,daemonsets,horizontalpodautoscalers,jobs,resourcequotas,rolebindings,batch/v1, Resource=jobs_extended,endpointslices,services,storageclasses,core/v1, Resource=nodes_extended,statefulsets,endpoints,mutatingwebhookconfigurations,persistentvolumeclaims,replicasets,roles,serviceaccounts,namespaces,apiregistration.k8s.io/v1, Resource=apiservices,clusterrolebindings,limitranges,networkpolicies,pods,certificatesigningrequests,nodes,secrets
Agent
2025-01-20 17:51:14 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:71 in Error) | check:redisdb | Error running check: [{"message":"Error while reading from 10.2.5.55:6379 : (104, 'Connection reset by peer')","traceback":"Traceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/connection.py\", line 512, in read_response\n response = self._parser.read_response(disable_decoding=disable_decoding)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/_parsers/resp2.py\", line 15, in read_response\n result = self._read_response(disable_decoding=disable_decoding)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/_parsers/resp2.py\", line 25, in _read_response\n raw = self._buffer.readline()\n ^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/_parsers/socket.py\", line 115, in readline\n self._read_from_socket()\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/_parsers/socket.py\", line 65, in _read_from_socket\n data = self._sock.recv(socket_read_size)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\nConnectionResetError: [Errno 104] Connection reset by peer\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/datadog_checks/base/checks/base.py\", line 1290, in run\n self.check(instance)\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/datadog_checks/redisdb/redisdb.py\", line 552, in check\n self._check_db()\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/datadog_checks/redisdb/redisdb.py\", line 207, in _check_db\n conn.ping()\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/commands/core.py\", line 1208, in ping\n 
return self.execute_command(\"PING\", **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/client.py\", line 545, in execute_command\n conn = self.connection or pool.get_connection(command_name, **options)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/connection.py\", line 1074, in get_connection\n connection.connect()\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/connection.py\", line 289, in connect\n self.on_connect()\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/connection.py\", line 391, in on_connect\n self.read_response()\n File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/redis/connection.py\", line 520, in read_response\n raise ConnectionError(\nredis.exceptions.ConnectionError: Error while reading from 10.2.5.55:6379 : (104, 'Connection reset by peer')\n"}]
Describe what happened:
After upgrading the Datadog Operator and related components, Kubernetes cluster metrics stopped appearing on Datadog dashboards.
Describe what you expected:
Kubernetes state cluster metrics should continue to appear on Datadog dashboards after upgrading the components.
Steps to reproduce the issue:
Upgrade the following components:
Datadog Operator: 2.5.0 -> 2.5.1
Datadog Agent: 7.60.1 -> 7.61.0
Datadog Cluster Agent: 7.60.1 -> 7.61.0
Observe missing Kubernetes state metrics on Datadog dashboards.
Thank you for reaching out and reporting this issue.
We investigated and can confirm the issue. We have a fix ready that we plan to ship with Agent 7.62.0 early next week. In the meantime, we recommend pinning the Agent version to 7.60.1, as @julien-redel mentioned.
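For anyone applying the workaround with the Operator, pinning could look roughly like the sketch below. It assumes the Agents are managed through a `DatadogAgent` resource (`datadoghq.com/v2alpha1`) and uses its `override` section to set image tags. The resource name `datadog` is a placeholder, and pinning the Cluster Agent alongside the node Agent is an assumption based on the versions the reporter was running before the upgrade; adapt both to your own manifest.

```yaml
# Hypothetical excerpt of a DatadogAgent manifest; merge into your existing spec.
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog            # placeholder name, use your own resource name
spec:
  override:
    nodeAgent:
      image:
        tag: 7.60.1        # last version reported working in this issue
    clusterAgent:
      image:
        tag: 7.60.1        # assumption: roll back the Cluster Agent as well
```

Once the fixed release is available, removing these `tag` overrides should let the Operator return to its default image versions.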
Additional environment details (Operating System, Cloud provider, etc):
Kubernetes cluster running on AKS.