Metrics are not being collected after updating from 3.18.0 to 4.5.0 #3597

Closed
rodrisbach opened this issue Mar 7, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@rodrisbach

Describe the bug
I updated the sumologic-kubernetes-collection Helm chart from 3.18.0 to 4.5.0 using Terraform (I'm running Kubernetes on EKS). The update appeared to succeed.

However, the 'sumologic-opentelemetry-operator' pod throws an error, crashes, and is restarted. No metrics appear in the Sumo Logic UI, and the Kubernetes cluster is not listed in the Explore section. I tried deleting the current installation and the namespace and then reinstalling the Helm chart, but that didn't help. Removing all the custom values made no difference either. I also tried other chart versions (4.0, 4.1, etc.), and all of them failed with the same error. I suspect the issue is related to the Kubernetes version of the cluster, but I'm not sure.

Logs

sumologic-opentelemetry-operator pod error:

{"level":"info","ts":"2024-03-07T18:55:59Z","logger":"collector-upgrade","msg":"no instances to upgrade"}
{"level":"info","ts":"2024-03-07T18:55:59Z","msg":"Starting workers","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","worker count":1}
{"level":"info","ts":"2024-03-07T18:55:59Z","msg":"Starting workers","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","worker count":1}
{"level":"info","ts":"2024-03-07T18:55:59Z","logger":"controllers.OpenTelemetryCollector","msg":"pdb field is unset in Spec, skipping podDisruptionBudget creation"}
{"level":"info","ts":"2024-03-07T18:55:59Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","OpenTelemetryCollector":{"name":"sumologic-sumologic-metrics","namespace":"sumologic"},"namespace":"sumologic","name":"sumologic-sumologic-metrics","reconcileID":"d537acde-3a5b-4dd5-bd23-ba6fac75f0fb"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x268005d]

goroutine 376 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116 +0x1e5
panic({0x2c27860?, 0x52f65a0?})
	/opt/hostedtoolcache/go/1.21.6/x64/src/runtime/panic.go:914 +0x21f
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.HorizontalPodAutoscaler({{0x39a5d00, 0xc0003f78c0}, {0x39924e0, 0xc000719740}, 0xc0004616c0, {{0x3999788, 0xc000551170}, 0x0}, {{{0x27960be, 0x16}, ...}, ...}, ...})
	/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/horizontalpodautoscaler.go:99 +0x67d
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build.FactoryWithoutError[...].func7()
	/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/builder.go:31 +0x35
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build({{0x39a5d00, 0xc0003f78c0}, {0x39924e0, 0xc000719740}, 0xc0004616c0, {{0x3999788, 0xc000551170}, 0x0}, {{{0x27960be, 0x16}, ...}, ...}, ...})
	/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/collector.go:71 +0xa3d
github.com/open-telemetry/opentelemetry-operator/controllers.BuildCollector({{0x39a5d00, 0xc0003f78c0}, {0x39924e0, 0xc000719740}, 0xc0004616c0, {{0x3999788, 0xc000551170}, 0x0}, {{{0x27960be, 0x16}, ...}, ...}, ...})
	/home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/common.go:54 +0x10a
github.com/open-telemetry/opentelemetry-operator/controllers.(*OpenTelemetryCollectorReconciler).Reconcile(0xc000122000, {0x3994780, 0xc0006bebd0}, {{{0xc0011464d6, 0x9}, {0xc000c24980, 0x1b}}})
	/home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/opentelemetrycollector_controller.go:124 +0x44b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x3999788?, {0x3994780?, 0xc0006bebd0?}, {{{0xc0011464d6?, 0xb?}, {0xc000c24980?, 0x0?}}})
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00047ff40, {0x39947b8, 0xc0004a2c30}, {0x2e0da00?, 0xc0007096c0?})
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316 +0x3cc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00047ff40, {0x39947b8, 0xc0004a2c30})
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266 +0x1af
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 220
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:223 +0x565
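For readers less familiar with Go, the class of failure in this trace can be reproduced in a few lines. The trace only shows that a nil pointer was dereferenced inside HorizontalPodAutoscaler at horizontalpodautoscaler.go:99; which value was nil cannot be determined from this issue. The sketch below is therefore an illustration under assumptions, not the operator's code: `autoscalerSpec`, `collectorSpec`, and every function in it are hypothetical names.

```go
package main

import "fmt"

// autoscalerSpec and collectorSpec are hypothetical stand-ins for a spec
// with optional (pointer) fields. They are NOT the operator's real types.
type autoscalerSpec struct {
	MaxReplicas *int32
}

type collectorSpec struct {
	Autoscaler *autoscalerSpec // nil when the field is left unset
}

// maxReplicasUnsafe dereferences the optional fields without checking them.
// With a nil Autoscaler it panics with a nil pointer dereference.
func maxReplicasUnsafe(s collectorSpec) int32 {
	return *s.Autoscaler.MaxReplicas
}

// maxReplicasSafe guards both pointers and reports ok=false instead of
// panicking when either one is unset.
func maxReplicasSafe(s collectorSpec) (int32, bool) {
	if s.Autoscaler == nil || s.Autoscaler.MaxReplicas == nil {
		return 0, false
	}
	return *s.Autoscaler.MaxReplicas, true
}

// recoverPanic runs f and converts a panic into an error so that this demo
// can keep running after the unsafe call.
func recoverPanic(f func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	f()
	return nil
}

func main() {
	var s collectorSpec // Autoscaler is nil, as if the field were unset

	err := recoverPanic(func() { _ = maxReplicasUnsafe(s) })
	fmt.Println("unsafe:", err)

	_, ok := maxReplicasSafe(s)
	fmt.Println("safe, value present:", ok)
}
```

A guard like `maxReplicasSafe` is the usual way to avoid this kind of crash, but whether a similar guard applies at the line in the trace is an open question here.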

Command used to install/upgrade Collection

resource "helm_release" "sumologic-collection" {
  name             = "sumologic"
  repository       = "https://sumologic.github.io/sumologic-kubernetes-collection"
  chart            = "sumologic"
  version          = "4.5.0"
  create_namespace = true
  namespace        = "sumologic"
  force_update     = true

  set_sensitive {
    name  = "sumologic.accessId"
    value = *******
  }
  set_sensitive {
    name  = "sumologic.accessKey"
    value = ******
  }
  set {
    name  = "sumologic.clusterName"
    value = var.cluster_name
  }
}
  • Collection version (e.g. helm ls -n sumologic):
helm ls
sumologic	sumologic	1       	2024-03-07 15:46:28.206067 -0300 -03	deployed	sumologic-4.5.0	4.5.0
  • Kubernetes version (e.g. kubectl version):
kubectl version
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:13:27Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"25+", GitVersion:"v1.25.16-eks-77b1e4e", GitCommit:"c39acac7636499dd73442b62e11951592aedd1f8", GitTreeState:"clean", BuildDate:"2024-01-13T01:51:36Z", GoVersion:"go1.20.10", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider: AWS
rodrisbach added the bug label Mar 7, 2024
@rodrisbach
Author

I just updated the Kubernetes cluster version from 1.25 to 1.26. Then I installed the Sumo Logic Helm chart and it worked.
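Anyone hitting the same panic may want to compare their server version against the one that worked here before installing. Below is a minimal sketch of that comparison, assuming the Major and Minor strings are copied from the Server Version line of `kubectl version` (EKS reports the minor as "25+"). The 1.26 threshold is only what was observed in this thread, not a documented requirement of chart 4.5.0, and `parseMinor` and `meetsMinimum` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMinor converts a server minor version string such as "25+" or "26"
// into an integer by dropping the trailing "+" that some providers append.
func parseMinor(minor string) (int, error) {
	return strconv.Atoi(strings.TrimRight(minor, "+"))
}

// meetsMinimum reports whether major.minor is at least wantMajor.wantMinor.
func meetsMinimum(major, minor string, wantMajor, wantMinor int) (bool, error) {
	majorNum, err := strconv.Atoi(major)
	if err != nil {
		return false, fmt.Errorf("parse major %q: %w", major, err)
	}
	minorNum, err := parseMinor(minor)
	if err != nil {
		return false, fmt.Errorf("parse minor %q: %w", minor, err)
	}
	if majorNum != wantMajor {
		return majorNum > wantMajor, nil
	}
	return minorNum >= wantMinor, nil
}

func main() {
	// "25+" is the minor version from the report; "26" is the version after
	// the cluster upgrade. The 1.26 threshold is an assumption from this thread.
	for _, minor := range []string{"25+", "26"} {
		ok, err := meetsMinimum("1", minor, 1, 26)
		fmt.Println("minor", minor, "meets 1.26:", ok, "error:", err)
	}
}
```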
