OpenTelemetryCollector fails if prometheus-operator is installed before opentelemetry-operator #1700
Comments
I don't think I understand how the prometheus operator plays into this; however, I recently wrote up a thread in the helm-charts repo detailing this exact scenario. Could you explain where the prometheus operator comes into play here? If not, try setting the failurePolicy on the webhook to Ignore, which should resolve your issue.
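For readers who want to try the suggested workaround with Helm, here is a minimal sketch. It assumes the operator was installed from the open-telemetry/opentelemetry-operator chart and that the chart exposes the webhook setting as the `admissionWebhooks.failurePolicy` value; verify that key against your chart version. The release name, namespace, and the `otel_operator_ignore_webhook_failures` helper are hypothetical names chosen for illustration. By default the helper only prints the command; set DRY_RUN=0 to actually run it.

```shell
# Assumed names (not from this thread): adjust to match your install.
RELEASE="opentelemetry-operator"
NAMESPACE="opentelemetry-operator-system"

# Hypothetical helper: builds the helm command that sets the webhook
# failurePolicy to Ignore. With DRY_RUN=1 (the default) it only prints
# the command; with DRY_RUN=0 it executes it against the current cluster.
otel_operator_ignore_webhook_failures() {
  set -- helm upgrade --install "$RELEASE" open-telemetry/opentelemetry-operator \
    --namespace "$NAMESPACE" --create-namespace \
    --set admissionWebhooks.failurePolicy=Ignore
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

otel_operator_ignore_webhook_failures
```

Note that with failurePolicy set to Ignore, a resource is admitted even when the webhook cannot be reached, so the operator's defaulting and validation may be skipped for that resource.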
I'm also seeing this same behavior. For what it's worth, setting the failure policy to Ignore isn't really ideal either, but I understand if it's the only known workaround right now. It looks like the mutating webhook has a fairly short list of mutations, so it's probably best to "manually mutate" and ensure the fields exist prior to creating. Some additional diagnostics here as well. Oddly enough, it doesn't even look like any traffic is getting to the otel operator during these times. There are no
Yeah, I'm not aware of any prometheus operations that would block this. Until we have some more details here, I'm not sure I can debug this any further.
Same here. Installing kube-prometheus-stack with:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install prometheus -n prometheus --create-namespace prometheus-community/kube-prometheus-stack
causes an error while installing the opentelemetry operator.
I tried reproducing this last year and was unable to. Are there any operator logs you can share? Otherwise, I've seen this issue when cert-manager isn't installed, when operator versions are mismatched, or when a firewall is blocking connections.
Unfortunately, I hit the same behavior. client.go:414: The cluster is AKS, with nothing inside but kube-prometheus-stack and the otel operator + collector. On my local k3d cluster it just works; on AKS, not at all. I've finally found a solution (mentioned somewhere earlier): opentelemetry-operator: But let's be honest: this thing should not behave like this.
It might be something else. During my trials I removed the operator dependency from my helm chart and installed the operator manually, so that it was present in the cluster before anything else got deployed. Then I deployed the chart with just a collector and my app. The result was the same failure as described. Then I gave up and applied the failurePolicy as mentioned. This problem is specific to AKS; I cannot replicate it anywhere else.
Hm... very odd. I'm sorry you experienced that issue. Unfortunately I don't have easy access to AKS, so it would be challenging for me to replicate. This may imply something is wrong with how AKS interacts with cert-manager. Would you mind opening a separate issue containing the steps you used to reproduce that?
When applying the OpenTelemetryCollector template:
If prometheus-operator is installed before opentelemetry-operator, applying the template fails with the following error:
If opentelemetry-operator is installed before prometheus-operator, everything is fine.
I'm running kind 0.18 on Windows 10 with podman 4.4.4.