[LS-59631] add opamp bridge to otel-cloud-stack chart #76

Merged
26 commits merged on Mar 20, 2024

Contributor

@jdcrouse jdcrouse commented Mar 13, 2024

Description

This adds an opamp bridge to the otel-cloud-stack chart. The same chart will be duplicated into kube-otel-stack in a followup.

The bridge is configured from the opAMPBridge section of values.yaml.
It is enabled by setting Values.opAMPBridge.enabled to true; the default is false.

  • When addReportingLabel is set on the bridge, all collectors will be tagged with opentelemetry.io/opamp-reporting: true, which makes the bridge report on those collectors but does not allow it to modify them.
  • When addManagedLabel is set on the bridge, all collectors will be tagged with opentelemetry.io/opamp-managed: true which makes the bridge report on those collectors and allows it to modify them.
  • If both values are set to true in the bridge config, we only set the opamp-reporting tag to true on the collectors.
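
As a rough sketch, the settings described above might look like this in a values override file. The key names come from this PR description; the file name, the nesting, and the addManagedLabel default shown here are assumptions, so check the chart's values.yaml for the actual layout.

```yaml
# Hypothetical override file (e.g. bridge-values.yaml). The layout is inferred
# from the opAMPBridge.* paths named in this PR, not copied from the chart.
opAMPBridge:
  enabled: true            # false by default; turns the bridge on
  addReportingLabel: true  # collectors get opentelemetry.io/opamp-reporting: "true"
  addManagedLabel: false   # assumed default; if both are true, only the reporting label is set
```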

For docs team

The opamp bridge monitors a customer's collectors and provides health updates to Lightstep. It will eventually allow them to remotely configure their collectors. This is related to the Monitor your collectors section in the kube quickstart doc.

Customers can optionally include the OpAMP Bridge by setting opAMPBridge.enabled to true when installing the otel-cloud-stack chart with Helm. They need to create a project-scoped Lightstep API key with the OpAMP Agent role for their bridge, and store it in a Kubernetes secret called otel-opamp-bridge-secret under the key LS_OPAMP_API_KEY. The chart automatically exposes this as an env variable in the OpAMP Bridge, where it is used for authentication.
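
For illustration, a secret of that shape could be declared with a manifest along these lines. The secret name and key are the ones given above; the namespace and the placeholder value are assumptions (the secret would need to live in the namespace where the bridge runs).

```yaml
# Sketch of the secret the bridge reads its API key from.
apiVersion: v1
kind: Secret
metadata:
  name: otel-opamp-bridge-secret
  namespace: default  # assumption: the namespace the chart is installed into
type: Opaque
stringData:
  # Placeholder only; substitute the project-scoped key with the OpAMP Agent role.
  LS_OPAMP_API_KEY: "<your-lightstep-api-key>"
```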

There are two relevant collector tags that can be modified with this chart:

  • opentelemetry.io/opamp-reporting, when set to true on a collector, tells the OpAMP Bridge to report the health of this collector but disallows it from modifying the collector config
  • opentelemetry.io/opamp-managed, when set to true on a collector, tells the OpAMP Bridge to report the health of this collector and allows the OpAMP Bridge to modify the collector config
    These two tags should not both be true on the same collector. If they are, opentelemetry.io/opamp-reporting takes precedence, so the OpAMP Bridge cannot modify that collector. If neither is set, the OpAMP Bridge will not report on that collector.
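
To make the tagging concrete, here is a minimal fragment of a collector resource carrying the reporting tag. The collector name is hypothetical and the spec is left out; the apiVersion is assumed to match the opentelemetry.io/v1alpha1 version used for the bridge in the rendered output.

```yaml
# Fragment only: shows where the tag lives, not a complete collector definition.
apiVersion: opentelemetry.io/v1alpha1  # assumed to match the bridge's API version
kind: OpenTelemetryCollector
metadata:
  name: example-collector  # hypothetical name
  labels:
    # Bridge reports health for this collector but may not change its config.
    opentelemetry.io/opamp-reporting: "true"
# spec omitted
```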

By setting opAMPBridge.addReportingLabel to true in this helm chart, all collectors installed by this chart will be tagged with opentelemetry.io/opamp-reporting: true. This is set to true by default in this chart.

Customers can set opAMPBridge.addReportingLabel to false and opAMPBridge.addManagedLabel to true to instead tag all collectors installed by this chart with opentelemetry.io/opamp-managed: true.
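
A values override for that managed setup might look like the following sketch. It uses only the keys named in this description; the nesting under opAMPBridge is an assumption.

```yaml
# Hypothetical override: tag collectors as managed instead of reporting-only.
opAMPBridge:
  enabled: true
  addReportingLabel: false  # must be false, otherwise only the reporting label is set
  addManagedLabel: true     # collectors get opentelemetry.io/opamp-managed: "true"
```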

Unfortunately, I don't think there are good OpAMP Bridge docs to direct customers to if they want more information.

How Has This Been Tested?

To show this is backward compatible, I rendered a bridge-less install with both the old chart and the updated chart. The diff is below; the only differences are version changes.

command:

helm template test-stack otel-cloud-stack -f otel-cloud-stack/values.yaml --set clusterName='jason-test' --set=tracesCollector.enabled=true --set=logsCollector.enabled=true

diff:

> diff helm-otel-cloud-stack-old.yaml helm-otel-cloud-stack-new.yaml
340c340
<     app.kubernetes.io/version: "0.3.1"
---
>     app.kubernetes.io/version: "0.4.0"
342c342
<     chart: otel-cloud-stack-0.3.1
---
>     chart: otel-cloud-stack-0.4.0
764c764
<     app.kubernetes.io/version: "0.3.1"
---
>     app.kubernetes.io/version: "0.4.0"
766c766
<     chart: otel-cloud-stack-0.3.1
---
>     chart: otel-cloud-stack-0.4.0
936c936
<     app.kubernetes.io/version: "0.3.1"
---
>     app.kubernetes.io/version: "0.4.0"
938c938
<     chart: otel-cloud-stack-0.3.1
---
>     chart: otel-cloud-stack-0.4.0
1095c1095
<     app.kubernetes.io/version: "0.3.1"
---
>     app.kubernetes.io/version: "0.4.0"
1097c1097
<     chart: otel-cloud-stack-0.3.1
---
>     chart: otel-cloud-stack-0.4.0

diff between old helm install and new helm install with the bridge enabled:

> diff helm-otel-cloud-stack-base.yaml helm-otel-cloud-stack-bridge.yaml
1a2,21
> # Source: otel-cloud-stack/templates/bridge.yaml
> apiVersion: rbac.authorization.k8s.io/v1
> kind: ClusterRole
> metadata:
>   name: otel-cloud-stack-bridge
> rules:
>   - apiGroups:
>       - opentelemetry.io
>     resources:
>       - opentelemetrycollectors
>     verbs:
>       - "*"
>   - apiGroups:
>       - ""
>     resources:
>       - pods
>     verbs:
>       - 'list'
>       - 'get'
> ---
269a290,304
> # Source: otel-cloud-stack/templates/bridge.yaml
> apiVersion: rbac.authorization.k8s.io/v1
> kind: ClusterRoleBinding
> metadata:
>   name: otel-cloud-stack
> roleRef:
>   apiGroup: rbac.authorization.k8s.io
>   kind: ClusterRole
>   name: "otel-cloud-stack-bridge"
> subjects:
>   - kind: ServiceAccount
>     # quirk of the Operator
>     name: "otel-cloud-stack-bridge-opamp-bridge"
>     namespace: "default"
> ---
329a365,428
> # Source: otel-cloud-stack/templates/bridge.yaml
> apiVersion: opentelemetry.io/v1alpha1
> kind: OpAMPBridge
> metadata:
>   name: otel-cloud-stack
>   labels:
>     environment: default
>     app.kubernetes.io/managed-by: Helm
>     app.kubernetes.io/instance: otel-cloud-stack
>     app.kubernetes.io/version: "0.4.0"
>     app.kubernetes.io/part-of: otel-cloud-stack
>     chart: otel-cloud-stack-0.4.0
>     release: "otel-cloud-stack"
>     heritage: "Helm"
> spec:
>   endpoint: wss://opamp.lightstep.com/v1/opamp
>   headers:
>     Authorization: bearer ${LS_OPAMP_API_KEY}
>   capabilities:
>     AcceptsOpAMPConnectionSettings: true
>     AcceptsOtherConnectionSettings: true
>     AcceptsRemoteConfig: true
>     AcceptsRestartCommand: true
>     ReportsEffectiveConfig: true
>     ReportsHealth: true
>     ReportsOwnLogs: true
>     ReportsOwnMetrics: true
>     ReportsOwnTraces: true
>     ReportsRemoteConfig: true
>     ReportsStatus: true
>   image: "ghcr.io/open-telemetry/opentelemetry-operator/operator-opamp-bridge:0.96.0"
>   env:
>     - name: OTEL_K8S_NODE_NAME
>       valueFrom:
>         fieldRef:
>           fieldPath: spec.nodeName
>     - name: OTEL_K8S_NAMESPACE
>       valueFrom:
>         fieldRef:
>           apiVersion: v1
>           fieldPath: metadata.namespace
>     - name: OTEL_K8S_POD_NAME
>       valueFrom:
>         fieldRef:
>           apiVersion: v1
>           fieldPath: metadata.name
>     - name: OTEL_K8S_POD_UID
>       valueFrom:
>         fieldRef:
>           apiVersion: v1
>           fieldPath: metadata.uid
>     - name: OTEL_K8S_POD_IP
>       valueFrom:
>         fieldRef:
>           apiVersion: v1
>           fieldPath: status.podIP
>     - name: OTEL_RESOURCE_ATTRIBUTES
>       value: "k8s.cluster.name=jason-test"
>     - name: LS_OPAMP_API_KEY
>       valueFrom:
>         secretKeyRef:
>           key: LS_OPAMP_API_KEY
>           name: otel-opamp-bridge-secret
> ---
336c435,436
<     app: otel-cloud-stack-daemon
---
>     app: otel-cloud-stack-daemon
>     opentelemetry.io/opamp-reporting: "true"
340c440
<     app.kubernetes.io/version: "0.3.1"
---
>     app.kubernetes.io/version: "0.4.0"
342c442
<     chart: otel-cloud-stack-0.3.1
---
>     chart: otel-cloud-stack-0.4.0
760c860,861
<     app: otel-cloud-stack-cluster-stats
---
>     app: otel-cloud-stack-cluster-stats
>     opentelemetry.io/opamp-reporting: "true"
764c865
<     app.kubernetes.io/version: "0.3.1"
---
>     app.kubernetes.io/version: "0.4.0"
766c867
<     chart: otel-cloud-stack-0.3.1
---
>     chart: otel-cloud-stack-0.4.0
932c1033,1034
<     app: otel-cloud-stack-traces
---
>     app: otel-cloud-stack-traces
>     opentelemetry.io/opamp-reporting: "true"
936c1038
<     app.kubernetes.io/version: "0.3.1"
---
>     app.kubernetes.io/version: "0.4.0"
938c1040
<     chart: otel-cloud-stack-0.3.1
---
>     chart: otel-cloud-stack-0.4.0
1091c1193,1194
<     app: otel-cloud-stack-logs
---
>     app: otel-cloud-stack-logs
>     opentelemetry.io/opamp-reporting: "true"
1095c1198
<     app.kubernetes.io/version: "0.3.1"
---
>     app.kubernetes.io/version: "0.4.0"
1097c1200
<     chart: otel-cloud-stack-0.3.1
---
>     chart: otel-cloud-stack-0.4.0

@jdcrouse jdcrouse requested a review from a team March 13, 2024 16:24
@@ -15,7 +15,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: "0.3.1"
+version: "0.4.0"
Contributor Author

Given a version number MAJOR.MINOR.PATCH, increment the:

MAJOR version when you make incompatible API changes
MINOR version when you add functionality in a backward compatible manner
PATCH version when you make backward compatible bug fixes

This adds functionality in a backward compatible manner, so I increased the minor version.

@jdcrouse jdcrouse requested a review from a team March 13, 2024 17:06
charts/otel-cloud-stack/values.yaml (outdated, resolved)
charts/kube-otel-stack/values.yaml (outdated, resolved)
charts/otel-cloud-stack/values.yaml (outdated, resolved)
Contributor

@gdfast gdfast left a comment

This looks good to me! For testing, I think:

  1. it would be good to see the Helm chart rendered into full k8s config
  2. we should see if this chart can be used to stand up (replace) the dev-opamp collectors and bridge

@jdcrouse jdcrouse marked this pull request as ready for review March 14, 2024 16:02
@gdfast gdfast requested a review from a team March 15, 2024 14:53
Contributor

@jaronoff97 jaronoff97 left a comment

(Assume the comments for KOS are the same for the OCS)

charts/kube-otel-stack/templates/bridge.yaml (outdated, resolved)
charts/kube-otel-stack/templates/bridge.yaml (outdated, resolved)
charts/kube-otel-stack/templates/collector.yaml (outdated, resolved)
charts/kube-otel-stack/values.yaml (outdated, resolved)
@jdcrouse jdcrouse requested a review from jaronoff97 March 15, 2024 16:54
…e same, so I will finalize otel-cloud-stack with this PR and then just apply the same changes to kube-otel-stack in a follow-up
charts/pbcopy (outdated, resolved)
@jdcrouse jdcrouse requested review from jaronoff97 and a team March 15, 2024 19:59
@jdcrouse jdcrouse requested a review from jaronoff97 March 20, 2024 14:24
Contributor

@jaronoff97 jaronoff97 left a comment

Great stuff here. Thank you!

@jaronoff97 jaronoff97 merged commit 5deb1eb into main Mar 20, 2024
1 check passed
@jaronoff97 jaronoff97 deleted the jason/LS-59631/add-opamp-bridge-otel-cloud-stack branch March 20, 2024 15:10
3 participants