Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[kube-prometheus-stack] CRDs version issue for 67.9.0 #5127

Closed
matphilippe opened this issue Jan 10, 2025 · 15 comments · Fixed by #5220
Closed

[kube-prometheus-stack] CRDs version issue for 67.9.0 #5127

matphilippe opened this issue Jan 10, 2025 · 15 comments · Fixed by #5220
Labels
bug Something isn't working

Comments

@matphilippe
Copy link
Contributor

matphilippe commented Jan 10, 2025

Describe the bug a clear and concise description of what the bug is.

We ran into troubles upgrading our Kube-Prometheus-Stack charts from 66.7.1 to 67.9.0.

After a few hiccups related to Prometheus 3.x, we noticed that some of our workloads weren't scraped anymore.
We could trace this to a wrong version of the CRDs: bumping them from v0.79.0 to v0.79.2 restored functionality.

The current Readme, does ask to set CRDs to v0.79.0.

What's your helm version?

3.16.5

What's your kubectl version?

1.30.5

Which chart?

kube-prometheus-chart

What's the chart version?

67.9.0

What happened?

We ran into troubles upgrading our Kube-Prometheus-Stack charts from 66.7.1 to 67.9.0.

For the update,

  • we applied the steps in the current Readme, does ask to set CRDs to v0.79.0;
  • we fixed prometheus 3.0.0 migration issues (metric related)

We then noticed that some of our worloads weren't scraped anymore.

Upon futher inspections, we aligned the CRDs versions to the chart's appVersion: v0.79.2
This restored functionality

What you expected to happen?

When we run the upgrade steps as described in the readme, we expect the system to continue working as before (modulo adaptations as described in upgrade guides).

In particular, we didn't expect to further bump the CRDs

How to reproduce it?

Run a KPS stack with 67.9.0 chart against a cluster with CRDs for v0.79.0.

Enter the changed values of values.yaml?

None

Enter the command that you execute and failing/misfunctioning.

None

Anything else we need to know?

We do rely on renovate for version updates, and deliver those to our clusters through argocd.

Ideally, we'd get a simple set of instructions as to how to get the right CRDs for any version of the chart, which we can then use for automation purposes.

Can we safely assume that picking the chart's appVersion as version for the CRDs is the right thing to do?
If so, can this be documented and enforced?

@matphilippe matphilippe added the bug Something isn't working label Jan 10, 2025
@jkroepke
Copy link
Member

Hi, I looked into the change https://github.com/prometheus-community/helm-charts/pull/5075/files and saw, there is no change between CRD v0.79.0 and v0.79.2. Only the version number on the annotation has been changed, which has no affect. In other words, the CRD v0.79.0 and v0.79.2 are equal.

@rvalkenaers
Copy link

Hi @jkroepke
The changes are introduced in v0.79.1: https://github.com/prometheus-community/helm-charts/pull/5072/files

@jkroepke
Copy link
Member

@QuentinBisson FYI - if there are real changes on the CRD, the major version should be increased as well.

@QuentinBisson
Copy link
Member

Oh my bad you're right, I'm sorry about this. Thanks for letting me know

@matphilippe
Copy link
Contributor Author

Hi @QuentinBisson @jkroepke ,

Thanks for your attention :)

Would you have any thoughts to share about the following?

We do rely on renovate for version updates, and deliver those to our clusters through argocd.

Ideally, we'd get a simple set of instructions as to how to get the right CRDs for any version of the chart, which we can then use for automation purposes.

Can we safely assume that picking the chart's appVersion as version for the CRDs is the right thing to do?
If so, can this be documented and enforced?

If we could automate things, that'd be awesome :)

@QuentinBisson
Copy link
Member

Maybe @jkroepke PR could prevent this.
I would expect he appVersion to be the prometheus-operator version so the crds should be aligned yes but I'm not 100% sure

@jkroepke
Copy link
Member

@QuentinBisson

CRDs are matching. The issue was that the changes within the CRDs was wart of a minor update and there was no documentation about updating the CRDs.

Normally, if end-users have to manually bump CRDs, we mark this as major changes and leave a note on the README.

@QuentinBisson
Copy link
Member

Thanks for clarifying this :)

@matphilippe
Copy link
Contributor Author

Nice PR :)

FYI, I know its unasked for, but I'd love to share two things we learned from our usage.
It'd be awesome to have the CRDs packaged in a CRD only chart (no need, for us, to include them to the main chart, or have a job for the update).

Context

We do:

  • Helm charts into components git repository.
  • Templated manifests into gitops repository.
  • Pulled into clusters by argoCD applications.

Lesson 1: ArgoCD needs special configuration to play with CRDs.

A pattern that works is to configure dedicated ArgoCD applications to manage CRDs (e.g. with ServerSideApply),
and leave usual configuration to some more vanilla applications.

Lesson 2: CRD-only charts are good to have.

For example, Karpenter maintains a CRD chart: Karpenter-CRD, with parallel versioning.

It's easy to manage to parallel flow (one for the CRD, one for the rest).

As a matter of fact, we do have a similar workflow for KPS's CRDs, where the dedicated chart is replaced by a script running the Readme's curls.

@jkroepke
Copy link
Member

jkroepke commented Jan 20, 2025

Instead propose a crd only chart, why you are not just using it? https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-operator-crds

@matphilippe
Copy link
Contributor Author

Instead propose a crd only chart,

... with parallel versioning with KPS, this wasn't clear, my bad

why you are not just using it?

Can we safely assume that picking the chart's appVersion as version for the CRDs is the right thing to do?
If so, can this be documented and enforced?

Because we don't know if that's the way to go

@jkroepke
Copy link
Member

There are CI checks that enforce this behavior. If appVersion is bump, the CI pull the CRD from upstream and runs a git diff. If the CRD are not updated inside kps, the CI fail.

however, helm itself has no native capability to upgrade CRDs. If CRDs are updated, helm will not update the CRDs on existing installations by default.

This is, why we are using major version bump to get the end-user awareness, because the end-user has todo manual step.

However Flux and ArgoCD resolve this issue and there are not affected.

@matphilippe
Copy link
Contributor Author

There are CI checks that enforce this behavior. If appVersion is bump, the CI pull the CRD from upstream and runs a git diff. If the CRD are not updated inside kps, the CI fail.

That's very useful, thank you.
WDYT if I opened a PR documenting this in the Upgrade section of the Readme ?
Is there a better place ?

@jkroepke
Copy link
Member

README is good + a link to the readme at the appVersion property in Chart.yaml

matphilippe pushed a commit to matphilippe/helm-charts that referenced this issue Jan 20, 2025
matphilippe pushed a commit to matphilippe/helm-charts that referenced this issue Jan 20, 2025
@matphilippe
Copy link
Contributor Author

The PR is up, thanks again for your time

matphilippe pushed a commit to matphilippe/helm-charts that referenced this issue Jan 20, 2025
matphilippe pushed a commit to matphilippe/helm-charts that referenced this issue Jan 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
4 participants