Upgrade Issue with Kubernetes Fury Distribution (KFD) #354

Open
syamprasadchitluri opened this issue Feb 10, 2025 · 3 comments
Labels: question (Further information is requested)

@syamprasadchitluri

I am trying to upgrade the KFD OnPremises provider to version 1.29.4, but I'm encountering the following issues:

Error 1: Unable to upgrade from v1.29.1 to v1.29.4

ERRO error while creating cluster: error while executing preupgrade phase: upgrade path not found: unable to upgrade from v1.29.1 to v1.29.4, please check the available upgrade paths with the command 'furyctl get upgrade-paths'

I have checked the available upgrade paths using the furyctl get upgrade-paths command, and I have also validated the configuration using the furyctl validate config command.

My current Kubernetes version is 1.29.3, which I verified with the kubectl version command. However, the error message references version 1.29.1, and I'm not sure where that is coming from.

After clearing some cache from the .furyctl folder, I was able to run the upgrade using the command furyctl apply --upgrade --skip-nodes-upgrade --skip-deps-download --skip-deps-validation, but I'm now encountering the following errors during the "applying manifests" stage:

Error 2: Conflicts with "Go-http-client" resource

Apply failed with 2 conflicts: conflicts with "Go-http-client" using rbac.authorization.k8s.io/v1:
- .rules
- .metadata.labels.app.kubernetes.io/version
Please review the fields above--they currently have other managers. Here
are the ways you can resolve this warning:
* If you intend to manage all of these fields, please re-run the apply
  command with the `--force-conflicts` flag.
* If you do not intend to manage all of the fields, please edit your
  manifest to remove references to the fields that should keep their
  current managers.
* You may co-own fields by updating your manifest to match the existing
  value; in this case, you'll become the manager if the other manager(s)
  stop managing the field (remove it from their configuration).
See https://kubernetes.io/docs/reference/using-api/server-side-apply/#conflicts

Apply failed with 2 conflicts: conflicts with "Go-http-client" using rbac.authorization.k8s.io/v1:
- .subjects
- .metadata.labels.app.kubernetes.io/version
Please review the fields above--they currently have other managers. Here
are the ways you can resolve this warning:
* If you intend to manage all of these fields, please re-run the apply
  command with the `--force-conflicts` flag.
* If you do not intend to manage all of the fields, please edit your
  manifest to remove references to the fields that should keep their
  current managers.
* You may co-own fields by updating your manifest to match the existing
  value; in this case, you'll become the manager if the other manager(s)
  stop managing the field (remove it from their configuration).
See https://kubernetes.io/docs/reference/using-api/server-side-apply/#conflicts

I checked for a resource named Go-http-client in the cluster, but I couldn't find one in any namespace. I tried adding the --force-conflicts flag to the apply.sh script in the .furyctl folder, but it gets overwritten every time I run the upgrade command.
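
For reference, the change I attempted in apply.sh was roughly the following (a hypothetical reconstruction, not the exact generated script):

# the generated script builds the manifests and applies them server-side;
# I appended --force-conflicts to the apply step:
kustomize build . | kubectl apply --server-side --force-conflicts -f -

Since furyctl regenerates the script on every run, the added flag never survives.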

Here are the outputs of the furyctl version and kubectl version commands:

buildTime: 2024-12-24T09:16:07Z
gitCommit: 644cfb0e7f2b05048d010639d605cb1dfb57083e
goVersion: go1.23.2
osArch: amd64
version: 0.31.0

Client Version: v1.32.1
Kustomize Version: v5.5.0
Server Version: v1.29.3

I would appreciate if you could provide guidance on how to resolve these issues and successfully upgrade the KFD to version 1.29.4.

@ralgozino (Member)

Hello @syamprasadchitluri

I'm sorry you are having issues with KFD; let's see if we can figure out what is going on.

Error 1

I think you might be confusing the KFD version with the Kubernetes version; they are not the same. From what you shared, I would guess that your cluster was on KFD 1.29.1, which installs Kubernetes 1.29.3.

You can read more on the versioning in the docs: https://docs.kubernetesfury.com/docs/core-concepts/versioning

So, you should have upgraded to one of the supported upgrade paths from 1.29.1:

❯ furyctl get upgrade-paths --from 1.29.1 --kind OnPremises
INFO Available upgrade paths for version 1.29.1 of kind OnPremises are: 1.29.2

That is, KFD 1.29.2 (which still uses Kubernetes 1.29.3).

Can you please confirm which versions you were using in the furyctl.yaml file before and after the upgrade?

Error 2

What is happening with this error is that something (the "Go-http-client") has modified resources in the cluster that are managed by furyctl, so when furyctl tries to server-side apply changes to them, it finds a conflict and stops.

Unfortunately, the error we get from the API server, which we pass on to the user (you, in this case), is not very clear about which resources have the conflicts.

From the information you shared, it seems that there is one resource where the following fields have been modified by "Go-http-client":

  • .rules
  • .metadata.labels.app.kubernetes.io/version

And another resource where the following fields have been modified:

  • .subjects
  • .metadata.labels.app.kubernetes.io/version

It seems like some RBAC definition has been changed.
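
If you want to hunt for the modified objects yourself, something along these lines should list every ClusterRole and ClusterRoleBinding where "Go-http-client" appears as a field manager (just a sketch, assuming you have jq available):

kubectl get clusterroles,clusterrolebindings -o json --show-managed-fields \
  | jq -r '.items[] | select(any(.metadata.managedFields[]?; .manager == "Go-http-client")) | "\(.kind)/\(.metadata.name)"'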

If you can share the logs from the failed apply and your furyctl.yaml file, we can help you identify which resources are failing.

Unfortunately, furyctl does not allow passing the --force-conflicts flag to the apply command and, as you found out, the apply script gets replaced with the one included in the KFD version. You could override it with the --distro-location or --distro-patches flags, but using them is not encouraged: these features are meant more for development use, so I'd leave them as a last resort.

Starting from KFD 1.31.0 we changed the way we apply the distribution phase, so this issue should not be present anymore.

ralgozino self-assigned this on Feb 11, 2025
ralgozino added the question (Further information is requested) label on Feb 11, 2025
@syamprasadchitluri (Author)

Hello @ralgozino,

Thank you for the response and helpful information.

Regarding the first error, is there a way to check the KFD version that is installed in the cluster?

I tried upgrading the KFD to 1.29.2 based on your recommendation, but I am still encountering the second error.

For the second error, I have provided the furyctl.yaml file and the logs from the failed apply operation to help identify the failing resources.

# Use of this source code is governed by a BSD-style
# license that can be found in the LICENSE file.

---
apiVersion: kfd.sighup.io/v1alpha2
kind: OnPremises
metadata:
  name: getting-started
spec:
  distributionVersion: v1.29.2
  kubernetes:
    pkiFolder: ./pki
    ssh:
      username: root
      keyPath: ~/.ssh/id_rsa
    dnsZone: example.tld
    controlPlaneAddress: control-plane.example.tld:6443
    podCidr: 172.16.128.0/17
    svcCidr: 172.16.0.0/17
    loadBalancers:
      enabled: true
      hosts:
        - name: haproxy1
          ip: 10.x.x.x
      keepalived:
        enabled: true
        interface: ens192
        ip: 10.x.x.x
        virtualRouterId: "201"
        passphrase: "b16cf069"
      stats:
        username: admin
        password: password
      additionalConfig: "{file://./haproxy-additional.cfg}"
    masters:
      hosts:
        - name: cp01
          ip: 10.x.x.x
        - name: cp02
          ip: 10.x.x.x
        - name: cp03
          ip: 10.x.x.x
    nodes:
      - name: worker
        hosts:
          - name: wk01
            ip: 10.x.x.x
          - name: wk02
            ip: 10.x.x.x
  distribution:
    common: {}
    modules:
      networking:
        type: cilium
      # This section contains all the configurations for the ingress module
      ingress:
        baseDomain: fury.example.tld
        nginx:
          type: single
          tls:
            provider: secret
            secret:
              cert: "{file://./pki/tls/tls.crt}"
              key: "{file://./pki/tls/tls.key}"
              ca: "{file://./pki/tls/ca.crt}"
        certManager:
          clusterIssuer:
            name: letsencrypt-fury
            email: [email protected]
            type: http01
      logging:
        type: loki
        minio:
          storageSize: "10Gi"
      monitoring:
        type: prometheus
      policy:
        type: none
      dr:
        type: on-premises
        velero: {}
      auth:
        provider:
          type: none

  plugins:
    kustomize: 
      - name: cert-manager-secret
        folder: ./cert-manager-secret/
      - name: storage
        folder: https://github.com/rancher/local-path-provisioner/deploy?ref=v0.0.24
    helm:
        repositories:
          - name: nfs-subdir-external-provisioner
            url: https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
          - name: portainer
            url: https://portainer.github.io/k8s
        releases:
          - name: nfs-subdir-external-provisioner
            namespace: nfs-subdir-external-provisioner
            chart: nfs-subdir-external-provisioner/nfs-subdir-external-provisioner
            set:
              - name: nfs.server
                value: ""
              - name: nfs.path
                value: ""
          - name: portainer
            namespace: portainer
            chart: portainer/portainer
            set:
              - name: image.repository
                value: portainer/portainer-ee
              - name: service.type
                value: ClusterIP

Please let me know if you need any additional information from me. I appreciate you taking the time to investigate this issue and provide guidance.

@ralgozino (Member)

Hi @syamprasadchitluri

Thanks for the info.

Your cluster is on KFD version 1.29.1; you can see it in the confirmation prompt:

WARN Distribution version changed from v1.29.1 to v1.29.2, you are about to upgrade the cluster.
Are you sure you want to continue? Only 'yes' will be accepted to confirm.
yes

Anyway, you can confirm by running the following command:

kubectl get secrets -n kube-system furyctl-config -ojsonpath='{.data.config}' | base64 -d | grep distributionVersion
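
The output should be a single line similar to this (the value will of course depend on your cluster):

  distributionVersion: v1.29.1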

You will keep facing the error until you solve the conflict. From the logs that you shared, I cannot point to the specific resource, but I can tell you that some tool using the "Go-http-client" user agent (maybe some operator?) has modified some RBAC configuration, probably the ClusterRoleBinding for kube-state-metrics, but I'm just guessing here.

Please do the following:

cd /home/user/.furyctl/getting-started/distribution/manifests
../../../bin/kustomize/3.10.0/kustomize build | kubectl diff -f -

This should output the differences between what KFD expects and what is actually in the cluster. Search for differences in rbac.authorization.k8s.io objects and delete those objects, then try upgrading again.
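
If the diff turns out to be long, you can narrow it down to the RBAC objects first, for example (a sketch: kubectl diff names its temporary files after each object's group, version and kind, so grepping the diff headers works):

../../../bin/kustomize/3.10.0/kustomize build | kubectl diff -f - | grep '^diff' | grep 'rbac.authorization.k8s.io'

Then delete the offending objects, e.g. kubectl delete clusterrolebinding <name>, and retry the upgrade.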

Let me know how it goes.
