error generating accessibility requirements #528

Open
ianb-mp opened this issue Apr 19, 2024 · 4 comments
Labels
invalid This doesn't seem right

Comments


ianb-mp commented Apr 19, 2024

What steps did you take and what happened:

After a k8s node reboot, creating a new PVC with the zfs-localpv storageclass fails with this error:

error generating accessibility requirements: topologyKeys [...] were not found on any nodes

(Full error message here)

A temporary fix is to restart the openebs-zfs-localpv-node daemonset; however, the error returns whenever I reboot the k8s node.

What did you expect to happen:

I assume this isn't expected behaviour, so it would be good if this could be resolved without requiring manual intervention.

The output of the following commands will help us better understand what's going on:

  • kubectl logs -f openebs-zfs-controller-[xxxx] -n openebs -c openebs-zfs-plugin see gist
  • kubectl logs -f openebs-zfs-node-[xxxx] -n openebs -c openebs-zfs-plugin see gist
  • kubectl get pods -n openebs
NAME                                              READY   STATUS    RESTARTS       AGE
openebs-localpv-provisioner-6b8bff68bd-prpfv      1/1     Running   1 (14m ago)    90m
openebs-lvm-localpv-controller-59cd69bd56-hfjmn   5/5     Running   10 (14m ago)   2d4h
openebs-lvm-localpv-node-wb2mb                    2/2     Running   4 (14m ago)    2d4h
openebs-zfs-localpv-controller-6f879b95b-jbx8p    5/5     Running   10 (14m ago)   2d4h
openebs-zfs-localpv-node-44f9n                    2/2     Running   3 (13m ago)    30m
  • kubectl get zv -A -o yaml see gist

Anything else you would like to add:

zfs-localpv was installed via Openebs helm chart v4.0.0:

helm repo add openebs https://openebs.github.io/openebs
helm repo update
helm install openebs --namespace openebs openebs/openebs --create-namespace

[Miscellaneous information that will assist in solving the issue.]

Environment:

  • LocalPV-ZFS version: v2.5.1
  • Kubernetes version (use kubectl version):
Client Version: v1.29.3+rke2r1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.3+rke2r1
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration: bare metal, x86_64
  • OS (e.g. from /etc/os-release): Rocky Linux 9.3
Contributor

w3aman commented Apr 19, 2024

Restarting the zfs-node daemonset is a must, so that the driver can pick up the required topologies if they are added after the driver is installed. One alternative is to set the topologies at install time, so that the node agents do not need to be restarted later for those keys. See https://github.com/openebs/zfs-localpv/blob/develop/docs/faq.md#6-how-to-add-custom-topology-key

As for rebooting the node, that should not be the behaviour. I have come across node-reboot scenarios myself, and volume provisioning worked fine for me. Can you please share your storage class yaml and the output of kubectl get csinode <node-name> -o yaml, and verify that the same key used in the storage class is present on the node?
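The verification requested above can be sketched in Python. This is a minimal illustration, not the driver's or the provisioner's code. It assumes the CSINode and Node objects were exported as JSON (for example with -o json instead of -o yaml) and loaded into dicts. The function name missing_topology_keys, the node name and the sample label values are hypothetical.

```python
def missing_topology_keys(csinode, node, driver):
    """Return the topology keys registered for `driver` in the CSINode
    that are not present as label keys on the Node."""
    labels = node.get("metadata", {}).get("labels") or {}
    for drv in csinode.get("spec", {}).get("drivers") or []:
        if drv.get("name") == driver:
            keys = drv.get("topologyKeys") or []
            return sorted(k for k in keys if k not in labels)
    raise KeyError(f"driver {driver!r} is not registered in this CSINode")


# Hypothetical sample data shaped like `kubectl get csinode/node -o json` output.
csinode = {
    "spec": {
        "drivers": [
            {
                "name": "zfs.csi.openebs.io",
                "nodeID": "node-1",
                "topologyKeys": [
                    "kubernetes.io/hostname",
                    "scheduling.node.kubevirt.io/tsc-frequency-2200000000",
                ],
            }
        ]
    }
}
node = {
    "metadata": {
        "labels": {
            "kubernetes.io/hostname": "node-1",
            "scheduling.node.kubevirt.io/tsc-frequency-2199997000": "true",
        }
    }
}

print(missing_topology_keys(csinode, node, "zfs.csi.openebs.io"))
# prints ['scheduling.node.kubevirt.io/tsc-frequency-2200000000']
```

An empty list would mean every registered topology key is still present on the node.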

Author

ianb-mp commented Apr 19, 2024

Can you please share your storage class yaml

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2024-04-16T22:33:59Z"
  name: openebs-zfspv
  resourceVersion: "1294905"
  uid: e45d8212-7b24-481c-88ea-194ee5a27f21
parameters:
  compression: "off"
  dedup: "off"
  fstype: zfs
  poolname: zfspv-pool
  recordsize: 128k
provisioner: zfs.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: Immediate

kubectl get csinode -o yaml

See gist here

and verify that same key which is used in storage class is present on the node

Do you mean compare CSINode with node labels? Node labels are here

I restarted the openebs-zfs-localpv-node daemonset, then compared the CSINode topologyKeys before and after the restart. Only this value has changed:

<       - scheduling.node.kubevirt.io/tsc-frequency-2200000000
---
>       - scheduling.node.kubevirt.io/tsc-frequency-2199997000
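The same comparison can be scripted. Below is a minimal Python sketch, assuming the two topologyKeys lists were saved from separate kubectl get csinode runs. The function name diff_topology_keys and the unchanged sample key are hypothetical; left and right correspond to the `<` and `>` sides of the diff above.

```python
def diff_topology_keys(left, right):
    """Return (only_in_left, only_in_right) for two lists of topology keys."""
    left_set, right_set = set(left), set(right)
    return sorted(left_set - right_set), sorted(right_set - left_set)


# Sample snapshots; only the tsc-frequency key differs between them.
left = [
    "kubernetes.io/hostname",
    "scheduling.node.kubevirt.io/tsc-frequency-2200000000",
]
right = [
    "kubernetes.io/hostname",
    "scheduling.node.kubevirt.io/tsc-frequency-2199997000",
]

only_left, only_right = diff_topology_keys(left, right)
for key in only_left:
    print("<", key)
for key in only_right:
    print(">", key)
```

Using sets rather than a line-based diff ignores ordering changes in the key list, which do not matter for topology matching.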

Author

ianb-mp commented Apr 22, 2024

I understand the issue is caused by the topology keys in CSINode not matching the node labels after a reboot. This can be fixed by updating the openebs-zfs-localpv-node daemonset and changing ALLOWED_TOPOLOGIES from all to a list of specific labels that I know won't change on reboot.
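One way to build such a list is to filter out label keys that embed a measured value. The Python sketch below is only an illustration: the regular expression, the function names and the sample keys are hypothetical, and it assumes ALLOWED_TOPOLOGIES takes a comma-separated list of label keys, which should be checked against the linked FAQ.

```python
import re

# Assumption: keys matching these patterns may change across reboots.
# Only the KubeVirt TSC frequency key from this thread is covered here.
DYNAMIC_KEY_PATTERNS = [
    re.compile(r"^scheduling\.node\.kubevirt\.io/tsc-frequency-\d+$"),
]


def stable_topology_keys(label_keys, patterns=DYNAMIC_KEY_PATTERNS):
    """Return the label keys that match none of the dynamic patterns."""
    return sorted(k for k in label_keys if not any(p.match(k) for p in patterns))


def allowed_topologies_value(label_keys):
    """Join the stable keys into one comma-separated string."""
    return ",".join(stable_topology_keys(label_keys))


# Hypothetical node label keys.
label_keys = [
    "kubernetes.io/hostname",
    "openebs.io/nodeid",
    "scheduling.node.kubevirt.io/tsc-frequency-2199997000",
]
print(allowed_topologies_value(label_keys))
# prints kubernetes.io/hostname,openebs.io/nodeid
```

In practice a short explicit list of keys that the storage classes actually use is probably safer than filtering every label on the node.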

It seems odd that Kubevirt uses a node label with a dynamic key value, but I guess there must be a good reason.

I appreciate the need for the topology keys to match the node etc, but I wonder if there is a better default approach for zfs-localpv that will permit it to work with Kubevirt (and operators like it) 'out of the box' (i.e. without needing special configuration)?

@sinhaashish
Member

Hi @ianb-mp,
As @w3aman said, you can use this doc https://github.com/openebs/zfs-localpv/blob/develop/docs/faq.md#6-how-to-add-custom-topology-key to add the custom label.

As I understand it, in your case KubeVirt has a dynamic label which changes upon node restart. The new label needs to be updated in the daemonset directly, by editing it and then restarting it.
The default is specified as all, which takes the node label keys as the allowed topologies. You can change this by editing the daemonset.
In your case the node labels change upon restart, which should not happen.
If a node label changes, you need to update the daemonset and restart it to reflect the change.
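The behaviour described in this thread can be illustrated with a toy model in Python. It is not the driver's code: it only assumes, as stated above, that all means "use every node label key" and that the keys are read once when the node plugin starts. The function name resolve_topology_keys and the sample labels are hypothetical.

```python
def resolve_topology_keys(allowed_topologies, node_labels):
    """Toy model: 'all' selects every label key, otherwise a comma-separated list."""
    if allowed_topologies.strip() == "all":
        return sorted(node_labels)
    return [k.strip() for k in allowed_topologies.split(",") if k.strip()]


def stale_keys(registered_keys, current_labels):
    """Keys registered at startup that no longer exist as node labels."""
    return [k for k in registered_keys if k not in current_labels]


# Hypothetical labels when the plugin started and after a reboot.
labels_at_start = {
    "kubernetes.io/hostname": "node-1",
    "scheduling.node.kubevirt.io/tsc-frequency-2200000000": "true",
}
labels_after_reboot = {
    "kubernetes.io/hostname": "node-1",
    "scheduling.node.kubevirt.io/tsc-frequency-2199997000": "true",
}

with_all = resolve_topology_keys("all", labels_at_start)
pinned = resolve_topology_keys("kubernetes.io/hostname", labels_at_start)

print(stale_keys(with_all, labels_after_reboot))
# prints ['scheduling.node.kubevirt.io/tsc-frequency-2200000000']
print(stale_keys(pinned, labels_after_reboot))
# prints []
```

Under these assumptions, the all setting leaves one stale key registered after the label changes, while a pinned list of stable keys is unaffected.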

sinhaashish added the invalid label Jun 5, 2024