Snowball Edge EC2s: block device mapping not consistent with driver #2294

Open
t-quick opened this issue Jan 21, 2025 · 1 comment
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@t-quick

t-quick commented Jan 21, 2025

/kind bug

On Snowball Edge EC2 instances, after a cluster has been up for a number of hours or days and pods have been moved around, the ebs-csi driver can no longer set up volumes for new PVCs on the node. We get messages like the one below. The workaround is to cordon and drain the node, then reboot it. The block devices are being created on the EC2 instances, but under a different device name than the one the driver is looking for. In the example below, the driver tried to format /dev/vdg, while the volume had actually been created as /dev/vdi with the appropriate size.

Warning FailedMount 15s (x7 over 47s) kubelet MountVolume.MountDevice failed for volume "pvc-dcb8d5a7-2c7b-4d56-9ad5-38af06a8c38e" : rpc error: code = Internal desc = could not format "/dev/vdg" and mount it at "/var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/78bad98f12b3707d05490eafe2f1515e9e0c5eb0fa49800d184f11afd6899b93/globalmount": format of disk "/dev/vdg" failed: type:("ext4") target:("/var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/78bad98f12b3707d05490eafe2f1515e9e0c5eb0fa49800d184f11afd6899b93/globalmount") options:("defaults") errcode:(exit status 1) output:(mke2fs 1.46.5 (30-Dec-2021)
The file /dev/vdg does not exist and no size was specified.
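
For anyone trying to confirm the same mismatch on a node, a minimal diagnostic sketch follows. It pulls the device path out of an event message like the one above and compares it with the virtio disks actually present under /dev. The helper names (`expected_device`, `present_virtio_disks`, `report`) are hypothetical and not part of the driver, and the `vd*` naming is assumed from the /dev/vdg and /dev/vdi names seen here.

```python
import os
import re
from typing import Dict, List, Optional

# Matches the device path in the error text, e.g. 'could not format "/dev/vdg"'.
_FORMAT_ERR = re.compile(r'could not format "(?P<dev>/dev/[A-Za-z0-9]+)"')

# Whole-disk virtio names (vda, vdb, ...); partitions such as vda1 are excluded.
_VIRTIO_DISK = re.compile(r"^vd[a-z]+$")


def expected_device(event_message: str) -> Optional[str]:
    """Return the device path the driver tried to format, or None if not found."""
    match = _FORMAT_ERR.search(event_message)
    return match.group("dev") if match else None


def present_virtio_disks(dev_dir: str = "/dev") -> List[str]:
    """Return the sorted names of virtio whole disks that exist in dev_dir."""
    return sorted(name for name in os.listdir(dev_dir) if _VIRTIO_DISK.match(name))


def report(event_message: str, dev_dir: str = "/dev") -> Dict[str, object]:
    """Compare the device named in the event with what exists on the node."""
    wanted = expected_device(event_message)
    present = present_virtio_disks(dev_dir)
    missing = wanted is not None and os.path.basename(wanted) not in present
    return {"expected": wanted, "present": present, "missing": missing}
```

Run on the affected node with the text of the FailedMount event, `report(...)["missing"]` being true together with an extra disk in `present` would match the behaviour described above. It only observes the mismatch; it does not say which side chose the wrong name.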

Expected: driver configures storage properly

Additional info:
This is the EC2 instance metadata that is available.

https://docs.aws.amazon.com/snowball/latest/developer-guide/edge-compute-instance-metadata.html

There is no block device info available here. Looking at the driver README, it looks like that information may be necessary, and using Kubernetes metadata does not provide it either. This may just be life on Snowball Edge devices. I'm writing this to find out which tree is the right one to bark up, and to document the problem for other Snowball Edge users experiencing similar issues.
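
A quick way to check this from inside an instance is to ask the metadata endpoint for the `block-device-mapping/` category, which standard EC2 exposes. The sketch below is an assumption-laden illustration: it uses a plain, token-less GET (IMDSv1 style), the helper names are hypothetical, and whether a Snowball Edge instance answers this path at all is exactly the open question.

```python
import urllib.error
import urllib.request
from typing import Callable, List

# Standard EC2 instance metadata base URL.
IMDS_BASE = "http://169.254.169.254/latest/meta-data/"


def imds_get(path: str, timeout: float = 2.0) -> str:
    """Plain GET against the metadata endpoint; raises OSError subclasses on failure."""
    with urllib.request.urlopen(IMDS_BASE + path, timeout=timeout) as resp:
        return resp.read().decode()


def block_device_categories(fetch: Callable[[str], str] = imds_get) -> List[str]:
    """Return the entries under block-device-mapping/, or [] if the category is absent.

    `fetch` is injectable so the logic can be exercised without network access.
    """
    try:
        body = fetch("block-device-mapping/")
    except OSError:
        # URLError, HTTPError (e.g. 404) and socket timeouts all derive from OSError.
        return []
    return [line.strip() for line in body.splitlines() if line.strip()]
```

An empty list from `block_device_categories()` on a Snowball Edge instance would be consistent with the documentation linked above listing no block device information.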

Environment

  • Kubernetes version (use kubectl version): v1.31.3
  • Driver version: 2.36.0
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jan 21, 2025
@AndrewSirenko
Contributor

Hi @t-quick, thank you for filing this Snowball Edge + EBS CSI Driver issue. We especially appreciate you attaching a workaround for other impacted users!

I will try to find an internal expert on Snow devices to help troubleshoot this issue.

In the meantime, I would encourage you to also open an official AWS support case referencing this GitHub Issue. That would help us bark up more trees. Thank you.
