Gluster 11 NVME VM disk corruption. #4484

Open
gilbertoferreira opened this issue Feb 22, 2025 · 2 comments


gilbertoferreira commented Feb 22, 2025

Hi there.

I'd like to know if there are any issues with GlusterFS and NVME.
This week I had two customers for whom I built a 2-node Proxmox VE setup with GlusterFS 11.
I created the bricks as follows.
On both nodes I ran:

mkdir /data1
mkdir /data2
mkfs.xfs /dev/nvme1
mkfs.xfs /dev/nvme2
mount /dev/nvme1 /data1
mount /dev/nvme2 /data2
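
For reference, a minimal sketch of how these mounts could be made persistent across reboots (assuming the device names /dev/nvme1 and /dev/nvme2 as written above; on most systems the NVMe namespaces actually appear as /dev/nvme0n1, /dev/nvme1n1, etc.):

# hypothetical /etc/fstab entries; adjust device names (or use UUIDs) for your system
/dev/nvme1  /data1  xfs  defaults,noatime  0 0
/dev/nvme2  /data2  xfs  defaults,noatime  0 0
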
I installed Gluster like this:
wget -qO - https://download.gluster.org/pub/gluster/glusterfs/11/rsa.pub | gpg --dearmor -o /etc/apt/trusted.gpg.d/gluster.gpg

echo "deb https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/bookworm/amd64/apt bookworm main" > /etc/apt/sources.list.d/gluster.list

After installing GlusterFS and doing the peer probe, I ran:

gluster vol create VMS replica 2 gluster1:/data1/vms gluster2:/data1/vms gluster1:/data2/vms gluster2:/data2/vms
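
With four bricks and replica 2 this should come up as a 2 x 2 distributed-replicate volume, which can be sanity-checked with (a sketch, expected output abbreviated):

gluster vol info VMS
# expect something like:
#   Type: Distributed-Replicate
#   Number of Bricks: 2 x 2 = 4
gluster vol start VMS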

To work around the split-brain issue, I applied these configurations:
gluster vol set VMS cluster.heal-timeout 5
gluster vol heal VMS enable
gluster vol set VMS cluster.quorum-reads false
gluster vol set VMS cluster.quorum-count 1
gluster vol set VMS network.ping-timeout 2
gluster vol set VMS cluster.favorite-child-policy mtime
gluster vol heal VMS granular-entry-heal enable
gluster vol set VMS cluster.data-self-heal-algorithm full
gluster vol set VMS features.shard on
gluster vol set VMS performance.write-behind off
gluster vol set VMS performance.flush-behind off
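
The applied values can be double-checked per option with gluster volume get (a quick sketch, using the VMS volume from above):

# verify a few of the options set above
gluster volume get VMS cluster.quorum-count
gluster volume get VMS features.shard
gluster volume get VMS performance.write-behind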

This configuration allows me to power down the first server and have the VMs restart on the secondary server with no issues at all.
I have the very same scenario at another customer, but there we are working with Kingston DC600M SSDs.

It turns out that on the servers with NVMe I get a lot of disk corruption inside the VMs.
If I reboot, things get worse.

Does anybody know of any cases of Gluster and NVMe issues like this?
Is there any fix for it?

Thanks

@pranithk (Member) commented:

A replica 2 volume is not good for consistency; the CLI even warns about this while creating the volume. I wouldn't use it for a production workload.
Also, these are the recommended options for VM workloads:

$ cat extras/group-virt.example
performance.quick-read=off
performance.read-ahead=off
performance.io-cache=off
performance.low-prio-threads=32
network.remote-dio=disable
performance.strict-o-direct=on
cluster.eager-lock=enable
cluster.quorum-type=auto
cluster.server-quorum-type=server
cluster.data-self-heal-algorithm=full
cluster.locking-scheme=granular
cluster.shd-max-threads=8
cluster.shd-wait-qlength=10000
features.shard=on
user.cifs=off
cluster.choose-local=off
client.event-threads=4
server.event-threads=4
performance.client-io-threads=on
network.ping-timeout=20
server.tcp-user-timeout=20
server.keepalive-time=10
server.keepalive-interval=2
server.keepalive-count=5
cluster.lookup-optimize=off
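
These options also ship with the glusterfs packages as the "virt" group profile (typically under /var/lib/glusterd/groups/virt), so they can be applied in one step; a sketch, assuming the VMS volume name used above:

# apply the whole VM-workload profile at once
gluster volume set VMS group virt
# spot-check a couple of the values afterwards
gluster volume get VMS performance.strict-o-direct
gluster volume get VMS network.remote-dio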

@gilbertoferreira (Author) commented:

Yes, I know replica 2 is not recommended.
But I have used this setup for a couple of years and nothing as bad as what is happening now has ever occurred.
And it is only with NVMe. The VMs were sitting there for months and then, all of a sudden, they crashed.
With SSD everything is OK.
