Gluster 11 NVME VM disk corruption. #4484

Open
gilbertoferreira opened this issue Feb 22, 2025 · 2 comments


gilbertoferreira commented Feb 22, 2025

Hi there.

I'd like to know if there are any issues with GlusterFS and NVME.
This week I had two customers for whom I built a 2-node Proxmox VE setup with GlusterFS 11.
I created the bricks as follows.
On both nodes I ran:

mkdir /data1
mkdir /data2
mkfs.xfs /dev/nvme1
mkfs.xfs /dev/nvme2
mount /dev/nvme1 /data1
mount /dev/nvme2 /data2
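
For reference, a minimal sketch of how these mounts could be made persistent across reboots (assuming the device names /dev/nvme1 and /dev/nvme2 as written above; on most systems the NVMe namespaces actually appear as /dev/nvme0n1, /dev/nvme1n1, etc.):

# hypothetical /etc/fstab entries; adjust device names (or use UUIDs) for your system
/dev/nvme1  /data1  xfs  defaults,noatime  0 0
/dev/nvme2  /data2  xfs  defaults,noatime  0 0
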
I installed Gluster like this:
wget -qO - https://download.gluster.org/pub/gluster/glusterfs/11/rsa.pub | gpg --dearmor -o /etc/apt/trusted.gpg.d/gluster.gpg

echo "deb https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/bookworm/amd64/apt bookworm main" > /etc/apt/sources.list.d/gluster.list

After installing GlusterFS and doing the peer probe, I ran:

gluster vol create VMS replica 2 gluster1:/data1/vms gluster2:/data1/vms gluster1:/data2/vms gluster2:/data2/vms
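
With four bricks and replica 2 this should come up as a 2 x 2 distributed-replicate volume, which can be sanity-checked with (a sketch, expected output abbreviated):

gluster vol info VMS
# expect something like:
#   Type: Distributed-Replicate
#   Number of Bricks: 2 x 2 = 4
gluster vol start VMS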

To work around the split-brain issue, I applied these configurations:
gluster vol set VMS cluster.heal-timeout 5
gluster vol heal VMS enable
gluster vol set VMS cluster.quorum-reads false
gluster vol set VMS cluster.quorum-count 1
gluster vol set VMS network.ping-timeout 2
gluster vol set VMS cluster.favorite-child-policy mtime
gluster vol heal VMS granular-entry-heal enable
gluster vol set VMS cluster.data-self-heal-algorithm full
gluster vol set VMS features.shard on
gluster vol set VMS performance.write-behind off
gluster vol set VMS performance.flush-behind off
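
The applied values can be double-checked per option with gluster volume get (a quick sketch, using the VMS volume from above):

# verify a few of the options set above
gluster volume get VMS cluster.quorum-count
gluster volume get VMS features.shard
gluster volume get VMS performance.write-behind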

This configuration allows me to power down the first server and have the VMs restart on the secondary server with no issues at all.
I have the very same scenario at another customer, but there we are working with Kingston DC600M SSDs.

It turns out that on the servers with NVMe I get a lot of disk corruption inside the VMs.
If I reboot, things get worse.

Does anybody know of any cases of Gluster and NVMe issues like this?
Is there any fix for it?

Thanks

@pranithk (Member) commented:

A replica 2 volume is not good for consistency; the CLI even warns about this while creating the volume. I wouldn't use it for a production workload.
Also, these are the recommended options for VM workloads:

$ cat extras/group-virt.example
performance.quick-read=off
performance.read-ahead=off
performance.io-cache=off
performance.low-prio-threads=32
network.remote-dio=disable
performance.strict-o-direct=on
cluster.eager-lock=enable
cluster.quorum-type=auto
cluster.server-quorum-type=server
cluster.data-self-heal-algorithm=full
cluster.locking-scheme=granular
cluster.shd-max-threads=8
cluster.shd-wait-qlength=10000
features.shard=on
user.cifs=off
cluster.choose-local=off
client.event-threads=4
server.event-threads=4
performance.client-io-threads=on
network.ping-timeout=20
server.tcp-user-timeout=20
server.keepalive-time=10
server.keepalive-interval=2
server.keepalive-count=5
cluster.lookup-optimize=off
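
These options also ship with the glusterfs packages as the "virt" group profile (typically under /var/lib/glusterd/groups/virt), so they can be applied in one step; a sketch, assuming the VMS volume name used above:

# apply the whole VM-workload profile at once
gluster volume set VMS group virt
# spot-check a couple of the values afterwards
gluster volume get VMS performance.strict-o-direct
gluster volume get VMS network.remote-dio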

@gilbertoferreira (Author) commented:

Yes, I know replica 2 is not recommended.
But I have used this setup for a couple of years and nothing as bad as what is happening now has ever occurred.
And it is only with NVMe. The VMs were sitting there for months and then, all of a sudden, they crashed.
With SSD everything is OK.
