
---

A first look at your ZSet read numbers scaling linearly with the number of fio processes suggests that the iodepth parameter may not be working as expected. Considering that ZFS does not support asynchronous operation, I suspect libaio just executes requests one at a time per process. I haven't looked at its code, but from our look at io_uring some time ago I remember that the latter relies on tight integration of the file system with the page cache, and since ZFS does not use the page cache, it is not in a good position here. The other question is whether your actual application will use libaio, or whether you are testing something you may not really care about. If your workload will include many processes, you may want to switch from the libaio backend to psync and scale the number of processes to the planned count.

Second, your tests show dramatically better performance for the zvol than for the file system. That can have only one explanation: the API you use to do the I/O (libaio) behaves differently for them, possibly because zvols decouple execution into different threads, while libaio running against the file system does not. In general the performance of a zvol should be identical to that of a single file on a file system, since that is what zvols are inside.

Third, using ext4 on top of a zvol makes no sense, and the only reason it is faster than native ZFS in some of your tests is its better integration with the page cache and libaio. Otherwise it should be a total waste of resources.

And last, you say that your target workload will include a huge number of small files, yet at the same time you are testing ONE zvol and ONE file. ZFS has a number of optimizations to scale out performance when possible, while you are putting it in the most difficult situation: a single object with an insanely small block size. You are not testing what you likely should. Test your real workload!
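For example, a minimal fio invocation along those lines (psync backend, scaling processes rather than iodepth) might look like the following; the target path, file size, and runtime are placeholders, not values from this thread:

```bash
# one synchronous 4k random-read process per job; scale the load with --numjobs
fio --name=psync-randread \
    --ioengine=psync --direct=1 \
    --rw=randread --bs=4k \
    --filename=/tank/zset/testfile --size=32G \
    --numjobs=32 --time_based --runtime=60 \
    --group_reporting
```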

---

In addition to what @amotin mentioned:

---

@amotin Thanks for sharing your insight; I appreciate the time. I've updated the graphs with testing on 60 cores and added a FAQ section answering why 4k, why libaio, why not real workloads, and so on.
No. libaio just means that fio simply maintains up to `iodepth` outstanding asynchronous requests per job.
I tested up to 60 cores, and the levelling off is starting to show (see updated charts), which is the outcome I suspected.
Good point; this may well be the cause. At this point, …

---

Looks like déjà vu to me. Very similar discussion at #16993 (comment)

---

Hello, I am evaluating ZFS for a series of projects, each with varying storage requirements, and I am seeing some surprising results. I'm hoping someone can confirm that these numbers make sense and/or help me tune things.
The data here is produced by a script, which has all the details on pool creation and the fio tests. The rationale is explained later, but briefly: the following is data for a VDEV consisting of a single Micron 7450 PRO 7.68TB (rated for 1M 4k read IOPS), tested using fio with libaio and direct I/O on ZFS 2.3.0. The different curves in each chart correspond to the various files fio ran the tests on; a sketch of one such fio job follows the list of workloads below.
Charts (one per workload):

- 100% 4k random reads
- 4k sequential read
- 4k sequential write
- Random R/W with 90% reads
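For concreteness, one of these jobs might look roughly like the following fio job file; the path, file size, iodepth, and runtime are placeholders, and the authoritative definitions are in the linked script:

```ini
; 100% 4k random reads with direct I/O through libaio (illustrative values only)
[global]
ioengine=libaio
direct=1
bs=4k
iodepth=32
time_based=1
runtime=60
group_reporting=1

[randread-4k]
rw=randread
; target file on whichever ZSet / ZVol_Ext4 / Disk_Ext4 mount is being tested
filename=/mnt/target/testfile
size=32G
; numjobs is supplied on the command line (--numjobs) by the test-bench sweep
```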
Questions
Appreciate any inputs!
Anticipated workload (one of many; this one is the subject of this thread)
Goals for evaluation
Test System
Test Bench
- `--numjobs={1,4,8,16,32}` to test scaling across CPU cores
- `blkdiscard` the drive for each of the 20 combinations (numjobs x device) and run the 4 workloads in the order above (a sketch of the sweep loop is shown below)
- the `blkdiscard` is to ensure that the drive's FTL is empty
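A rough sketch of how such a sweep could be scripted; the device path, target names, and mount points are placeholders, and the linked script remains the authoritative version:

```bash
#!/usr/bin/env bash
# Sweep: 5 numjobs values x 4 targets = 20 combinations, 4 workloads each.
set -euo pipefail

DEVICE=/dev/nvme0n1                        # raw NVMe drive under test (placeholder)
TARGETS=(zset zvol zvol_ext4 disk_ext4)    # assumed names for the four tested targets
WORKLOADS=(randread read write randrw)     # the 4 patterns above, as fio rw= values

for jobs in 1 4 8 16 32; do
  for target in "${TARGETS[@]}"; do
    blkdiscard "$DEVICE"                   # empty the drive's FTL before this combination
    # ...re-create the pool / dataset / zvol / ext4 filesystem for "$target" here...
    for rw in "${WORKLOADS[@]}"; do
      # --rwmixread only affects the mixed "randrw" workload; fio ignores it otherwise
      fio --name="${target}-${rw}-j${jobs}" \
          --ioengine=libaio --direct=1 --bs=4k --iodepth=32 \
          --rw="$rw" --rwmixread=90 \
          --filename="/mnt/${target}/testfile" --size=32G \
          --numjobs="$jobs" --time_based --runtime=60 --group_reporting
    done
  done
done
```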
Pool, ZSet & ZVol parameters (see full script)
ZFS module parameters
FAQ
Question: Why test with fio and libaio rather than real-world application usage?
At the moment, I am trying to understand the performance characteristics of ZFS and form a mental model. Something relatively simple like "it can do 100K IOPS/core with just basic checksumming and raidz2" is valuable to me. In Machine Learning, one avoids overfitting to the data at hand. Similarly, I try to avoid over-optimizing for my particular workload because the workloads will evolve.
In short, if I can't predict how ZFS will behave from an IOPS and throughput perspective, I won't use it.
Question: EXT4 on ZFS is a waste; why are you trying it?
In terms of filesystem features, `ZSet > ZVol_Ext4 > Disk_Ext4`. `ZVol_Ext4` offers more than `Disk_Ext4`: it can do integrity, and it can also do snapshotting. `ZSet` offers as much as `ZVol_Ext4` but with more performance and perhaps some more features.

As I form a mental model of ZFS, I like to do basic sanity checks and understand "what am I getting for what cost?". Here, ZFS burns CPU for integrity and additional features; the question is how much. Then I can decide whether the cost is worth it for my purpose.

Direct IO: `ZSet` underperforms `ZVol_Ext4` for up to 15 cores in all patterns, and for up to 25 cores in read-heavy workloads. I wouldn't have predicted that without this test. It's good to do sanity checks to confirm one's mental model. (A sketch of the three targets being compared is shown below.)
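For reference, a rough sketch of how the three targets compared above might be created; the pool name, sizes, mount points, and property values are placeholders rather than the ones from the linked script:

```bash
# Each target is created on the freshly blkdiscarded drive in turn, not all at once.

# ZSet: a native ZFS dataset (placeholder properties)
zpool create tank /dev/nvme0n1
zfs create -o recordsize=4k tank/zset

# ZVol_Ext4: ext4 on top of a zvol, so ext4 still gets ZFS integrity and snapshots underneath
zfs create -V 1T -o volblocksize=4k tank/zvol
mkfs.ext4 /dev/zvol/tank/zvol
mount /dev/zvol/tank/zvol /mnt/zvol_ext4

# Disk_Ext4: ext4 directly on the raw NVMe namespace, with no ZFS in the path (baseline)
mkfs.ext4 /dev/nvme0n1
mount /dev/nvme0n1 /mnt/disk_ext4
```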