
io_submit blocks the reactor when device's request queue fills up #70

Open
tgrabiec opened this issue Oct 21, 2015 · 7 comments

Comments

@tgrabiec
Contributor

On systems with slow disks, it's possible for the block device queue to fill up, in which case io_submit will block inside get_request, blocking the reactor thread. This manifests itself as high iowait time.

This can be remedied by increasing /sys/block/$DEV/queue/nr_requests to match the concurrency level, but that has the downside of increasing request latency without improving disk utilization. It would be better to avoid overflowing the queue at the seastar level by applying back-pressure.
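
A minimal sketch of that back-pressure idea (not Seastar's actual implementation): cap the number of in-flight requests per device with a seastar::semaphore sized below nr_requests, so io_submit never has to sleep in get_request. The cap value and function name here are hypothetical.

```cpp
#include <seastar/core/file.hh>
#include <seastar/core/future.hh>
#include <seastar/core/semaphore.hh>

// Hypothetical cap: stay safely under the device's nr_requests (often 63 by
// default) so the kernel queue can never fill up.
constexpr size_t max_in_flight = 56;
static seastar::semaphore io_slots{max_in_flight};

// buf must satisfy the usual DMA alignment requirements.
seastar::future<size_t> read_with_backpressure(seastar::file f, uint64_t pos,
                                               char* buf, size_t len) {
    // Wait for a free slot instead of overflowing the kernel queue; the
    // slot is released automatically when the I/O completes.
    return seastar::with_semaphore(io_slots, 1, [f, pos, buf, len] () mutable {
        return f.dma_read(pos, buf, len);
    });
}
```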

@tgrabiec
Contributor Author

tgrabiec commented Apr 6, 2016

@glommer Is this issue fixed completely by ioqueues?

@glommer
Contributor

glommer commented Apr 6, 2016

Theoretically yes, since we now control how many requests are in flight. It can probably still happen if we run with a high iodepth, but I don't think this case is worth fixing.

@travisdowns
Contributor

Is this issue fixed completely by ioqueues?

Pretty sure the answer is "no". At least in the case where the disk is performing worse than the io-properties suggest (which seems relatively common, at least in short bursts, for both local and network-attached disks), the IO scheduler will still let many requests into the disk; concurrency can grow high enough to exceed nr_requests, causing the reactor to block.

I'm not sure if this was better before the changes in #1766, since in principle the feedback link responded very quickly before, whereas now it takes a while to respond, and by that time you have already hit nr_requests.

Since we know that hitting nr_requests is a death sentence for the reactor, maybe we should have another hard cap just below that value, so the IO scheduler never exceeds it. That sounds a lot like the old two-bucket system (though it would work in units of "requests" rather than the cost units the rest of the scheduler deals in); I don't know whether it could be done more simply than that. One possible shape of such a cap is sketched below.
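
To make that concrete, here is a rough sketch of deriving such a hard cap from the kernel's own limit at startup. The margin and helper names are invented, and this ignores the scheduler's cost-based accounting entirely.

```cpp
#include <fstream>
#include <string>

// Read the kernel's per-device limit from sysfs; returns 0 on failure.
size_t read_nr_requests(const std::string& dev) {
    std::ifstream in("/sys/block/" + dev + "/queue/nr_requests");
    size_t n = 0;
    in >> n;
    return n;
}

// Hypothetical hard cap: keep a small safety margin under nr_requests so
// io_submit never sleeps in get_request, even if the disk stalls.
size_t hard_cap_for(const std::string& dev) {
    size_t nr = read_nr_requests(dev);  // e.g. 63
    constexpr size_t margin = 4;        // safety margin, made up
    return nr > margin ? nr - margin : 1;
}
```

The IO scheduler would then refuse to dispatch another request once the in-flight count reaches hard_cap_for(dev), regardless of what its cost-based accounting says.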

@travisdowns
Contributor

Using io_uring may or may not solve this; related question: axboe/liburing#1184

@avikivity
Member

Do you actually hit the nr_requests limit? I've seen it with spinning disks, but not SSD/NVMe.

@travisdowns
Contributor

Do you actually hit the nr_requests limit? I've seen it with spinning disks, but not SSD/NVMe.

Yes, but because the disk (EBS in this case, though it also happens with local SSDs) suffers a temporary slowdown, e.g., dropping to 1% of its usual throughput for a few hundred milliseconds. During such a hiccup we quickly exceed nr_requests (63 per device), since we are doing ~3,000 IO/s: with completions nearly stalled, 63 slots fill in roughly 20 ms. These are background writes, so this would be OK (i.e., the world wouldn't stop), except that due to the reactor stall the world does stop.

@travisdowns
Contributor

In normal operation, there are only a "few" IOs in flight, so we don't hit nr_requests.

I think this is one of the flaws in the current "feed-forward rate limiting" scheduler (as opposed to a "concurrency" scheduler): it does not cope well when the characteristics of the disk change. You need to set the IO properties to an appreciable fraction of the disk's true "nominal" performance, or else you leave a lot of IO on the table; but then, if the disk is a bit slower for any reason, the number of queued IOs grows quickly and the current feedback mechanism isn't fast enough to catch it. The toy model below illustrates the difference.
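
A toy model (not Seastar's actual scheduler; the numbers are invented, roughly matching the ones above) of why the two approaches behave differently during a hiccup:

```cpp
#include <cstdio>

int main() {
    const double submit_rate = 3000;  // IO/s the rate limiter keeps admitting
    const double stalled_rate = 30;   // disk completes 1% of normal
    const double hiccup_s = 0.5;      // a 500 ms slowdown

    // Feed-forward rate limiting: admissions depend only on the configured
    // rate, not on completions, so during the hiccup in-flight IOs grow at
    // (submit_rate - completion_rate) per second.
    double in_flight = (submit_rate - stalled_rate) * hiccup_s;
    std::printf("rate limiter: ~%.0f IOs in flight (nr_requests is 63)\n",
                in_flight);

    // A concurrency limiter instead bounds in-flight IOs by construction:
    // new submissions wait for completions, whatever the disk is doing.
    const double window = 56;  // e.g. just under nr_requests
    std::printf("concurrency limiter: at most %.0f in flight\n", window);
}
```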
