Contention in io_buffer_select when IOSQE_BUFFER_SELECT and IOSQE_ASYNC are used together #669

Open
jrudolph opened this issue Sep 29, 2022 · 2 comments

@jrudolph
Contributor

I'm trying out io_uring and am testing different ways of submitting requests. My test is a simple webserver-like application that accepts multiple connections and then alternately reads from and writes to each socket. Everything runs on a single application thread with a single ring.

In general, using IOSQE_ASYNC does not seem to make much sense for network reads because it often does strictly more work than the default path. On the other hand, for a single-threaded server, much CPU time is spent inside the kernel TCP stack, so using IOSQE_ASYNC could help by freeing the application thread for other work while the kernel threads do the heavy lifting.

Looking into the performance with Linux 5.19.11, I noticed that the flamegraph shows a lot of time spent selecting buffers from the provided buffers:

[image: flamegraph]

Zooming in on io_read:

[image: flamegraph zoomed in on io_read]

This is with at most 128 concurrent reads. In that scenario the number of concurrent wqe_workers seems to get quite high (maybe even one per request?), so a mutex in the buffer selection path cannot work well if many or all of the sockets become readable at the same time.

Is this contention expected, and should it be documented?

@axboe
Owner

axboe commented Sep 29, 2022

I would not recommend using provided buffers with IOSQE_ASYNC. As you have noticed, they need to serialize on the ring mutex. This is generally not a concern, but it certainly becomes one if you have a lot of io-wq activity due to marking the SQEs async. You'll be better off setting aside some threads in userspace, each with its own ring, and using provided buffers with those.

In general, IOSQE_ASYNC isn't very efficient and should be avoided for most use cases.
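
For readers who want to see the flag combination being discussed, here is a minimal sketch (not code from this thread) that fills an SQE by hand for a recv using a provided-buffer group. It uses only the kernel's `<linux/io_uring.h>` definitions. The helper name `prep_recv_buffer_select` and its `force_async` parameter are hypothetical and exist only for illustration. Leaving `force_async` at 0 gives the default path; setting it adds IOSQE_ASYNC, which is the combination reported above as contended.

```c
#include <linux/io_uring.h>  /* struct io_uring_sqe, IORING_OP_RECV, IOSQE_* */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: prepare a recv SQE whose buffer is picked by the
 * kernel from the provided-buffer group `buf_group`.
 *
 * force_async == 0: default path, only IOSQE_BUFFER_SELECT is set.
 * force_async != 0: also sets IOSQE_ASYNC, so the request is punted to
 *                   io-wq; this is the combination the issue is about.
 */
static void prep_recv_buffer_select(struct io_uring_sqe *sqe, int fd,
                                    uint32_t max_len, uint16_t buf_group,
                                    uint64_t user_data, int force_async)
{
    memset(sqe, 0, sizeof(*sqe));
    sqe->opcode = IORING_OP_RECV;
    sqe->fd = fd;
    sqe->addr = 0;               /* no user buffer: one is selected for us */
    sqe->len = max_len;          /* upper bound on bytes to receive */
    sqe->buf_group = buf_group;  /* which provided-buffer group to use */
    sqe->user_data = user_data;  /* echoed back in the completion */
    sqe->flags = IOSQE_BUFFER_SELECT;
    if (force_async)
        sqe->flags |= IOSQE_ASYNC;
}
```

With liburing, the same request would typically be built with `io_uring_prep_recv()` followed by `io_uring_sqe_set_flags()` and an assignment to `sqe->buf_group`. Following the advice above, the sketch would be called with `force_async` set to 0, and parallelism would come from several userspace threads that each own a ring and a buffer group.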

@jrudolph
Contributor Author

Thanks for the quick answer. I agree, there are good alternatives.
