[RFC] Add `#[diagnostic::blocking]` attribute #3639

estebank · 2024-05-17T18:41:40Z

#[diagnostic::blocking] is a marker attribute for functions that are considered (by their author) to be a blocking operation, and as such shouldn't be invoked from an async function. rustc, clippy and other tools can use this signal to lint against calling these functions in async contexts.

It would allow rustc, clippy and other tools to provide lints like

warning: async function `foo` can block
  --> $DIR/blocking-calls-in-async.rs:28:1
   |
28 | async fn foo() {
   | --------------
29 |     interesting();
   |     ^^^^^^^^^^^^^`foo` is determined to block because it calls into `interesting`
   |
note: `interesting` is considered to be blocking because it was explicitly marked as such
  --> $DIR/blocking-calls-in-async.rs:5:1
   |
5  | #[diagnostic::blocking]
   | ^^^^^^^^^^^^^^^^^^^^^^^
6  | fn interesting() {}
   | ----------------

Rendered

`#[diagnostic::blocking]` is a marker attribute for functions that are considered (by their author) to be a blocking operation, and as such shouldn't be invoked from an `async` function. `rustc`, `clippy` and other tools can use this signal to lint against calling these functions in `async` contexts. It would allow `rustc`, `clippy` and other tools to provide lints like ``` warning: async function `foo` can block --> $DIR/blocking-calls-in-async.rs:28:1 | 28 | async fn foo() { | -------------- 29 | interesting(); | ^^^^^^^^^^^^^`foo` is determined to block because it calls into `interesting` | note: `interesting` is considered to be blocking because it was explicitly marked as such --> $DIR/blocking-calls-in-async.rs:5:1 | 5 | #[diagnostic::blocking] | ^^^^^^^^^^^^^^^^^^^^^^^ 6 | fn interesting() {} | ---------------- ```

compiler-errors · 2024-05-17T18:45:46Z

How does this interact with RFC #3014? Isn't must_not_suspend the same as diagnostic::blocking?

Would like to see that mentioned in the RFC -- if it's similar enough, then this RFC can just subsume that one.

estebank · 2024-05-17T18:51:36Z

The must_not_suspend lint seems tangentially related. That one is to ensure that a value doesn't go through a yield point, while this is so that we don't call std::thread::sleep in an async function.

davidbarsky · 2024-05-17T19:08:44Z

How does this interact with RFC #3014? Isn't must_not_suspend the same as diagnostic::blocking?

As Estaban said, they're kinda related. For example,tracing's Span::enter() returns a guard that is must_not_suspend, but it's not actually blocking.

compiler-errors · 2024-05-17T19:11:11Z

Yeah, I think the main distinction is that values must not suspend, but calls are blocking.

I am curious if these can be unified futher, but worst case must_not_suspend should likely be moved to the diagnostic namespace if and before it is stabilized.

estebank · 2024-05-17T19:24:08Z

I am curious if these can be unified futher

It might be that this annotation on functions could influence the logic of the must_not_suspend lint.

worst case must_not_suspend should likely be moved to the diagnostic namespace if and before it is stabilized.

I agree.

kennytm · 2024-05-18T03:15:42Z

what functions besides std::thread::sleep() can be reliably marked as #[diagnostic::blocking]?

Lint idea: find known-blocking constructs in async contexts rust-clippy#4377 mentioned println!(). Though in general the std::io can be put into non-blocking mode.
to_socket_addrs() is also mentioned, but the impl on (IpAddr, u16) / SocketAddr / &[SocketAddr] is definitely purely non-blocking. And then there is an impl on generic &T which may or may not be blocking depending on T...
Mutex and RwLock are also blocking but as suggested in tokio::sync::Mutex's doc one should use std::sync::Mutex when the critical section does not cross the .await point

kennytm · 2024-05-18T03:18:10Z

text/3639-diagnostic-blocking.md

+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+- What parts of the design do you expect to resolve through the RFC process before this gets merged?
+- What parts of the design do you expect to resolve through the implementation of this feature before stabilization?
+- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?


fill out or delete the placeholder text?

estebank · 2024-05-18T04:14:20Z

what functions besides std::thread::sleep() can be reliably marked as #[diagnostic::blocking]?

Effectively most of std::io and std::fs. Even println!() should likely be considered as a no-no from an async fn, but marking std::io::_print as blocking might be a bridge too far for some. Anything that performs IO of any kind in the stdlib should not be used and instead use tokio provided APIs.

weiznich · 2024-05-18T17:10:44Z

As a semi related note: As far as I remember one of the main motivations for the specification of diagnostic attribute namespaces was to outline a set of common rules for all attributes in this namespace, so that adding a new attribute to this namespace does not need a new RFC anymore.

nikomatsakis · 2024-05-21T08:21:45Z

I'm a big fan of this proposal. Using synchronous I/O from inside async functions is a common footman. It's worth noting that the same attribute would be useful to rayon and likely other parallel execution environments, as they too use a fixed size threadpool.

On Fri, May 17, 2024, at 2:42 PM, Esteban Kuber wrote: `#[diagnostic::blocking]` is a marker attribute for functions that are considered (by their author) to be a blocking operation, and as such shouldn't be invoked from an `async` function. `rustc`, `clippy` and other tools can use this signal to lint against calling these functions in `async` contexts. It would allow `rustc`, `clippy` and other tools to provide lints like `warning: async function `foo` can block --> $DIR/blocking-calls-in-async.rs:28:1 | 28 | async fn foo() { | -------------- 29 | interesting(); | ^^^^^^^^^^^^^`foo` is determined to block because it calls into `interesting` | note: `interesting` is considered to be blocking because it was explicitly marked as such --> $DIR/blocking-calls-in-async.rs:5:1 | 5 | #[diagnostic::blocking] | ^^^^^^^^^^^^^^^^^^^^^^^ 6 | fn interesting() {} | ----------------

`

…

Rendered <https://github.com/estebank/rfcs/blob/diagnostic-blocking/text/0000-diagnostic-blocking.md> You can view, comment on, or merge this pull request online at: #3639 Commit Summary • a3d24bb <a3d24bb> Add `#[diagnostic::blocking]` attribute File Changes (1 file <https://github.com/rust-lang/rfcs/pull/3639/files>) • *A* text/0000-diagnostic-blocking.md <https://github.com/rust-lang/rfcs/pull/3639/files#diff-07c20d9f37ddd58360b6b9ed610fea949cb1ffb81093f7363f9a4dbe3c4982c3> (82) Patch Links: • https://github.com/rust-lang/rfcs/pull/3639.patch • https://github.com/rust-lang/rfcs/pull/3639.diff — Reply to this email directly, view it on GitHub <#3639>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AABF4ZWF5OZ2KTYKBOOV27LZCZFP7AVCNFSM6AAAAABH4TSV7OVHI2DSMVQWIX3LMV43ASLTON2WKOZSGMYDGMZWHEYDGMA>. You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

SOF3 · 2024-05-23T03:59:22Z

We could be more explicit with the definition of blocking.

thread::sleep is obviously blocking
anything that leads to futex(2) is probably agreed to be blocking (they should try_lock().expect("unlocked) if they are expected not to)
anything that waits for the filesystem or network is blocking.
- so we agree that DNS resolution is blocking too, even if it might fetch from the system cache?
are CPU-bound operations non-blocking?
- what if it is e.g. waiting for another core or waiting for a graphic card?
what about memory allocation? might memory allocation block?
- if we write to program memory that is actually mmap'ed to a large file, it blocks a lot as well.
and probably every other syscall not mentioned here, the blocking behavior of which might even be platform-dependent.

Note that CPU-bound operations and memory access cannot be warned using this lint.

the8472 · 2024-05-23T04:14:51Z

what functions besides std::thread::sleep() can be reliably marked as #[diagnostic::blocking]?

Effectively most of std::io and std::fs. Even println!() should likely be considered as a no-no from an async fn, but marking std::io::_print as blocking might be a bridge too far for some. Anything that performs IO of any kind in the stdlib should not be used and instead use tokio provided APIs.

Using Read::read is perfectly non-blocking if you're doing polled IO. It's even explicitly supported in tokio via AsyncFd.

kennytm · 2024-05-23T04:52:00Z

Using Read::read is perfectly non-blocking if you're doing polled IO. It's even explicitly supported in tokio via AsyncFd.

I don't think you'll run the poll loop directly inside asynccontext though. You wrap it inside a poll_fn() closure which the blocking lint shouldn't touch (??).

kornelski · 2024-05-23T11:39:25Z

I like this idea, and I think it's valuable to add it even if it's used just for thread::sleep(), since this is a particularly bad footgun that novice users run into when they explore how async works.

However, there's a big gray area of what is blocking and isn't: https://blog.yoshuawuyts.com/what-is-blocking/

Maybe this diagnostic could support specifying a level of "severity", which could be configurable in clippy?

thread::sleep and thread.join() are very likely to be blocking, but println! may be too pedantic. io::Read and io::Write are "it depends", because they're non-blocking when used with Vec<u8>.

riking · 2024-05-23T18:33:13Z

File IO is definitely the category of functions that needs the most programmer education of "yes, this is actually blocking", due to the prevalence of both networked file systems and disk hardware errors leading to unbounded waits.

the8472 · 2024-05-23T21:53:36Z

needs the most programmer education of "yes, this is actually blocking"

That's quite generational, depending on whether one has exp

[seek head noises]

erienced what came before SSDs.

programmerjake · 2024-05-23T22:54:43Z

I expect some uses of Mutex and similar to not count as blocking, e.g. the following should not count imho:

pub struct AtomicId(Mutex<[u8; 32]>);

impl AtomicId {
    pub const fn new(v: [u8; 32]) -> Self {
        Self(Mutex::new(v))
    }
    pub fn load(&self) -> [u8; 32] {
        *self.0.lock().unwrap()
    }
    pub fn store(&self, v: [u8; 32]) {
        *self.0.lock().unwrap() = v;
    }
}

JarredAllen · 2024-05-23T23:29:55Z

Can we also put this information into rustdoc output? That would have saved me some headache in the past from crates that poorly document whether a function blocks or returns a WouldBlock error if there isn't anything yet.

estebank · 2024-05-24T16:04:47Z

I expect some uses of Mutex and similar to not count as blocking, e.g. the following should not count imho

In the way that I'm doing analysis today, that use would indeed be caught (I'm doing transitive analysis), but in my case I have a complementary annotation to mark an item as "assume non blocking" and in this proposal the only checks being made would be for direct usages, so your wrapper would fly under the radar (regardless of whether it is really blocking or not). The correct analysis for Mutex would instead be if the value from .lock() goes past a yield point, but that's more complex than what's been proposed here.

tmandry · 2024-05-24T16:59:50Z

The correct analysis for Mutex would instead be if the value from .lock() goes past a yield point, but that's more complex than what's been proposed here.

That's exactly what #[must_not_suspend] is for. Maybe this proposal shouldn't touch locks then.

kpreid · 2024-05-25T01:09:14Z

I'd love to have this annotation and any sort of analysis for it.

I expect some uses of Mutex and similar to not count as blocking,

IMO, the best approach to this problem is that Mutex::lock() is considered blocking, and usage of it from async (or other should-not-block code) code would include an attribute (an #[allow] or #[expect], perhaps) that effectively declares “I have made sure that this lock is held only momentarily” (much like an unsafe {} block, but for blocking analysis: “you can't prove this is OK, but I tell you it is”).

Similarly, filesystem and stdio operations must in general be considered blocking, but a specific application might hold the position “I don't care about the near-zero amount of blocking that will happen due to these very occasional bits of text I write to stderr under normal circumstances”.

(All of this is out of scope of the RFC — unless it somehow leads to a conclusion that blocking is too hopelessly ambiguous to introduce the attribute at all.)

The correct analysis for Mutex would instead be if the value from .lock() goes past a yield point

That would not catch all cases. If the mutex might be held by non-async code for a long time, then async code calling lock() would block. A Mutex is OK to lock from async code if it is only locked momentarily by all its users, regardless of whether those users are async or blocking code.

kpreid · 2024-05-25T01:28:14Z

text/3639-diagnostic-blocking.md

+- What parts of the design do you expect to resolve through the implementation of this feature before stabilization?
+- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?
+
+# Future possibilities


Another future possibility that I think is worth mentioning: adding syntax to the attribute to (extensibly) classify the kind or degree of blocking, so as to be able to resolve “is this really a blocking operation” edge cases by writing down “it is if you call this blocking” in some fashion.

#[diagnostic::blocking(acquires_lock)] #[diagnostic::blocking(filesystem)] #[diagnostic::blocking(stderr)] #[diagnostic::blocking(::rayon::block_on_task)]

A note on that last one: if you're using Rayon, you care a lot about the difference between “this might block on the completion of another task in the Rayon thread pool” (great when inside the pool due to being able to yield to other tasks using the Rayon-specific mechanism, but bad from an async function) and “this might block acquiring a lock” (very dangerous to do from in the pool, because it can lead to deadlocks, as well as more ordinary CPU starvation due to blocking inside a thread-per-core pool).

Much like async, if you're not already attuned to this class of issue, it can be easy to make a mistake by accident, and it's hard to audit for, so being able to use the same kind of “don't call X from Y because you could deadlock/starve” analysis but with a narrower classification would be quite useful.

programmerjake · 2024-05-25T03:15:30Z

one place where blocking should be allowed is in things like genawaiter, which is a library for making generators built on top of the async/await mechanism, where it's like having an iterator that may block, which is fine.
there should probably be a way to opt-out of all blocking diagnostics for an async fn/block but still produce lints if there's another async fn/block nested inside. (actually, i've wanted that kind of allow but not recursively for other lints too)

kornelski · 2024-05-25T23:41:04Z

To me uncontended Mutex isn't "blocking". There are many usage patterns where the critical section is small and quick.
I don't think it's feasible for Rust to know how long any mutex will take to lock. Warning about all of them may be too noisy.

kennytm · 2024-05-26T04:24:09Z

one place where blocking should be allowed is in things like genawaiter, which is a library for making generators built on top of the async/await mechanism, where it's like having an iterator that may block, which is fine.

The genawaiter crate is a hack to make up for the lack of gen block, it shouldn't influence the decision made on this RFC.

And given that genawaiter is used like gen!({ ... }) the expanded macro could easily be #[allow(blocking)] async { ... }.

programmerjake · 2024-05-26T04:45:00Z

And given that genawaiter is used like gen!({ ... }) the expanded macro could easily be #[allow(blocking)] async { ... }.

I'm saying there should be an attribute like #[allow(blocking)]:

#[nonrecursive_allow(blocking)]
async {
    blocking(); // won't warn
    foo(async {
        blocking(); // warns
    });
}

SOF3 · 2024-05-27T05:26:42Z

@kornelski technically an uncontended mutex should be .try_lock().unwrap() instead of .lock(). If you expect try_lock to fail, it means the mutex is not uncontended. It might be insignificantly blocking, but it is not uncontended.

kornelski · 2024-05-27T10:53:50Z

Nah, try_lock introduces unnecessary unreliability. I don't mean literally never blocked, but rather not having many threads trying to lock it, and having critical section quick enough, so that threads won't wait long for the lock.

In practice, blocking is only an issue when poll takes longer than a maximum latency an application will tolerate.
This is why there's a big gray area in what counts as blocking, because many synchronous operations could finish quickly enough, depending on how they're used, and what latency is allowed (e.g. in a server for a realtime multiplayer game 1ms may be long enough to be blocking, but a media file server could have 100ms hiccups without causing problems).

conradludgate · 2024-05-30T08:14:01Z

To add some further fuel: I am very much against Mutex::lock being 'blocking'. A mutex with a very short critical section will always be faster than an async lock like tokio's because tokio uses a std::sync::Mutex for the wait list:
https://docs.rs/tokio/1.37.0/src/tokio/sync/batch_semaphore.rs.html#36

If tokio considers a mutex to be fast enough, then so should we. It would be very annoying to have to annotate the many such uses in my codebases.

That being said, I do respect that many use cases of mutex in async code has the potential to be slow, and I am supportive of static analysis and lints to try and improve the status quo. However, even a hashmap realloc I have measured to be several ms. An async lock won't fix that if you do tokio::sync::Mutex<HashMap<K, V>> for your concurrent map.

Onto a slightly related but different topic, I'd like to see some investment into tools for debugging actual blocking in code. For instance, a watcher thread that monitors runtime threads, periodically polling them and sampling stack traces. If it detects that a thread has taken a while to yield, it can dump the stacktrace so you can debug uses of loop {} without any awaits.

SOF3 · 2024-05-30T10:53:07Z

what about we split them to different types of blocking functions (io, net, mutex) and just expose them as separate (maybe somehow parameterized) lint's?

the8472 · 2024-05-30T13:11:04Z

Those categories do not carve reality along its joints. File IO can be faster or slower than network IO. Mutexes depends on the critical section, contention, potential for priority inversion, ...
If you consider swap even a ptr::read can be blocking.

If you want to know if library calls are fast and compose well, then you need them to be

wait-free
constant-time regarding their inputs
have some latency estimate

Then you can add up all the constant latencies and see if they are below your responsiveness goal for a feature of your application.

If things are not constant-time then you can quickly get O(n²) behavior due to loops. If you do not know the expected latency of calls then all that nice constant-timeness is useless if the constants are huge. If they aren't wait-free then you run into starvation situations based on system state.

Making low-latency code legible to the compiler is hard. Perhaps there's some prior art in hard realtime programming that we can look at?

JarredAllen · 2024-05-30T17:00:14Z

For me personally, I'd prefer if mutexes were listed as blocking. If a mutex guards a short enough critical section, so then (using the lint reasons feature that's about to become stable) there will be a place where I can document that assumption in the code:

async fn something_with_mutex() {
    static SHARED_MUTEX: Mutex<_> = ..;

    #[allow(
        blocking_in_async,
        reason = "This mutex only guards `do_something` calls, which finish quickly enough."
    )]
    SHARED_MUTEX.lock().do_something();
}

To me, a good part of the Rust language is the many ways in which it forces me to do my homework and show that what I'm doing is correct, and this feels like another way to do that.

lolbinarycat · 2024-06-10T03:23:10Z

Perhaps there's some prior art in hard realtime programming that we can look at?

i feel like once your latency requirements become strict enough the only solution is nonlocal control flow (signals, interrupts, or preemptive multitasking)

and to add some fuel to the "what does blocking even mean" fire, is std::thread::sleep(Duration::from_nanos(1)) a blocking operation?

i think trying to estimate performance using static analysis is outside the scope of the compiler, and would be better suited with dynamic analysis (measuring time between waits)

is there even a way to manually yield an async fn when doing long cpu bound processing like keygen? it seems like you would want to await a future that returns "Pending" exactly once.

conradludgate · 2024-06-10T04:29:13Z

is there even a way to manually yield an async fn when doing long cpu bound processing like keygen? it seems like you would want to await a future that returns "Pending" exactly once.

Yes. Tokio offers a yield_now function which is very easy to remake. It calls wake immediately and returns pending once. I have previously used that during password hashing, for instance: https://github.com/neondatabase/neon/blob/a5ecca976ec3abf97c8c3db4f78c230b4553b509/proxy/src/scram/exchange.rs#L98

Alternatively tokio has a concept of a "cooperative budget", which is not yet stable api tokio-rs/tokio#6622, but it will only return pending if the budget is fully consumed. Budget is acquired by most tokio functions like reading data from a TcpStream or sending into a channel (both of these otherwise could always be immediately ready in a loop and never yield)

fintelia · 2024-06-11T20:56:54Z

text/3639-diagnostic-blocking.md

+The final version would be to implement a call-graph analyisis lint so that transitive calls to blockign operations also produce a warning, but this is explicitly out of scope of this RFC.
+
+# Drawbacks
+[drawbacks]: #drawbacks


Another potential drawback is that, um, "enthusiastic" contributors might make unwelcome PRs against a large number of crates adding this annotation to basically every method. Thoughtful documentation/messaging around the feature might be able to mitigate that risk

Any thoughts on a warning for unnecessarily applying #[diagnostic::blocking] on functions that can already be inferred to be blocking? That could help reduce some noise as well.

HyeonuPark · 2024-07-17T01:39:08Z

Prior discussion from IRLO based on runtime detection and panicking.

estebank force-pushed the diagnostic-blocking branch from a3d24bb to d3afb4c Compare May 17, 2024 18:43

estebank mentioned this pull request May 17, 2024

Lint idea: find known-blocking constructs in async contexts rust-lang/rust-clippy#4377

Open

estebank added the T-lang Relevant to the language team, which will review and decide on the RFC. label May 18, 2024

estebank changed the title ~~Add #[diagnostic::blocking] attribute~~ [RFC] Add #[diagnostic::blocking] attribute May 18, 2024

kennytm reviewed May 18, 2024

View reviewed changes

kpreid reviewed May 25, 2024

View reviewed changes

weiznich mentioned this pull request May 30, 2024

[RFC] On_unimplemented_trait_use #3643

Open

1 task

traviscross added the WG-async Relevant to the async working group. label Jun 10, 2024

fintelia reviewed Jun 11, 2024

View reviewed changes

[RFC] Add #[diagnostic::blocking] attribute #3639

Are you sure you want to change the base?

[RFC] Add #[diagnostic::blocking] attribute #3639

Conversation

estebank commented May 17, 2024 • edited Loading

compiler-errors commented May 17, 2024

estebank commented May 17, 2024

davidbarsky commented May 17, 2024

compiler-errors commented May 17, 2024

estebank commented May 17, 2024

kennytm commented May 18, 2024

kennytm May 18, 2024

Choose a reason for hiding this comment

estebank commented May 18, 2024

weiznich commented May 18, 2024

nikomatsakis commented May 21, 2024 via email

SOF3 commented May 23, 2024 • edited Loading

the8472 commented May 23, 2024

kennytm commented May 23, 2024

kornelski commented May 23, 2024

riking commented May 23, 2024

the8472 commented May 23, 2024

programmerjake commented May 23, 2024 • edited Loading

JarredAllen commented May 23, 2024

estebank commented May 24, 2024

tmandry commented May 24, 2024

kpreid commented May 25, 2024

kpreid May 25, 2024

Choose a reason for hiding this comment

programmerjake commented May 25, 2024

kornelski commented May 25, 2024

kennytm commented May 26, 2024

programmerjake commented May 26, 2024

SOF3 commented May 27, 2024 • edited Loading

kornelski commented May 27, 2024

conradludgate commented May 30, 2024 • edited Loading

SOF3 commented May 30, 2024

the8472 commented May 30, 2024

JarredAllen commented May 30, 2024

lolbinarycat commented Jun 10, 2024

conradludgate commented Jun 10, 2024

fintelia Jun 11, 2024

Choose a reason for hiding this comment

SOF3 Jun 12, 2024

Choose a reason for hiding this comment

HyeonuPark commented Jul 17, 2024

[RFC] Add `#[diagnostic::blocking]` attribute #3639

[RFC] Add `#[diagnostic::blocking]` attribute #3639

estebank commented May 17, 2024 •

edited

Loading

SOF3 commented May 23, 2024 •

edited

Loading

programmerjake commented May 23, 2024 •

edited

Loading

SOF3 commented May 27, 2024 •

edited

Loading

conradludgate commented May 30, 2024 •

edited

Loading