You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In the context of da62e65, pay attention to this code:
// src/runnable.rs:515// Allocate large futures on the heap.let ptr = if mem::size_of::<Fut>() >= 2048{let future = |meta| {let future = future(meta);Box::pin(future)};RawTask::<_,Fut::Output,S,M>::allocate(future, schedule,self)}else{RawTask::<Fut,Fut::Output,S,M>::allocate(future, schedule,self)};
The future is boxed whenever its size is greater or equal to 2048 bytes. Comment tries to elaborate on this condition but it seems not to notice that RawTask::allocate already places the future on the heap next to the header. I don't see any reason to additionally box the future even if it is large by itself.
However one might argue that running future shares space with its output result after the completion, thus boxed future would deallocate the space which that future would have occupied as soon as it completes, possibly saving some space until its Task object is consumed in some way. Then however the right condition would be to compare future's size with its output type's size, to make sure there is a substantial difference between them to save on:
Even then that approach would be unsubstantiated, due to not having data to show any practical benefit. There could be not enough delay while completed task is held in memory to cause substantial memory usage cost.
It also introduces an additional pointer indirection which cannot be avoided, even if user would have liked to do so. However in case this branch is removed, user would be able to box up the future on their side to reach the old behavior. Given this library's purpose is to use it to "create a custom async executor", I am inclined to believe latter choice would be more appropriate, while this behavior and possible optimization should probably be noticed in the documentation as well.
I've even made some simple benchmarks to check how the code would perform with and without the Box::pin branch:
That would be good, but I only can see the opposite being true. Particularly if you change data array size from 0x4_0000 to 0x10_0000 from aforementioned benchmarks you'll get the stack overflow until you actually delete that branch. This indicates that the optimizer is able to perform in-place initialization, even if it doesn't use Box type, which does make sense considering there's no significant difference for LLVM. However it also seem like that the optimizer struggles with the added complexity from this if branch.
There could have been a possibility that this change did help to avoid stack overflow, considering it did not create a new closure back then change was introduced, but called Box::pin directly inside of the discussed branch (see d3011bf). Even if that was true (besides that nobody paid attention or got any stack overflows I've encountered), you can make a conclusion this feature is not functional.
In the context of da62e65, pay attention to this code:
The future is boxed whenever its size is greater or equal to 2048 bytes. Comment tries to elaborate on this condition but it seems not to notice that
RawTask::allocate
already places the future on the heap next to the header. I don't see any reason to additionally box the future even if it is large by itself.However one might argue that running future shares space with its output result after the completion, thus boxed future would deallocate the space which that future would have occupied as soon as it completes, possibly saving some space until its
Task
object is consumed in some way. Then however the right condition would be to compare future's size with its output type's size, to make sure there is a substantial difference between them to save on:Even then that approach would be unsubstantiated, due to not having data to show any practical benefit. There could be not enough delay while completed task is held in memory to cause substantial memory usage cost.
It also introduces an additional pointer indirection which cannot be avoided, even if user would have liked to do so. However in case this branch is removed, user would be able to box up the future on their side to reach the old behavior. Given this library's purpose is to use it to "create a custom async executor", I am inclined to believe latter choice would be more appropriate, while this behavior and possible optimization should probably be noticed in the documentation as well.
I've even made some simple benchmarks to check how the code would perform with and without the
Box::pin
branch:I have found no noticeable difference, while roughly accounting for instability of benchmark results.
This issue also relates to #66, which highlights one of drawbacks of this decision.
The text was updated successfully, but these errors were encountered: