perf(transport): don't pre-allocate mtu on max_datagram_size #2086
`neqo_transport::Connection::max_datagram_size` creates an `Encoder`, writes a packet header and a packet number, and determines how many bytes of the mtu are left.

https://github.com/mozilla/neqo/blob/28f60bd0ba3209ecba4102eec123859a3a8afd45/neqo-transport/src/connection/mod.rs#L3408-L3427

The `Encoder` only has to hold the packet header and the packet number. Yet it is initialized with `Encoder::with_capacity(mtu)`.

https://github.com/mozilla/neqo/blob/28f60bd0ba3209ecba4102eec123859a3a8afd45/neqo-transport/src/connection/mod.rs#L3408

Note that `PacketBuilder::short` and `PacketBuilder::long` read the `Encoder::capacity` through `PacketBuilder::infer_limit`. But `PacketBuilder::infer_limit` falls back to `2048` if the capacity is below `64`, which will be the case when using `Encoder::default()` instead of `Encoder::with_capacity(mtu)`. `2048` should be more than enough for the packet header and the packet number.

https://github.com/mozilla/neqo/blob/28f60bd0ba3209ecba4102eec123859a3a8afd45/neqo-transport/src/packet/mod.rs#L152-L180

https://github.com/mozilla/neqo/blob/28f60bd0ba3209ecba4102eec123859a3a8afd45/neqo-transport/src/packet/mod.rs#L188-L225

https://github.com/mozilla/neqo/blob/28f60bd0ba3209ecba4102eec123859a3a8afd45/neqo-transport/src/packet/mod.rs#L135-L141

This commit prevents the wasted allocation by using `Encoder::default()` instead of `Encoder::with_capacity(mtu)`.
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

Coverage Diff

| | main | #2086 |
|---|---|---|
| Coverage | 95.35% | 95.35% |
| Files | 112 | 112 |
| Lines | 36335 | 36335 |
| Hits | 34648 | 34648 |
| Misses | 1687 | 1687 |
Benchmark results

Performance differences relative to d513712.

- coalesce_acked_from_zero 1+1 entries: No change in performance detected.
  time: [98.912 ns 99.221 ns 99.533 ns]
  change: [-0.6624% -0.2427% +0.2050%] (p = 0.28 > 0.05)
- coalesce_acked_from_zero 3+1 entries: No change in performance detected.
  time: [116.71 ns 117.06 ns 117.44 ns]
  change: [-7.1034% -2.5302% +0.1590%] (p = 0.28 > 0.05)
- coalesce_acked_from_zero 10+1 entries: No change in performance detected.
  time: [116.60 ns 117.17 ns 117.81 ns]
  change: [-0.5054% -0.0210% +0.4710%] (p = 0.94 > 0.05)
- coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
  time: [97.471 ns 97.578 ns 97.698 ns]
  change: [-1.7988% -0.1792% +1.1365%] (p = 0.84 > 0.05)
- RxStreamOrderer::inbound_frame(): Change within noise threshold.
  time: [111.49 ms 111.61 ms 111.77 ms]
  change: [+0.2977% +0.4140% +0.5750%] (p = 0.00 < 0.05)
- transfer/pacing-false/varying-seeds: No change in performance detected.
  time: [26.516 ms 27.702 ms 28.914 ms]
  change: [-5.3154% +0.3008% +6.1247%] (p = 0.92 > 0.05)
- transfer/pacing-true/varying-seeds: No change in performance detected.
  time: [33.422 ms 35.001 ms 36.607 ms]
  change: [-12.155% -5.9755% +0.5737%] (p = 0.08 > 0.05)
- transfer/pacing-false/same-seed: No change in performance detected.
  time: [26.032 ms 26.871 ms 27.705 ms]
  change: [-4.2351% -0.0324% +4.5431%] (p = 0.99 > 0.05)
- transfer/pacing-true/same-seed: No change in performance detected.
  time: [41.561 ms 43.521 ms 45.509 ms]
  change: [-2.4773% +4.1611% +11.082%] (p = 0.22 > 0.05)
- 1-conn/1-100mb-resp (aka. Download)/client: No change in performance detected.
  time: [113.00 ms 113.54 ms 114.16 ms]
  thrpt: [875.97 MiB/s 880.77 MiB/s 884.94 MiB/s]
  change: time: [-1.0371% -0.3431% +0.3115%] (p = 0.34 > 0.05)
  thrpt: [-0.3106% +0.3442% +1.0480%]
- 1-conn/10_000-parallel-1b-resp (aka. RPS)/client: No change in performance detected.
  time: [313.38 ms 317.07 ms 320.68 ms]
  thrpt: [31.183 Kelem/s 31.538 Kelem/s 31.910 Kelem/s]
  change: time: [-2.0956% -0.4070% +1.2455%] (p = 0.63 > 0.05)
  thrpt: [-1.2302% +0.4087% +2.1404%]
- 1-conn/1-1b-resp (aka. HPS)/client: No change in performance detected.
  time: [33.693 ms 33.906 ms 34.133 ms]
  thrpt: [29.297 elem/s 29.493 elem/s 29.680 elem/s]
  change: time: [-1.2479% -0.3761% +0.5256%] (p = 0.42 > 0.05)
  thrpt: [-0.5228% +0.3775% +1.2637%]

Client/server transfer results

Transfer of 33554432 bytes over loopback.
`neqo_transport::Connection::max_datagram_size` creates an `Encoder`, writes a packet header and a packet number, and determines how many bytes of the mtu are left.

neqo/neqo-transport/src/connection/mod.rs, lines 3408 to 3427 in 28f60bd
The `Encoder` only has to hold the packet header and the packet number. Yet it is initialized with `Encoder::with_capacity(mtu)`, where `mtu` can be up to `65535` bytes.

neqo/neqo-transport/src/connection/mod.rs, line 3408 in 28f60bd
Note that `PacketBuilder::short` and `PacketBuilder::long`, called by `Self::build_packet_header`, read the `Encoder::capacity` through `PacketBuilder::infer_limit`.

neqo/neqo-transport/src/packet/mod.rs, lines 152 to 180 in 28f60bd

neqo/neqo-transport/src/packet/mod.rs, lines 188 to 225 in 28f60bd
But `PacketBuilder::infer_limit` falls back to `2048` if the capacity is below `64`, which will be the case when using `Encoder::default()` instead of `Encoder::with_capacity(mtu)`. `2048` should be more than enough for the packet header and the packet number.

neqo/neqo-transport/src/packet/mod.rs, lines 135 to 141 in 28f60bd
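The fallback described above can be sketched roughly as follows. This is a standalone illustration, not the code from `packet/mod.rs`: the free function `infer_limit`, the constant names, and the handling of a capacity of exactly `64` are assumptions; only the `64` threshold and the `2048` fallback come from the description.

```rust
// Hypothetical stand-in for the limit inference described in this PR.
// The constant names are illustrative, not taken from neqo.
const MIN_CAPACITY: usize = 64;
const FALLBACK_LIMIT: usize = 2048;

/// Infers a packet size limit from a buffer capacity.
/// Capacities below `MIN_CAPACITY` are treated as "no useful hint"
/// and replaced by `FALLBACK_LIMIT`. What happens at exactly 64 is an
/// assumption here; the real code may draw the boundary differently.
fn infer_limit(capacity: usize) -> usize {
    if capacity < MIN_CAPACITY {
        FALLBACK_LIMIT
    } else {
        capacity
    }
}

/// Limit seen when the buffer is an empty, default-constructed `Vec`.
fn limit_for_default_buffer() -> usize {
    let buf: Vec<u8> = Vec::default();
    infer_limit(buf.capacity())
}
```

With an empty buffer the capacity is `0`, so the inferred limit is the `2048` fallback, which is the case the PR relies on.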
This commit prevents the wasted allocation by using `Encoder::default()` instead of `Encoder::with_capacity(mtu)`. The former is backed by an empty `Vec`.

Feel free to ignore if you don't think the reduction in memory allocation is worth the complexity in reasoning described above.
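To illustrate the allocation difference with plain standard-library types, here is a minimal sketch. It uses `Vec<u8>` directly instead of neqo's `Encoder`, and the helper names and the 20-byte header length are made up for the example.

```rust
/// Capacity of an empty, default-constructed byte buffer.
/// `Vec::default()` does not allocate, so this is 0.
fn default_capacity() -> usize {
    Vec::<u8>::default().capacity()
}

/// Capacity of a buffer pre-allocated for a full datagram.
/// `Vec::with_capacity(n)` reserves room for at least `n` bytes up front.
fn preallocated_capacity(mtu: usize) -> usize {
    Vec::<u8>::with_capacity(mtu).capacity()
}

/// Writes a hypothetical header of `header_len` bytes into a default buffer
/// and returns how many bytes were written. The buffer only grows to fit
/// what is actually written, rather than reserving the whole mtu.
fn write_header_into_default(header_len: usize) -> usize {
    let mut buf: Vec<u8> = Vec::default();
    buf.extend(std::iter::repeat(0u8).take(header_len));
    buf.len()
}
```

For an `mtu` of `65535`, `preallocated_capacity` reserves at least 65535 bytes, while the default buffer starts at zero and grows only as the header bytes are written.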