
Performance regression in 0.18 (at least compared to 0.12) #357

Closed
peixian opened this issue Feb 22, 2024 · 13 comments

@peixian

peixian commented Feb 22, 2024

Adding this in case other people run into it. I'm still trying to figure out what the root cause is, but it seems like configuring the pool size affects runtime latency on the newest version.

On 0.12.0, we were running with 300 connections to each Scylla node underneath. On 0.18.1, I tried it with 2 connections per node, as well as 300 connections per node in pool_size.

0.12.0, 300 connections:

Operating System: Linux
CPU Information: Intel(R) Xeon(R) CPU @ 2.80GHz
Number of Available Cores: 4
Available memory: 7.77 GB
Elixir 1.16.1
Erlang 26
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 28 s

Benchmarking lines by game sharded 100 - 0.12.0 - 300 ...
Benchmarking lines single - 0.12.0 - 300 ...
Benchmarking lines specific - 0.12.0 - 300 ...
Benchmarking user single - 0.12.0 - 300 ...
Calculating statistics...
Formatting results...

Name                                                             ips        average  deviation         median         99th %
SELECT * FROM L WHERE id='123' - 0.18.1 - 300                    2.45 K        0.41 ms    ±38.85%        0.40 ms        0.70 ms
SELECT * FROM L LIMIT 1 - 0.18.1 - 300                           1.66 K        0.60 ms    ±36.74%        0.56 ms        1.34 ms
SELECT * FROM U LIMIT 1 - 0.18.1 - 300                           0.67 K        1.49 ms    ±27.70%        1.41 ms        2.43 ms
SELECT * FROM materialized_view LIMIT 1000 - 0.18.1 - 300        0.36 K        2.77 ms    ±28.29%        2.68 ms        4.69 ms

Comparison:
SELECT * FROM L WHERE id='123' - 0.18.1 - 300                    2.45 K
SELECT * FROM L LIMIT 1 - 0.18.1 - 300                           1.66 K - 1.48x slower +0.194 ms
SELECT * FROM U LIMIT 1 - 0.18.1 - 300                           0.67 K - 3.66x slower +1.08 ms
SELECT * FROM materialized_view LIMIT 1000 - 0.18.1 - 300        0.36 K - 6.78x slower +2.36 ms

0.18.1, 300 connections:

Name                                                                ips        average  deviation         median         99th %
SELECT * FROM L WHERE id='123' - 0.18.1 - 300                    1.18 K        0.85 ms    ±35.47%        0.80 ms        1.63 ms
SELECT * FROM L LIMIT 1 - 0.18.1 - 300                           1.01 K        0.99 ms    ±32.83%        0.95 ms        1.73 ms
SELECT * FROM U LIMIT 1 - 0.18.1 - 300                           0.53 K        1.90 ms    ±29.20%        1.80 ms        4.61 ms
SELECT * FROM materialized_view LIMIT 1000 - 0.18.1 - 300        0.30 K        3.35 ms    ±16.94%        3.27 ms        5.68 ms

Comparison:
SELECT * FROM L WHERE id='123' - 0.18.1 - 300                    1.18 K
SELECT * FROM L LIMIT 1 - 0.18.1 - 300                           1.01 K - 1.17x slower +0.147 ms
SELECT * FROM U LIMIT 1 - 0.18.1 - 300                           0.53 K - 2.25x slower +1.05 ms
SELECT * FROM materialized_view LIMIT 1000 - 0.18.1 - 300        0.30 K - 3.97x slower +2.51 ms

0.18.1, 5 connections:

Name                                                              ips        average  deviation         median         99th %
SELECT * FROM L WHERE id='123' - 0.18.1 - 5                    1.39 K        0.72 ms    ±47.16%        0.68 ms        1.34 ms
SELECT * FROM L LIMIT 1 - 0.18.1 - 5                           1.17 K        0.85 ms    ±43.76%        0.81 ms        1.41 ms
SELECT * FROM U LIMIT 1 - 0.18.1 - 5                           0.62 K        1.60 ms    ±29.08%        1.54 ms        2.42 ms
SELECT * FROM materialized_view LIMIT 1000 - 0.18.1 - 5        0.35 K        2.82 ms    ±16.92%        2.79 ms        4.01 ms

Comparison:
SELECT * FROM L WHERE id='123' - 0.18.1 - 5                    1.39 K
SELECT * FROM L LIMIT 1 - 0.18.1 - 5                           1.17 K - 1.19x slower +0.135 ms
SELECT * FROM U LIMIT 1 - 0.18.1 - 5                           0.62 K - 2.22x slower +0.88 ms
SELECT * FROM materialized_view LIMIT 1000 - 0.18.1 - 5        0.35 K - 3.92x slower +2.10 ms

I also reordered the benchmarks and ran them multiple times in random order to make sure this wasn't some caching effect.

Configuration:

[
  nodes: [<NODES>],
  autodiscovery: false,
  pool_size: 300, # or 2 in the case of tests.
]
@peixian changed the title from "0.18.1 is slower than 0.12.0" to "0.18.1 appears to be slower than 0.12.0" on Feb 22, 2024
@peixian
Author

peixian commented Feb 23, 2024

Looking into this a little bit more, I originally thought it was https://github.com/whatyouhide/xandra/blob/main/lib/xandra/cluster/connection_pool.ex#L14-L20, which would explain the query time scaling with pool size, but that doesn't seem to be the case. Even setting the pool size to 1 results in ~0.7 ms for the simple SELECT case.

I pulled some perf graphs (right-click, download, and open in a new tab for zooming in; the SVG is interactive), but I suspect something else is happening.

Xandra 0.12.0:
[flame graph attachment: xandra-0.12.0.svg]

Xandra 0.18.1:
[flame graph attachment: xandra-0.18.1.svg]

@whatyouhide changed the title from "0.18.1 appears to be slower than 0.12.0" to "Performance regression in 0.18 (at least compared to 0.12)" on Feb 26, 2024
@whatyouhide
Owner

Ah, this is super interesting, thank you @peixian! Fantastic report and investigation. 💟

I’m no expert in flame graphs, but from what I can tell a lot of the time in 0.18 is spent just taking the next (random) ID out of the set of free stream IDs. Basically, this code:

    {stream_id, data} =
      get_and_update_in(data.free_stream_ids, fn ids ->
        id = Enum.at(ids, 0)
        {id, MapSet.delete(ids, id)}
      end)

I don't think we can get this to go much faster with the current data structures (a set of free stream IDs), but we could definitely store the IDs as a list and just take the head of the list every time, which should be significantly cheaper to do. Do you want to give that a try and measure perf for that?
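
For concreteness, here is a minimal, hypothetical sketch of the list-based bookkeeping (the module and function names below are made up for illustration; the real change would live in the connection's state, not a separate module):

    # Hypothetical sketch, not Xandra's actual implementation: keep the free stream
    # IDs as a list so checking one out is a constant-time head pop rather than
    # Enum.at/2 + MapSet.delete/2 on a set.
    defmodule FreeStreamIDs do
      # Build the initial pool of free IDs, e.g. FreeStreamIDs.new(1..100).
      def new(range), do: Enum.to_list(range)

      # Checking an ID out takes the head of the list in O(1).
      def checkout([id | rest]), do: {id, rest}

      # Checking an ID back in prepends it in O(1).
      def checkin(free_ids, id), do: [id | free_ids]
    end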

@harunzengin
Contributor

harunzengin commented Feb 26, 2024

@whatyouhide @peixian, I also noticed the performance problem while fetching random IDs. I solved it using Enum.random/1 on a range in 3cedab5, which should run in constant time. It is part of #355

    # Draw random IDs until one is neither timed out nor currently in flight.
    defp random_free_stream_id(in_flight_requests, timed_out_ids) do
      random_id = Enum.random(1..@max_cassandra_stream_id)

      if MapSet.member?(timed_out_ids, random_id) or Map.has_key?(in_flight_requests, random_id) do
        random_free_stream_id(in_flight_requests, timed_out_ids)
      else
        random_id
      end
    end

This is what I'm referring to: https://elixirforum.com/t/complexity-of-enum-functions/10493/3

@whatyouhide
Owner

Getting random IDs until you hit a free one works for now, but it's going to perform worse the fewer free IDs you have. Later on, let's go with the list, but let's do one thing at a time to avoid cluttering too much 🙃

@peixian
Author

peixian commented Feb 29, 2024

@harunzengin I just benchmarked your fix; we're a lot closer now.

0.12.0 - 300 connections. First chunk is non-prepared statements, second chunk is prepared statements.

Name                                                                ips        average  deviation         median         99th %
SELECT * FROM L WHERE id='123' - 0.12.0 - 300                    2.33 K        0.43 ms    ±65.27%        0.39 ms        1.22 ms
SELECT * FROM L LIMIT 1 - 0.12.0 - 300                           1.37 K        0.73 ms    ±72.25%        0.60 ms        3.32 ms
SELECT * FROM U LIMIT 1 - 0.12.0 - 300                           0.70 K        1.42 ms    ±57.77%        1.25 ms        6.53 ms
SELECT * FROM materialized_view LIMIT 1000 - 0.12.0 - 300        0.41 K        2.47 ms    ±50.71%        2.33 ms        7.24 ms

Name                                                                           ips        average  deviation         median         99th %
SELECT * FROM L WHERE id='123' - prepared - 0.12.0 - 300                    2.74 K        0.37 ms    ±69.25%        0.33 ms        1.55 ms
SELECT * FROM L LIMIT 1 - prepared - 0.12.0 - 300                           1.88 K        0.53 ms    ±60.38%        0.46 ms        1.71 ms
SELECT * FROM U LIMIT 1 - prepared - 0.12.0 - 300                           0.77 K        1.29 ms    ±62.66%        1.12 ms        4.91 ms
SELECT * FROM materialized_view LIMIT 1000 - prepared - 0.12.0 - 300        0.38 K        2.63 ms    ±70.90%        2.26 ms       14.13 ms

0.18.1 - 300 connections

Name                                                                      ips        average  deviation         median         99th %
SELECT * FROM L WHERE id='123' - 0.18.1-patch - 300                   1318.97        0.76 ms    ±68.82%        0.68 ms        2.46 ms
SELECT * FROM U LIMIT 1 - 0.18.1-patch - 300                           641.62        1.56 ms    ±47.71%        1.43 ms        4.18 ms
SELECT * FROM L LIMIT 1 - 0.18.1-patch - 300                           572.80        1.75 ms    ±95.20%        1.18 ms        9.39 ms
SELECT * FROM materialized_view LIMIT 1000 - 0.18.1-patch - 300        338.01        2.96 ms    ±50.44%        2.72 ms        9.02 ms

Name                                                                                 ips        average  deviation         median         99th %
SELECT * FROM L WHERE id='123' - prepared - 0.18.1-patch - 300                    1.56 K        0.64 ms    ±38.78%        0.61 ms        1.44 ms
SELECT * FROM L LIMIT 1 - prepared - 0.18.1-patch - 300                           1.19 K        0.84 ms    ±40.93%        0.78 ms        2.59 ms
SELECT * FROM U LIMIT 1 - prepared - 0.18.1-patch - 300                           0.67 K        1.49 ms    ±44.91%        1.36 ms        3.78 ms
SELECT * FROM materialized_view LIMIT 1000 - prepared - 0.18.1-patch - 300        0.26 K        3.88 ms   ±118.73%        2.72 ms       31.30 ms

0.18.1 - 5 connections

Name                                                                    ips        average  deviation         median         99th %
SELECT * FROM L WHERE id='123' - 0.18.1-patch - 5                    1.70 K        0.59 ms    ±43.41%        0.55 ms        1.65 ms
SELECT * FROM L LIMIT 1 - 0.18.1-patch - 5                           1.37 K        0.73 ms    ±89.78%        0.59 ms        4.11 ms
SELECT * FROM U LIMIT 1 - 0.18.1-patch - 5                           0.64 K        1.57 ms    ±48.50%        1.41 ms        5.41 ms
SELECT * FROM materialized_view LIMIT 1000 - 0.18.1-patch - 5        0.33 K        3.04 ms    ±81.88%        2.55 ms       21.62 ms

Name                                                                               ips        average  deviation         median         99th %
SELECT * FROM L WHERE id='123' - prepared - 0.18.1-patch - 5                    1.74 K        0.57 ms    ±45.96%        0.52 ms        1.98 ms
SELECT * FROM L LIMIT 1 - prepared - 0.18.1-patch - 5                           1.52 K        0.66 ms    ±68.98%        0.57 ms        3.38 ms
SELECT * FROM U LIMIT 1 - prepared - 0.18.1-patch - 5                           0.52 K        1.92 ms    ±83.37%        1.45 ms       10.02 ms
SELECT * FROM materialized_view LIMIT 1000 - prepared - 0.18.1-patch - 5        0.30 K        3.34 ms    ±83.52%        2.64 ms       20.77 ms

However, large pool sizes still incur a pretty significant slowdown, along with noticeably lower throughput (1.7k rps drops to 1.3k as the pool size goes from 5 to 300). I'm testing some fixes in connection selection; I suspect the remaining cost is somewhere in there.

@whatyouhide
Owner

@peixian I don't know how you're testing this, but one important thing to keep in mind is this: the change in 0.18 was done to dramatically increase concurrent performance on a single connection.

Before, one connection could only perform requests serially (though yes, each individual request was cheaper, so serial throughput was higher).

Now, a single connection can make multiple concurrent requests. So, an ideal comparison would be something like this:

  • On 0.12, open say 50 connections and spawn 50 tasks that make 10k requests each.
  • On 0.18, do the same (50 connections) and see the difference.
  • On 0.18, do the same but with say 10 open connections, to see if the reduction in conns can match the throughput of the more conns in 0.12.
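
As a hedged sketch of what such a comparison could look like (the module, `conn`, and the statement below are illustrative placeholders; `conn` is assumed to be a connection or pool started elsewhere):

    # Hypothetical harness: spawn `tasks` concurrent callers, each issuing
    # `requests_per_task` queries, and report aggregate throughput.
    defmodule ConcurrencyCheck do
      def run(conn, tasks \\ 50, requests_per_task \\ 10_000) do
        {micros, :ok} =
          :timer.tc(fn ->
            1..tasks
            |> Task.async_stream(
              fn _ ->
                for _ <- 1..requests_per_task do
                  Xandra.execute!(conn, "SELECT * FROM L LIMIT 1")
                end
              end,
              max_concurrency: tasks,
              timeout: :infinity
            )
            |> Stream.run()
          end)

        total = tasks * requests_per_task
        IO.puts("#{total} requests in #{micros / 1_000_000}s (~#{round(total * 1_000_000 / micros)} rps)")
      end
    end

Running this against 0.12 with 50 connections, and against 0.18 with both 50 and 10 connections, would give the comparison described above.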

@peixian
Author

peixian commented Mar 3, 2024

@whatyouhide Ah yeah, moving to multiplexed connections is the primary reason we're upgrading to 0.18.1: we're hitting cluster-level connection limits on Scylla.

We won't need 300 connections, but I suspect we'll still need a significant number (~50-100), so figuring out the scaling cost of each additional connection will likely be important for us later on.

I'm benching this with Benchee, running different SELECTs for 5 seconds at maximum throughput. At least for us, individual query latency is less important than the raw throughput a node can handle.

I can measure individual query performance as well once I figure out a bit more about the pool scaling issue. I'm also waiting on Scylla to get back to me on whether they have info on stream_ids not returning, for #356

@whatyouhide
Owner

I'm benching this with Benchee, running different SELECTs for 5 seconds at maximum throughput

The question is whether you're benching this with parallel: N in Benchee (N > 1). If you don't bench in parallel, the pool size won't even matter, because you'll check out a conn to run the SELECT and return it to the pool before the next one starts.
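
For example, a minimal Benchee job along those lines might look like this (hedged sketch: `conn` and the statement are placeholders for the setup used in the reports above):

    # Run the same SELECT from many parallel processes so that several connections
    # in the pool are actually checked out at the same time.
    Benchee.run(
      %{
        "SELECT * FROM L LIMIT 1" => fn -> Xandra.execute!(conn, "SELECT * FROM L LIMIT 1") end
      },
      warmup: 2,
      time: 5,
      parallel: 50
    )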

@whatyouhide
Owner

@peixian ping?

@peixian
Author

peixian commented Mar 28, 2024

@whatyouhide We're actively testing the latest changes now. On our production workload, average query time has gone from 0.4 ms to 0.6 ms, which matches what the benchmarks showed. I think this is a slight regression, but the latest changes seem promising. Later this week I'm going to try dialing down the max concurrent requests per connection and see whether that changes anything.

@kennethito
Contributor

I have another facet of the performance regressions in 0.18. Let me know if this should be a separate issue.

The performance of large result sets has worsened significantly. One of the queries I'm looking at is about 100% slower, going from 400 ms to 800 ms.

I believe this is because the new connection code uses active-mode sockets and by default receives roughly a single packet of data per message (byte_size/1 says about 1400 bytes). For each of these tiny messages, we append the data to the buffer and try to parse a frame from it.

I experimented a bit, and while I don't know if this is the right way to solve it, setting the buffer option in the transport options to some large number fixes the performance degradation.

For instance

    Transport.setopts(data.transport, active: :once, buffer: 50000)

Thoughts?
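
If Xandra's :transport_options start option forwards these socket options (hedged: the node address and buffer value below are purely illustrative), a user-side workaround could look like:

    # Hypothetical workaround sketch: ask for a larger user-level socket buffer so
    # each active-mode message carries more than ~1400 bytes of frame data.
    {:ok, conn} =
      Xandra.start_link(
        nodes: ["scylla-host:9042"],
        transport_options: [buffer: 50_000]
      )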

@peixian
Author

peixian commented Apr 22, 2024

I'm closing this; in our experiments, the performance regression has disappeared with a combination of @harunzengin's and @kennethito's changes.

@peixian closed this as completed on Apr 22, 2024
@whatyouhide
Owner

That's metal news 🤘
