From 23dafe891a5bd543f2d0263d1289874f2cbf4966 Mon Sep 17 00:00:00 2001
From: Gabriel Oliveira
Date: Thu, 18 Jul 2024 14:04:12 -0300
Subject: [PATCH 1/4] chore: add shared client performance disclaimer

---
 lib/broadway_kafka/producer.ex | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/lib/broadway_kafka/producer.ex b/lib/broadway_kafka/producer.ex
index a2709bb..5d19fd2 100644
--- a/lib/broadway_kafka/producer.ex
+++ b/lib/broadway_kafka/producer.ex
@@ -51,7 +51,8 @@ defmodule BroadwayKafka.Producer do
 
     * `:shared_client` - Optional. When false, it starts one client per producer.
       When true, it starts a single shared client across all producers (which may reduce
-      memory/resource usage). Default is `false`.
+      memory/resource usage). May cause severe performance degradation, see
+      ["Shared Client Performance"](#module-shared-client-performance) for details. Default is `false`.
 
     * `:group_config` - Optional. A list of options used to configure the group coordinator.
       See the ["Group config options"](#module-group-config-options) section below for a list of all available
@@ -215,6 +216,19 @@ defmodule BroadwayKafka.Producer do
     * `[:broadway_kafka, :assignments_revoked, :start | :stop | :exception]` spans -
       these events are emitted in "span style" when receiving assignments revoked call from consumer group coordinator
       See `:telemetry.span/3`.
+
+  ## Shared Client Performance
+
+  Enable shared client may drastically decrease performance. This happens because mutiple producers may block
+  each other waiting for the client response since the connection is hidden inside a process it becomes a
+  bottleneck.
+
+  This is more likely to be an issue if the producers on your pipeline are fetching message from multiple topics
+  and specially if there are very low traffic topics in the mix because of batch wait times.
+
+  In summary to mitigate this you can split your topics between multiple pipelines, but notice that this will
+  increase the resource usage as well creating one new client/connection for each pipeline effectively diminishing
+  the shared_client resource usage gains.
   """
 
   use GenStage

From e056c9a3f19b7bede01c90bc9c86b55c4f09704d Mon Sep 17 00:00:00 2001
From: Gabriel Oliveira
Date: Thu, 18 Jul 2024 18:16:21 -0300
Subject: [PATCH 2/4] Update lib/broadway_kafka/producer.ex
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: José Valim
---
 lib/broadway_kafka/producer.ex | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/broadway_kafka/producer.ex b/lib/broadway_kafka/producer.ex
index 5d19fd2..37c2529 100644
--- a/lib/broadway_kafka/producer.ex
+++ b/lib/broadway_kafka/producer.ex
@@ -219,9 +219,8 @@ defmodule BroadwayKafka.Producer do
 
   ## Shared Client Performance
 
-  Enable shared client may drastically decrease performance. This happens because mutiple producers may block
-  each other waiting for the client response since the connection is hidden inside a process it becomes a
-  bottleneck.
+  Enabling shared client may drastically decrease performance. Since connection is handled by a single process,
+  producers may block each other waiting for the client response.
 
   This is more likely to be an issue if the producers on your pipeline are fetching message from multiple topics
   and specially if there are very low traffic topics in the mix because of batch wait times.

From 273763fd7379fdecaf4401da01ef14492db80cb9 Mon Sep 17 00:00:00 2001
From: Gabriel Oliveira
Date: Thu, 18 Jul 2024 18:16:52 -0300
Subject: [PATCH 3/4] Update lib/broadway_kafka/producer.ex
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: José Valim
---
 lib/broadway_kafka/producer.ex | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/broadway_kafka/producer.ex b/lib/broadway_kafka/producer.ex
index 37c2529..63c1908 100644
--- a/lib/broadway_kafka/producer.ex
+++ b/lib/broadway_kafka/producer.ex
@@ -222,8 +222,8 @@ defmodule BroadwayKafka.Producer do
   Enabling shared client may drastically decrease performance. Since connection is handled by a single process,
   producers may block each other waiting for the client response.
 
-  This is more likely to be an issue if the producers on your pipeline are fetching message from multiple topics
-  and specially if there are very low traffic topics in the mix because of batch wait times.
+  This is more likely to be an issue if the producers on your pipeline are fetching message from
+  multiple topics and specially if there are very low traffic topics, which may block on batch wait times.

   In summary to mitigate this you can split your topics between multiple pipelines, but notice that this will
   increase the resource usage as well creating one new client/connection for each pipeline effectively diminishing

From 6243d0cf10f9ff6f66634a7fa06e280f40d49293 Mon Sep 17 00:00:00 2001
From: Gabriel Oliveira
Date: Thu, 18 Jul 2024 18:17:17 -0300
Subject: [PATCH 4/4] Update lib/broadway_kafka/producer.ex
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: José Valim
---
 lib/broadway_kafka/producer.ex | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/broadway_kafka/producer.ex b/lib/broadway_kafka/producer.ex
index 63c1908..0758920 100644
--- a/lib/broadway_kafka/producer.ex
+++ b/lib/broadway_kafka/producer.ex
@@ -225,9 +225,10 @@ defmodule BroadwayKafka.Producer do
   This is more likely to be an issue if the producers on your pipeline are fetching message from
   multiple topics and specially if there are very low traffic topics, which may block on batch wait times.
 
-  In summary to mitigate this you can split your topics between multiple pipelines, but notice that this will
-  increase the resource usage as well creating one new client/connection for each pipeline effectively diminishing
-  the shared_client resource usage gains.
+  To mitigate this, you can split your topics between multiple pipelines, but notice that this will
+  increase the resource usage as well. By creating one new client/connection for each pipeline,
+  you effectively diminishing the `shared_client` resource usage gains. So make sure to measure
+  if you enable this option.
   """
 
   use GenStage
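
For reference, the mitigation described in the new "Shared Client Performance" section (splitting topics between multiple pipelines) might look roughly like the configuration sketch below. It is not part of this patch series. The module name `MyApp.KafkaPipeline`, the pipeline names, the group ids, the topic names, the broker address and the concurrency values are all hypothetical placeholders, and the sketch assumes `broadway` and `broadway_kafka` are available as dependencies.

```elixir
# Hypothetical pipeline module, started once per group of topics.
defmodule MyApp.KafkaPipeline do
  use Broadway

  def start_link(opts) do
    Broadway.start_link(__MODULE__,
      # Each pipeline needs its own unique name.
      name: Keyword.fetch!(opts, :name),
      producer: [
        module:
          {BroadwayKafka.Producer,
           [
             hosts: [localhost: 9092],
             group_id: Keyword.fetch!(opts, :group_id),
             topics: Keyword.fetch!(opts, :topics),
             # One shared client per pipeline, so producers of the busy
             # pipeline no longer wait behind fetches for the quiet topic.
             shared_client: true
           ]},
        concurrency: 2
      ],
      processors: [default: [concurrency: 10]]
    )
  end

  @impl true
  def handle_message(_processor, message, _context), do: message
end

# Two instances of the same module: one for the high-traffic topic and
# one for the low-traffic topic (placeholder topic and group names).
children = [
  Supervisor.child_spec(
    {MyApp.KafkaPipeline,
     name: MyApp.BusyPipeline, group_id: "my_app_busy", topics: ["orders"]},
    id: MyApp.BusyPipeline
  ),
  Supervisor.child_spec(
    {MyApp.KafkaPipeline,
     name: MyApp.QuietPipeline, group_id: "my_app_quiet", topics: ["audit_log"]},
    id: MyApp.QuietPipeline
  )
]

Supervisor.start_link(children, strategy: :one_for_one)
```

As the docs added in this series point out, every extra pipeline brings its own client/connection, so a split like this gives back part of what `shared_client` saves. Measuring both layouts under realistic traffic is the only way to tell which one is better for a given workload.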