Allow output plugins to configure a max chunk size #1938

Open
PettitWesley opened this issue Feb 8, 2020 · 26 comments
Labels
AWS (Issues with AWS plugins or experienced by users running on AWS) · community-feedback · feature-request · long-term (Long term issues, exempted by stale bots)

Comments

@PettitWesley
Contributor

@edsiper and I discussed this recently; opening an issue to track it.

Problem

Many APIs have a limit on the amount of data they can ingest per request. For example, #1187 discusses that the DataDog HTTP API has a 2MB payload limit. A single request is made per flush, and occasionally Fluent Bit can send a chunk which is over 2 MB.

Some APIs have a limit on the number of log messages they can accept per HTTP request. For example, Amazon CloudWatch has a 10,000 log message limit per PutLogEvents call. Amazon Kinesis Firehose and Amazon Kinesis Data Streams have a much smaller batch limit of 500 events.

Consequently, plugins have to implement logic to split a single chunk into multiple requests (or accept that occasionally large chunks will fail to be sent). This becomes troublesome when a single API request fails in the set. If the plugin issues a retry, the whole chunk will get retried. The fractions of the chunk that got successfully uploaded will thus be sent multiple times.

Possible Solutions

Ideal solution: Output Plugins specify a max chunk size

Ideally, plugins should only have to make a single request per flush. This keeps the logic in the plugin very simple and straightforward. The common task of splitting chunks into right-sized pieces could be placed in the core of Fluent Bit.

Each output plugin could give Fluent Bit a max chunk size.

Implementing this would involve some complexity. Fluent Bit should not allocate additional memory to split chunks into smaller pieces. Instead, it can pass a pointer to a fraction of the chunk to an output and track when the entire chunk has been successfully sent.
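A rough sketch of what this could look like, with hypothetical field names (none of this exists in Fluent Bit today):

    /* Hypothetical declaration an output plugin could register with the engine.
     * The engine would slice each chunk to respect these limits before calling
     * the plugin's flush callback. */
    struct flb_output_limits_example {
        size_t max_chunk_bytes;    /* e.g. 2 * 1024 * 1024 for a 2 MB payload cap  */
        size_t max_chunk_records;  /* e.g. 10000 for PutLogEvents, 500 for Kinesis */
    };

The engine would then hand the flush callback a pointer into the existing chunk buffer (no extra allocation), sized to respect both limits, and only mark the chunk as done once every slice has been acknowledged.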

Non-ideal, but easy solution

The most important issue is retries. If each flush had a unique ID associated with it, plugins could internally track whether a flush is a first attempt or a retry, and then track whether the entirety of a chunk had been sent or not.

This is not a good idea, since it makes the plugin very complicated; I've included it for the sake of completeness.

@JeffLuoo
Contributor

Hi @PettitWesley, I am wondering what the current progress on this issue is.

@PettitWesley
Contributor Author

@JeffLuoo AFAIK, no work has been done on it yet.

@Robert-turbo

The same problem occurs when sending GELF over HTTP to Graylog.
When Flush in SERVICE is set to 5, I can see only one message per 5 seconds in Graylog.
When I change Flush to 1, there is one message per second.

Is there anything I should put in the configuration file to have all messages arrive in Graylog? :)

@ciastooo

ciastooo commented Sep 6, 2021

@Robert-turbo were you able to solve this problem somehow?

@Robert-turbo

@ciastooo I started to use the TCP output mode instead:

[OUTPUT]
    Name  gelf
    Match *
    Host tcp.yourdomain.com
    Port  12201
    Mode  tls
    tls  On
    tls.verify  On
    tls.vhost  tcp.yourdomain.com
    Gelf_Short_Message_Key  log
    Gelf_Timestamp_Key  timestamp

And I used Traefik with a TCP route (with SSL) in front of Graylog:
https://doc.traefik.io/traefik/routing/providers/docker/#tcp-routers

@mohitjangid1512

@PettitWesley Is this problem solved in any recent version?

@PettitWesley
Contributor Author

@mohitjangid1512 No work has been done on chunk sizing for outputs, AFAIK.

@matthewfala
Contributor

Alternatively, as a compromise between options 1 and 2, we could write some middleware that wraps the flush function and handles chunk splitting and chunk-fragment retries. This could potentially limit the changes to the AWS plugins and not require any changes to core code.

The middleware would take the parameters flush_function_ptr, chunk_fragment_size, and middleware_context, and would consist of the following steps:

  1. Register the chunk in some kind of table or hashmap, with successful_chunks set to -1
  2. Break the chunk up into fragments of size X
  3. Call flush on a chunk fragment
  4. If the flush is successful, increment successful_chunks; otherwise return retry/fail
  5. Return to step 3 if fragments remain
  6. Return success

On retry, the chunk will be looked up and processing of the chunk fragments will resume at index successful_chunks + 1.

This would require no code changes to each plugin's "flush" function; instead, an additional flush_wrapper function would be created that calls the middleware with the flush function's pointer, chunk_fragment_size, and middleware_context (a sketch follows below).

Just a thought.
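
A minimal C sketch of that middleware idea, assuming hypothetical names for the flush signature and context (the real plugin flush signature differs):

    #include <stddef.h>

    /* Hypothetical per-chunk state: remembers how many fragments were already
     * delivered so a retry resumes instead of resending the whole chunk. */
    struct middleware_context {
        int successful_fragments;   /* -1 until the first fragment succeeds */
    };

    typedef int (*flush_fn)(const void *data, size_t len, void *plugin_ctx);

    /* Splits the chunk into fragments of at most chunk_fragment_size bytes and
     * calls the plugin's flush for each one. A real implementation would also
     * have to split on record boundaries, not raw byte offsets. */
    static int flush_wrapper(flush_fn plugin_flush, void *plugin_ctx,
                             const char *chunk, size_t chunk_size,
                             size_t chunk_fragment_size,
                             struct middleware_context *mctx)
    {
        size_t total = (chunk_size + chunk_fragment_size - 1) / chunk_fragment_size;
        size_t i;

        for (i = (size_t) (mctx->successful_fragments + 1); i < total; i++) {
            const char *frag = chunk + i * chunk_fragment_size;
            size_t len = chunk_size - i * chunk_fragment_size;
            if (len > chunk_fragment_size) {
                len = chunk_fragment_size;
            }
            if (plugin_flush(frag, len, plugin_ctx) != 0) {
                return -1;                        /* propagate retry/fail to the engine */
            }
            mctx->successful_fragments = (int) i; /* progress survives for the retry   */
        }
        return 0;                                 /* whole chunk delivered */
    }

On retry, the engine re-delivers the same chunk, the wrapper looks up the stored middleware_context, and the loop starts at successful_fragments + 1.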

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Apr 13, 2022
@LionelCons
Contributor

This issue is still creating problems in some of our workflows.

@github-actions github-actions bot removed the Stale label Apr 14, 2022
@PettitWesley PettitWesley added AWS Issues with AWS plugins or experienced by users running on AWS feature-request labels May 2, 2022
@github-actions
Contributor

github-actions bot commented Aug 1, 2022

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Aug 1, 2022
@LionelCons
Contributor

This issue is still creating problems in some of our workflows.

@PettitWesley PettitWesley added long-term Long term issues (exempted by stale bots) and removed Stale labels Aug 1, 2022
@sean-scott-lr

This issue is still creating problems in some of our workflows.

Agreed. I'm surprised it's not implemented in the same manner as Fluentd.

@leonardo-albertovich
Collaborator

Would you mind describing how it is implemented in Fluentd? I am not familiar with it, but I know that this is a very sensitive issue.

@adiforluls
Member

Hi @leonardo-albertovich @edsiper @PettitWesley, Fluentd's buffer has a configuration option to limit the chunk size: https://docs.fluentd.org/configuration/buffer-section (see chunk_limit_size). Ideally each output plugin could have a chunk limit, but I also want a solution for this since it's a massive blocker and can render Fluent Bit useless when the payload size becomes huge for outputs.
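
For reference, this is roughly what the Fluentd side looks like (the output type, endpoint, and sizes below are illustrative only):

    <match app.**>
      @type http
      endpoint http://logserver.example/api
      <buffer>
        # keep each flushed chunk under the downstream API's payload limit
        chunk_limit_size 2MB
        # optional cap on the number of events per chunk
        chunk_limit_records 500
      </buffer>
    </match>

Something equivalent on the Fluent Bit side, either per output plugin or as a global output setting, would cover most of the cases raised in this issue.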

@Jake0109

Hello, I just wonder: is this work in progress? We are also hitting this issue. We are using Fluent Bit with the AWS Kinesis Data Streams output; the stream has a limit of 1 MB per record, and we found chunks larger than that. As a result, we are seeing terrible data loss.
If the feature is released, can you let me know, please?

@MathiasRanna

Waiting on a solution for this as well.

@cameronattard

Is this ever going to be addressed?

@thirdeyenick

We would also like to see this feature being integrated into fluent-bit somehow. We are sending some logs to a Loki instance and would like to limit the max size of the request.

@braydonk
Contributor

@edsiper @leonardo-albertovich I have opened #9385 as a proof of concept for a potential solution. Hopefully it will be useful for starting a discussion of how this can potentially be fixed more properly.

@leonardo-albertovich
Collaborator

Beautiful, I'll review the PR as soon as possible, thanks for taking the time to tackle this hairy issue.

@braydonk
Contributor

braydonk commented Sep 27, 2024

@ryanohnemus and I have been iterating on this and have found some issues that my PR doesn't address.

Ryan found an important issue: Filter plugins can easily add a lot of data even to chunks that have already been split up, so we wind up right back at the problem of chunks ending up over the input chunk limit.

I've been trying to come up with alternatives but am running into snags on each. Here's what I have so far.

Break up the input chunk in engine dispatch, creating flush tasks for each broken down part of the chunk

The snag I ran into here is that the flb_task code is very intertwined with the original input chunk the data came from. If we take an input chunk and split it up, then those split up parts of the chunk don't have a proper input chunk to refer to, and things get messy.

Fix this just in out_stackdriver by splitting up the event chunk

The problem here is retrying. If one part of the chunk encounters a retryable error, you can't effectively signal back to the engine that only a certain part failed. (I discovered this during my own attempt, only to realize it was called out in the original post on this issue. 😄 )

Take the input chunk breakup code from my PR and move it after filters

EDIT: This is problematic too. The flb_filter_do code takes in an input chunk and uses it to track added/total records as well as doing input chunk tracing. I could probably remove that code but the chunk tracing stuff looks pretty useful.

This is the closest idea that I have right now. There's only one major problem that I can find so far, which is that processors on output plugins will not be covered by this. Data can still be added by output processors and it will not be caught by this procedure (in fact I'm not sure how to cover it given the other issues with doing this in the engine dispatch part of the Fluent Bit core).


@leonardo-albertovich @edsiper Please let me know if you have any other ideas for solving this.
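
To summarize where the split points discussed above sit in the pipeline (a rough view of the flow, not an exact rendering of the internals):

    input plugin -> input chunk          (PR #9385 splits here)
                 -> filters              (can grow the data again)
                 -> engine dispatch      (option above: split here, after filters)
                 -> output processors    (can still grow the data; not covered)
                 -> output plugin flush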

@sentoa

sentoa commented Oct 1, 2024

Waiting on a solution for this.

@shuaich
Contributor

shuaich commented Oct 4, 2024

Reading through the pros and cons of the proposed solutions, my understanding is that there is no easy way to enforce a max chunk size throughout the pipeline, as additional data can be added at different stages of the pipeline.

If there is no single solution that addresses all scenarios/use cases, could we prioritize enforcing a max chunk size in input plugins, as proposed in #9385, which I think addresses the major use case?

@ryanohnemus
Contributor

@shuaich While the change by @braydonk helps break up input chunks, which may help some users get past the issue, it still doesn't resolve it entirely.

As an input chunk (ic) is prepared and passed to an output plugin, the output thread (whether an actual thread or the main thread) calls flb_output_flush_create, which creates an flb_output_flush record that is currently tied 1:1 to the output instance and the full ic.

flb_output_flush_create checks whether any processors have been configured and then passes the ic through them, which can greatly increase the size of each record. For example, the kubernetes plugin can generally add tons of labels, etc. directly into what becomes known as the processed_event_chunk, and you wind up again with a chunk that is too large.

That flb_output_flush object is used to call the output instance's flush callback, which is required to call FLB_OUTPUT_RETURN, which signals (OK, RETRY, or ERROR) back to the engine.

I think a solution would be to have the ability to split the processed_event_chunk into multiple chunks after all processors have run. The problem with this is that it would require major changes to how output handling and FLB_OUTPUT_RETURN work. It's not as simple as breaking the processed_event_chunk into smaller chunks and looping over them, because each call to the output instance's flush callback calls FLB_OUTPUT_RETURN, which currently signals back to the engine and kills the coroutine.

We'd first have to break up the processed_event_chunk into multiple chunks, call the output instance's flush callback for the first processed chunk, and modify the FLB_OUTPUT_RETURN logic to mark the progress in the overall task (how many records were actually processed). Then, on an FLB_OK return, we would call the output instance's flush callback again until all chunks were processed or a retry/error occurred, before finally signaling back to the engine and killing the coroutine. This feels messy and gets worse if one of the chunks returns FLB_RETRY: you would have to ensure that the next flb_output_flush_create is aware of the progress and only creates new processed_event_chunk(s) (which are unfortunately recreated on each retry, AFAIK) for the records still waiting to be processed (i.e. if on the first pass the output instance handled 1 of 3 chunks properly and chunk 2 returned FLB_RETRY, the second pass through this logic should only create the 2 remaining chunks).

It seems doable, but it would be a moderate-to-heavy lift, and we should wait for the engine maintainers' thoughts on a solution, IMO.

That being said, @braydonk's change would help with the starting size of the ic before processing, so hopefully that can also be merged in.
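
To make that control flow concrete, here is a rough C-style sketch; the names and signatures are hypothetical and do not match the actual flb_output_flush internals:

    #include <stddef.h>

    /* Hypothetical per-task progress, persisted across retries so that a later
     * attempt only re-creates the sub-chunks that were not yet delivered. */
    struct task_progress_example {
        size_t sub_chunks_done;
    };

    /* Conceptual loop: split the processed event chunk after all processors have
     * run, flush each piece, and only signal success back to the engine (and kill
     * the coroutine) once every piece has been delivered. */
    static int flush_processed_chunk(struct task_progress_example *progress,
                                     const char *processed, size_t processed_size,
                                     size_t max_size,
                                     int (*plugin_flush)(const void *data, size_t len))
    {
        size_t offset = progress->sub_chunks_done * max_size;

        while (offset < processed_size) {
            size_t len = processed_size - offset;
            if (len > max_size) {
                len = max_size;   /* a real split must respect record boundaries */
            }
            if (plugin_flush(processed + offset, len) != 0) {
                /* equivalent of FLB_RETRY / FLB_ERROR: partial progress is kept,
                 * so the next attempt starts from the first undelivered piece */
                return -1;
            }
            progress->sub_chunks_done++;
            offset += len;
        }
        return 0;   /* equivalent of FLB_OK: the whole processed chunk was sent */
    }

As noted above, the hard part is not this loop itself but wiring the partial progress into FLB_OUTPUT_RETURN and into the retry path, where the processed_event_chunk is currently rebuilt from scratch.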

@shuaich
Contributor

shuaich commented Oct 4, 2024

Thank you @ryanohnemus for more context and details.
