Multiple Multiline Filter Definitions Not Supported #5235

PettitWesley · 2022-04-04T03:58:13Z

Bug Report

If you put two multiline filter definitions in your config and they both match the same logs, this leads to a problem:

[FILTER]
     name                  multiline
     match                 *
     multiline.key_content log
     multiline.parser      go

[FILTER]
     name                  multiline
     match                 *
     multiline.key_content log
     multiline.parser      multiline-regex-test

This leads to error messages like:

[2022/02/09 07:28:03] [error] [multiline] expected MAP type in first line state buffer

It can also cause high memory usage and can even cause Fluent Bit to crash.

Why does this happen?

This is because the multiline filter uses an emitter input instance to re-emit completed records at the start of the Fluent Bit log pipeline.

In the multiline design #4309 I tried to prevent cycles by having the filter recognize its own in_emitter instance and not try to parse records from its own emitter. Unfortunately, this only solves the problem for a single filter instance. Two filters can lead to a cycle.

So let's go through an example to see why this happens:

1. Some input ingests Chunk A, which contains a multiline log.
2. ML_FILTER_1 gets Chunk A, concatenates the records, and emits them as Chunk B with its in_emitter.
3. ML_FILTER_1 recognizes that Chunk B came from its own emitter, so it passes it on unmodified.
4. ML_FILTER_2 gets Chunk B, processes the records (assume they match at least one parser), and then emits them as Chunk C with its in_emitter.
5. ML_FILTER_1 gets Chunk C, which did not come from its own emitter, concatenates the records (these records already passed through that filter and thus match), and emits them as a new chunk with its in_emitter. At this point, we are at step 2 again, and we have reached an infinite loop.

Workaround 1: If you need only one parser applied to each log statement

The workaround is to have only a single filter definition, but remember that you can use multiple parsers in a single definition. The Fluent Bit multiline filter can only apply a single multiline parser to each log record; it will try each parser in the comma-delimited list in order and apply the first one that matches the log (i.e. use the first parser which has a start_state that matches the log).

This limitation means that each log record can only have 2 multiline parsers successfully applied to it. The first applied parser can be defined with the tail multiline settings, and the second applied parser can be specified in a multiline filter definition.

So my example from above can become a single filter definition like so:

[FILTER]
     name                  multiline
     match                 *
     multiline.key_content log
     multiline.parser      go, multiline-regex-test
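
For the two-stage case described above, where one parser is applied by the tail input and a second by the filter, a minimal sketch might look like the following. The path and tag values are placeholders chosen for illustration, and multiline-regex-test is assumed to be defined in your own parsers file; adjust all of them to your setup.

```
# Stage 1: tail applies the first multiline parser (docker or cri, whichever matches).
[INPUT]
     name                  tail
     # hypothetical path and tag, for illustration only
     path                  /var/log/containers/*.log
     tag                   kube.*
     multiline.parser      docker, cri

# Stage 2: a single multiline filter applies the second parser to the log key.
[FILTER]
     name                  multiline
     match                 kube.*
     multiline.key_content log
     multiline.parser      go, multiline-regex-test
```

With this layout there is still only one multiline filter in the pipeline, so the re-emit cycle described earlier should not arise.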

Workaround 2: If logs can be differentiated by log tag

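Another option, if your logs can be distinguished by tag, is to give each filter a Match pattern that matches a different set of tags, so that no record is matched by more than one multiline filter.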
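
As a rough sketch of this workaround, assuming your inputs assign tags such that no record matches both patterns (the go.* and regex.* tags here are hypothetical and only for illustration):

```
# Only records tagged go.* reach this filter.
[FILTER]
     name                  multiline
     match                 go.*
     multiline.key_content log
     multiline.parser      go

# Only records tagged regex.* reach this filter.
[FILTER]
     name                  multiline
     match                 regex.*
     multiline.key_content log
     multiline.parser      multiline-regex-test
```

Because the re-emitted records keep their original tag, each one should only ever be seen by the filter that emitted it, which that filter recognizes and passes through.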


PettitWesley · 2022-04-04T04:02:59Z

@edsiper I am not sure of the best way to fix this. I have two questions:

Is there ever a valid use case for having multiple multiline filter definitions? Or would you always want to do what I show above and use all parsers in one filter definition?
Should we just solve this by adding an option to change the tag when it's re-emitted (more like rewrite_tag) so that there is no cycle because the concatenated records will have a different tag? I think users won't be satisfied with this because they will want their multilines to have the same tag as other non-multiline records from the same app.

edsiper · 2022-04-05T01:46:55Z

I can think of one: receiving Docker logs through Forward when you don't know whether the format is Docker or CRI-O. Besides that, I'm not sure what users would be trying to accomplish.

Note that multiline core already supports multiple formats, so we can avoid the user configuring multiple independent filters.

Maybe we need the users to elaborate more on their use cases in this ticket

PettitWesley · 2022-04-05T20:37:55Z

@edsiper Cool, I will direct users to this issue to post about their needs and we can determine if we need to support multiple filters. For the time being, I will assume it's not urgent and will submit a doc PR to clearly note this in the docs.

alexku7 · 2022-05-30T15:03:45Z

Hello @PettitWesley

Sometimes, for simplicity, it's very convenient to declare multiple multiline filters instead of playing with tagging the containers in the tail input. Although, I agree that it could be a logically incorrect solution.

Anyway, I would like to ask: does specifying multiple parsers in the same filter, as you mentioned in the workaround, provide the same functionality? I mean, is this identical in functionality and effect:

[FILTER]
name multiline
match *
multiline.key_content log
multiline.parser go, multiline-regex-test

vs two separated multiline filters?

drbugfinder-work · 2022-06-03T14:25:24Z

Maybe related to #5524 (comment)

PettitWesley · 2022-06-28T01:43:52Z

@alexku7 Yea, specifying multiple parsers should have the same effect as you'd want from multiple filter definitions... if it doesn't then please let us know.

github-actions · 2022-09-26T02:18:28Z

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

gvozdetsky · 2024-06-13T16:33:05Z

Just to clarify: are multiple multiline parsers allowed in the [INPUT] section, like multiline.parser docker, cri?

How should I understand "The two options separated by a comma means multi-format: try docker and cri multiline formats"?

gvozdetsky · 2024-06-14T08:45:47Z

Ah, I misunderstood at first! I added a pull request to correct the documentation: fluent/fluent-bit-docs#1392

vorezal · 2024-08-20T20:51:56Z

Would a use case be potentially handling both a partial_message case as well as a language specific multi-line parser case? #4309 mentions two filter instances being required to handle both cases, but this issue seems to indicate doing so would cause an infinite loop.

PettitWesley added the status: waiting-for-triage label Apr 4, 2022

PettitWesley changed the title ~~Multiline Multiline Filter Definitions Not Supported~~ Multiple Multiline Filter Definitions Not Supported Apr 4, 2022

This was referenced Apr 4, 2022

Errors and crashing after upgrading to 1.8.12 version #4712

Closed

Multiline filter crashes after buffer limit is reached causing high cpu and memory usage #4940

Closed

Multiline log guidance aws/aws-for-fluent-bit#100

Closed

lecaros added the waiting-for-user and community-feedback labels and removed the status: waiting-for-triage label Apr 11, 2022

github-actions bot added the Stale label Sep 26, 2022

PettitWesley added the long-term label and removed the Stale label Sep 26, 2022

nokute78 mentioned this issue Jan 7, 2024

When two multiline parsers are used in filters, the pipeline breaks #8312

Closed
