out_influxdb: allow stripping of tag prefix #9427

Open
wants to merge 2 commits into
base: master

ueli-g commented Sep 26, 2024

This removes a configured prefix from measurement names, which would otherwise be shared by many measurements in the same data bucket.

When writing to several different buckets, records are routed to the corresponding out_influxdb instances by tag match. This change makes it possible to match on a tag prefix while stripping that prefix from the measurement name, so that the measurement names in a bucket do not all begin with the same prefix.

To achieve this, the measurement name is read from char tag[] at an offset, provided the entire prefix matches the beginning of the tag and the overlap is at most tag_length - 1 characters (so at least one character of the tag remains).
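
The offset logic described above could look roughly like the following C sketch. It is not the code from this PR: the helper name `measurement_from_tag` and its parameters are hypothetical, and it only assumes that the tag is available as a character buffer with a known length. The prefix is stripped only when all of it matches the start of the tag and at least one character of the tag would remain; otherwise the full tag is used.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not the PR's actual code).
 * Returns a pointer into `tag` where the measurement name starts and
 * stores the length of that name in *out_len. No bytes are copied; the
 * caller simply reads from `tag` at an offset.
 */
static const char *measurement_from_tag(const char *tag, size_t tag_len,
                                        const char *prefix, size_t prefix_len,
                                        size_t *out_len)
{
    size_t offset = 0;

    /*
     * Strip only if a prefix is configured, it is shorter than the tag
     * (overlap of at most tag_len - 1 characters), and every byte of the
     * prefix matches the beginning of the tag.
     */
    if (prefix != NULL && prefix_len > 0 && prefix_len < tag_len &&
        memcmp(tag, prefix, prefix_len) == 0) {
        offset = prefix_len;
    }

    *out_len = tag_len - offset;
    return tag + offset;
}
```

Under these assumptions, the tag `foo.somedata` with prefix `foo.` would yield the measurement name `somedata`, while a tag that is identical to the prefix (for example `foo.`) or that does not start with it would be left unchanged.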


Enter [N/A] in the box if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

fluent/fluent-bit-docs#1468

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Removes a configured prefix from measurement names

Signed-off-by: Ueli Graf <[email protected]>

ueli-g commented Sep 30, 2024

Example configuration file for this change - how to write to different buckets in the same DB without adding measurement name prefixes:

[SERVICE]
    Flush       1
    Daemon      off
    Log_Level   debug

[INPUT]
    Name        dummy
    Tag         foo.somedata
    Dummy       {"msg": "This is foo", "value": 1.3123}

[INPUT]
    Name        dummy
    Tag         bar.stream.importantmessage
    Dummy       {"msg": "completed", "ID": "1234", "tags": ["ID"]}

[INPUT]
    Name        dummy
    Tag         bar.stream.somesensor
    Dummy       {"value": 1, "tags": ["source", "yours"]}

[OUTPUT]
    Name          influxdb
    Match         foo.*
    strip_prefix  foo.
    Host          localhost
    Port          8086
    Bucket        foo-bucket
    Org           foobarorg
    HTTP_Token    my-super-secret-auth-token

[OUTPUT]
    Name          influxdb
    Match         bar.*
    strip_prefix  bar.stream.
    Host          localhost
    Port          8086
    Bucket        bar-bucket
    Org           foobarorg
    HTTP_Token    my-super-secret-auth-token


ueli-g commented Sep 30, 2024

Debug log output

Fluent Bit v3.2.0
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _           _____  __  
|  ___| |                | |   | ___ (_) |         |____ |/  | 
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`| | 
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \ | | 
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /_| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)___/

[2024/09/30 08:25:33] [ info] Configuration:
[2024/09/30 08:25:33] [ info]  flush time     | 1.000000 seconds
[2024/09/30 08:25:33] [ info]  grace          | 5 seconds
[2024/09/30 08:25:33] [ info]  daemon         | 0
[2024/09/30 08:25:33] [ info] ___________
[2024/09/30 08:25:33] [ info]  inputs:
[2024/09/30 08:25:33] [ info]      dummy
[2024/09/30 08:25:33] [ info]      dummy
[2024/09/30 08:25:33] [ info]      dummy
[2024/09/30 08:25:33] [ info] ___________
[2024/09/30 08:25:33] [ info]  filters:
[2024/09/30 08:25:33] [ info] ___________
[2024/09/30 08:25:33] [ info]  outputs:
[2024/09/30 08:25:33] [ info]      influxdb.0
[2024/09/30 08:25:33] [ info]      influxdb.1
[2024/09/30 08:25:33] [ info] ___________
[2024/09/30 08:25:33] [ info]  collectors:
[2024/09/30 08:25:33] [ info] [fluent bit] version=3.2.0, commit=cb44828011, pid=7064
[2024/09/30 08:25:33] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/09/30 08:25:33] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/09/30 08:25:33] [ info] [cmetrics] version=0.9.6
[2024/09/30 08:25:33] [ info] [ctraces ] version=0.5.5
[2024/09/30 08:25:33] [ info] [input:dummy:dummy.0] initializing
[2024/09/30 08:25:33] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/09/30 08:25:33] [debug] [dummy:dummy.0] created event channels: read=24 write=25
[2024/09/30 08:25:33] [ info] [input:dummy:dummy.1] initializing
[2024/09/30 08:25:33] [ info] [input:dummy:dummy.1] storage_strategy='memory' (memory only)
[2024/09/30 08:25:33] [debug] [dummy:dummy.1] created event channels: read=26 write=27
[2024/09/30 08:25:33] [ info] [input:dummy:dummy.2] initializing
[2024/09/30 08:25:33] [ info] [input:dummy:dummy.2] storage_strategy='memory' (memory only)
[2024/09/30 08:25:33] [debug] [dummy:dummy.2] created event channels: read=28 write=29
[2024/09/30 08:25:33] [debug] [influxdb:influxdb.0] created event channels: read=30 write=31
[2024/09/30 08:25:33] [debug] [output:influxdb:influxdb.0] host=localhost port=8086
[2024/09/30 08:25:33] [debug] [influxdb:influxdb.1] created event channels: read=32 write=33
[2024/09/30 08:25:33] [debug] [output:influxdb:influxdb.1] host=localhost port=8086
[2024/09/30 08:25:33] [debug] [router] match rule dummy.0:influxdb.0
[2024/09/30 08:25:33] [debug] [router] match rule dummy.1:influxdb.1
[2024/09/30 08:25:33] [debug] [router] match rule dummy.2:influxdb.1
[2024/09/30 08:25:33] [ info] [sp] stream processor started
[2024/09/30 08:25:34] [debug] [task] created task=0x7f7e1802d730 id=0 OK
[2024/09/30 08:25:34] [debug] [task] created task=0x7f7e1802d8b0 id=1 OK
[2024/09/30 08:25:34] [debug] [task] created task=0x7f7e1802da00 id=2 OK
[2024/09/30 08:25:34] [debug] [upstream] KA connection #42 to localhost:8086 is connected
[2024/09/30 08:25:34] [debug] [http_client] not using http_proxy for header
[2024/09/30 08:25:34] [debug] [upstream] KA connection #43 to localhost:8086 is connected
[2024/09/30 08:25:34] [debug] [http_client] not using http_proxy for header
[2024/09/30 08:25:34] [debug] [upstream] KA connection #44 to localhost:8086 is connected
[2024/09/30 08:25:34] [debug] [http_client] not using http_proxy for header
[2024/09/30 08:25:34] [debug] [output:influxdb:influxdb.0] http_do=0 OK
[2024/09/30 08:25:34] [debug] [upstream] KA connection #42 to localhost:8086 is now available
[2024/09/30 08:25:34] [debug] [output:influxdb:influxdb.1] http_do=0 OK
[2024/09/30 08:25:34] [debug] [upstream] KA connection #43 to localhost:8086 is now available
[2024/09/30 08:25:34] [debug] [out flush] cb_destroy coro_id=0
[2024/09/30 08:25:34] [debug] [task] destroy task=0x7f7e1802d730 (task_id=0)
[2024/09/30 08:25:34] [debug] [out flush] cb_destroy coro_id=0
[2024/09/30 08:25:34] [debug] [task] destroy task=0x7f7e1802d8b0 (task_id=1)
[2024/09/30 08:25:34] [debug] [output:influxdb:influxdb.1] http_do=0 OK
[2024/09/30 08:25:34] [debug] [upstream] KA connection #44 to localhost:8086 is now available
[2024/09/30 08:25:34] [debug] [out flush] cb_destroy coro_id=1
[2024/09/30 08:25:34] [debug] [task] destroy task=0x7f7e1802da00 (task_id=2)
[2024/09/30 08:25:35] [debug] [task] created task=0x7f7e18039050 id=0 OK
[2024/09/30 08:25:35] [debug] [task] created task=0x7f7e1802d090 id=1 OK
[2024/09/30 08:25:35] [debug] [task] created task=0x7f7e1802d950 id=2 OK
[2024/09/30 08:25:35] [debug] [upstream] KA connection #42 to localhost:8086 has been assigned (recycled)
[2024/09/30 08:25:35] [debug] [http_client] not using http_proxy for header
[2024/09/30 08:25:35] [debug] [upstream] KA connection #43 to localhost:8086 has been assigned (recycled)
[2024/09/30 08:25:35] [debug] [http_client] not using http_proxy for header
[2024/09/30 08:25:35] [debug] [upstream] KA connection #44 to localhost:8086 has been assigned (recycled)
[2024/09/30 08:25:35] [debug] [http_client] not using http_proxy for header
[2024/09/30 08:25:35] [debug] [output:influxdb:influxdb.0] http_do=0 OK
[2024/09/30 08:25:35] [debug] [upstream] KA connection #42 to localhost:8086 is now available
[2024/09/30 08:25:35] [debug] [out flush] cb_destroy coro_id=1
[2024/09/30 08:25:35] [debug] [task] destroy task=0x7f7e18039050 (task_id=0)
[2024/09/30 08:25:35] [debug] [output:influxdb:influxdb.1] http_do=0 OK
[2024/09/30 08:25:35] [debug] [upstream] KA connection #44 to localhost:8086 is now available
[2024/09/30 08:25:35] [debug] [out flush] cb_destroy coro_id=3
[2024/09/30 08:25:35] [debug] [task] destroy task=0x7f7e1802d950 (task_id=2)
[2024/09/30 08:25:35] [debug] [output:influxdb:influxdb.1] http_do=0 OK
[2024/09/30 08:25:35] [debug] [upstream] KA connection #43 to localhost:8086 is now available
[2024/09/30 08:25:35] [debug] [out flush] cb_destroy coro_id=2
[2024/09/30 08:25:35] [debug] [task] destroy task=0x7f7e1802d090 (task_id=1)


ueli-g commented Sep 30, 2024

Valgrind memcheck output

==17905== Memcheck, a memory error detector
==17905== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==17905== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==17905== Command: ./build/bin/fluent-bit -c ./test.conf
==17905== 
Fluent Bit v3.2.0
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _           _____  __  
|  ___| |                | |   | ___ (_) |         |____ |/  | 
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`| | 
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \ | | 
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /_| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)___/

[2024/09/30 08:36:39] [ info] Configuration:
[2024/09/30 08:36:39] [ info]  flush time     | 1.000000 seconds
[2024/09/30 08:36:39] [ info]  grace          | 5 seconds
[2024/09/30 08:36:39] [ info]  daemon         | 0
[2024/09/30 08:36:39] [ info] ___________
[2024/09/30 08:36:39] [ info]  inputs:
[2024/09/30 08:36:39] [ info]      dummy
[2024/09/30 08:36:39] [ info]      dummy
[2024/09/30 08:36:39] [ info]      dummy
[2024/09/30 08:36:39] [ info] ___________
[2024/09/30 08:36:39] [ info]  filters:
[2024/09/30 08:36:39] [ info] ___________
[2024/09/30 08:36:39] [ info]  outputs:
[2024/09/30 08:36:39] [ info]      influxdb.0
[2024/09/30 08:36:39] [ info]      influxdb.1
[2024/09/30 08:36:39] [ info] ___________
[.........]
==17905== 
==17905== HEAP SUMMARY:
==17905==     in use at exit: 0 bytes in 0 blocks
==17905==   total heap usage: 5,464 allocs, 5,464 frees, 12,504,833 bytes allocated
==17905== 
==17905== All heap blocks were freed -- no leaks are possible
==17905== 
==17905== For lists of detected and suppressed errors, rerun with: -s
==17905== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

ueli-g marked this pull request as ready for review September 30, 2024 08:53
ueli-g changed the title from "influxdb: allow stripping of tag prefix" to "out_influxdb: allow stripping of tag prefix" Sep 30, 2024