Merge pull request #1448 from fluent/add-workers-info
Add workers info
alexakreizinger authored Aug 27, 2024
2 parents af279ac + be9e5c0 commit 3ef1d7c
Showing 41 changed files with 81 additions and 122 deletions.
2 changes: 1 addition & 1 deletion pipeline/outputs/azure.md
@@ -20,6 +20,7 @@ To get more details about how to setup Azure Log Analytics, please refer to the
| Log_Type_Key | If included, the value for this key will be looked up in the record and, if present, will overwrite the `log_type`. If not found, the `log_type` value will be used. | |
| Time\_Key | Optional parameter to specify the key name where the timestamp will be stored. | @timestamp |
| Time\_Generated | If enabled, the HTTP request header 'time-generated-field' will be included so Azure can override the timestamp with the key specified by 'time_key' option. | off |
| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

## Getting Started

@@ -61,4 +62,3 @@ Another example using the `Log_Type_Key` with [record-accessor](https://docs.flu
Customer_ID abc
Shared_Key def
```
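For reference, the new option slots into an `azure` output section like any other property. A minimal sketch (the `Customer_ID`/`Shared_Key` values are placeholders carried over from the example above, and `Workers 2` is an illustrative choice, not a recommendation):

```
[OUTPUT]
    Name        azure
    Match       *
    Customer_ID abc
    Shared_Key  def
    Workers     2
```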

2 changes: 1 addition & 1 deletion pipeline/outputs/azure_blob.md
@@ -31,6 +31,7 @@ We expose different configuration properties. The following table lists all the
| emulator\_mode | If you want to send data to an Azure emulator service like [Azurite](https://github.com/Azure/Azurite), enable this option so the plugin will format the requests to the expected format. | off |
| endpoint | If you are using an emulator, this option allows you to specify the absolute HTTP address of such service. e.g: [http://127.0.0.1:10000](http://127.0.0.1:10000). | |
| tls | Enable or disable TLS encryption. Note that Azure service requires this to be turned on. | off |
| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

## Getting Started

@@ -128,4 +129,3 @@ Azurite Queue service is successfully listening at http://127.0.0.1:10001
127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log HTTP/1.1" 201 -
127.0.0.1 - - [03/Sep/2020:17:40:04 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log?comp=appendblock HTTP/1.1" 201 -
```

1 change: 1 addition & 0 deletions pipeline/outputs/azure_kusto.md
@@ -63,6 +63,7 @@ By default, Kusto will insert incoming ingestions into a table by inferring the
| tag_key | The key name of the tag. If `include_tag_key` is false, this property is ignored. | `tag` |
| include_time_key | If enabled, a timestamp is appended to the output. The key name is set by the `time_key` property. | `On` |
| time_key | The key name of the time. If `include_time_key` is false, this property is ignored. | `timestamp` |
| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

### Configuration File

3 changes: 2 additions & 1 deletion pipeline/outputs/azure_logs_ingestion.md
@@ -37,6 +37,7 @@ To get more details about how to setup these components, please refer to the fol
| time\_key | _Optional_ - Specify the key name where the timestamp will be stored. | `@timestamp` |
| time\_generated | _Optional_ - If enabled, will generate a timestamp and append it to JSON. The key name is set by the 'time_key' parameter. | `true` |
| compress | _Optional_ - Enable HTTP payload gzip compression. | `true` |
| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

## Getting Started

@@ -58,7 +59,7 @@ Use this configuration to quickly get started:
Name tail
Path /path/to/your/sample.log
Tag sample
Key RawData
Key RawData
# Or use other plugins Plugin
# [INPUT]
# Name cpu
2 changes: 1 addition & 1 deletion pipeline/outputs/bigquery.md
@@ -59,6 +59,7 @@ You must configure workload identity federation in GCP before using it with Flue
| pool\_id | GCP workload identity pool where the identity provider was created. Used to construct the full resource name of the identity provider. | |
| provider\_id | GCP workload identity provider. Used to construct the full resource name of the identity provider. Currently only AWS accounts are supported. | |
| google\_service\_account | Email address of the Google service account to impersonate. The workload identity provider must have permissions to impersonate this service account, and the service account must have permissions to access Google BigQuery resources (e.g. `write` access to tables) | |
| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

See Google's [official documentation](https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll) for further details.

@@ -77,4 +78,3 @@ If you are using a _Google Cloud Credentials File_, the following configuration
dataset_id my_dataset
table_id dummy_table
```

1 change: 1 addition & 0 deletions pipeline/outputs/chronicle.md
@@ -34,6 +34,7 @@ Fluent Bit's Chronicle output plugin uses a JSON credentials file for authentica
| log\_type | The log type to parse logs as. Google Chronicle supports parsing for [specific log types only](https://cloud.google.com/chronicle/docs/ingestion/parser-list/supported-default-parsers). | |
| region | The GCP region in which to store security logs. Currently, there are several supported regions: `US`, `EU`, `UK`, `ASIA`. Blank is handled as `US`. | |
| log\_key | By default, the whole log record will be sent to Google Chronicle. If you specify a key name with this option, then only the value of that key will be sent to Google Chronicle. | |
| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

See Google's [official documentation](https://cloud.google.com/chronicle/docs/reference/ingestion-api) for further details.

23 changes: 1 addition & 22 deletions pipeline/outputs/cloudwatch.md
@@ -34,6 +34,7 @@ See [here](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b
| profile | Option to specify an AWS Profile for credentials. Defaults to `default` |
| auto\_retry\_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. This option defaults to `true`. |
| external\_id | Specify an external ID for the STS API, can be used with the role\_arn parameter if your role requires an external ID. |
| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. Default: `1`. |

## Getting Started

@@ -80,28 +81,6 @@ The following AWS IAM permissions are required to use this plugin:
}
```

### Worker support

Fluent Bit 1.7 adds a new feature called `workers` which enables outputs to have dedicated threads. This `cloudwatch_logs` plugin has partial support for workers in Fluent Bit 2.1.11 and prior. **2.1.11 and prior, the plugin can support a single worker; enabling multiple workers will lead to errors/indeterminate behavior.**
Starting from Fluent Bit 2.1.12, the `cloudwatch_logs` plugin added full support for workers, meaning that more than one worker can be configured.

Example:

```
[OUTPUT]
Name cloudwatch_logs
Match *
region us-east-1
log_group_name fluent-bit-cloudwatch
log_stream_prefix from-fluent-bit-
auto_create_group On
workers 1
```

If you enable workers, you are enabling one or more dedicated threads for your CloudWatch output.
We recommend starting with 1 worker, evaluating the performance, and then enabling more workers if needed.
For most users, the plugin can provide sufficient throughput with 0 or 1 workers.

### Log Stream and Group Name templating using record\_accessor syntax

Sometimes, you may want the log group or stream name to be based on the contents of the log record itself. This plugin supports templating log group and stream names using Fluent Bit [record\_accessor](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor) syntax.
1 change: 1 addition & 0 deletions pipeline/outputs/datadog.md
@@ -25,6 +25,7 @@ Before you begin, you need a [Datadog account](https://app.datadoghq.com/signup)
| dd_source | _Recommended_ - A human readable name for the underlying technology of your service (e.g. `postgres` or `nginx`). If unset, Datadog will look for the source in the [`ddsource` attribute](https://docs.datadoghq.com/logs/log_configuration/pipelines/?tab=source#source-attribute). | |
| dd_tags | _Optional_ - The [tags](https://docs.datadoghq.com/tagging/) you want to assign to your logs in Datadog. If unset, Datadog will look for the tags in the [`ddtags` attribute](https://docs.datadoghq.com/api/latest/logs/#send-logs). | |
| dd_message_key | By default, the plugin searches for the key 'log' and remaps its value to the key 'message'. If this property is set, the plugin will search for that key name instead. | |
| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

### Configuration File

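The configuration example itself is collapsed in this view; a hedged sketch of what a `datadog` output section with the new option might look like (the `apikey` value and `Workers 1` are illustrative placeholders):

```
[OUTPUT]
    Name    datadog
    Match   *
    apikey  <your-datadog-api-key>
    Workers 1
```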
2 changes: 1 addition & 1 deletion pipeline/outputs/elasticsearch.md
@@ -48,7 +48,7 @@ The **es** output plugin, allows to ingest your records into an [Elasticsearch](
| Trace\_Error | If Elasticsearch returns an error, print the Elasticsearch API request and response \(for diagnostics only\) | Off |
| Current\_Time\_Index | Use current time for index generation instead of message record | Off |
| Suppress\_Type\_Name | When enabled, the mapping type is removed and the `Type` option is ignored. Elasticsearch 8.0.0 and higher [no longer supports mapping types](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html), so this option should be set to On. | Off |
| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` |

> The parameters _index_ and _type_ can be confusing if you are new to Elastic. If you have used a common relational database before, they can be compared to the _database_ and _table_ concepts. Also see [the FAQ below](elasticsearch.md#faq)
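Since the `es` default is already `2`, an explicit `Workers` setting is only needed to change it. A hedged sketch (host, port, and index values are illustrative):

```
[OUTPUT]
    Name    es
    Match   *
    Host    192.168.2.3
    Port    9200
    Index   my_index
    Workers 2
```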
3 changes: 1 addition & 2 deletions pipeline/outputs/file.md
@@ -12,7 +12,7 @@ The plugin supports the following configuration parameters:
| File | Set file name to store the records. If not set, the file name will be the _tag_ associated with the records. |
| Format | The format of the file content. See also Format section. Default: out\_file. |
| Mkdir | Recursively create output directory if it does not exist. Permissions set to 0755. |
| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 1 |
| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` |

## Format

@@ -111,4 +111,3 @@ In your main configuration file append the following Input & Output sections:
Match *
Path output_dir
```

2 changes: 1 addition & 1 deletion pipeline/outputs/firehose.md
@@ -28,6 +28,7 @@ See [here](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b
| auto\_retry\_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. This option defaults to `true`. |
| external\_id | Specify an external ID for the STS API, can be used with the role_arn parameter if your role requires an external ID. |
| profile | AWS profile name to use. Defaults to `default`. |
| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. Default: `1`. |

## Getting Started

@@ -132,4 +133,3 @@ aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/
```

For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images).

4 changes: 2 additions & 2 deletions pipeline/outputs/flowcounter.md
@@ -9,6 +9,7 @@ The plugin supports the following configuration parameters:
| Key | Description | Default |
| :--- | :--- | :--- |
| Unit | The unit of duration. \(second/minute/hour/day\) | minute |
| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

## Getting Started

@@ -42,7 +43,7 @@ In your main configuration file append the following Input & Output sections:
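Those Input & Output sections are collapsed here; they presumably resemble this sketch (values assumed from the surrounding example, with the new `Workers` option shown for illustration):

```
[INPUT]
    Name cpu
    Tag  cpu

[OUTPUT]
    Name    flowcounter
    Match   *
    Unit    minute
    Workers 1
```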
Once Fluent Bit is running, you will see the reports in the output interface similar to this:

```bash
$ fluent-bit -i cpu -o flowcounter
$ fluent-bit -i cpu -o flowcounter
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
@@ -52,4 +53,3 @@ Fluent Bit v1.x.x
[2016/12/23 11:01:20] [ info] [engine] started
[out_flowcounter] cpu.0:[1482458540, {"counts":60, "bytes":7560, "counts/minute":1, "bytes/minute":126 }]
```

2 changes: 1 addition & 1 deletion pipeline/outputs/forward.md
@@ -23,7 +23,7 @@ The following parameters are mandatory for either Forward for Secure Forward mod
| Send_options | Always send options (with "size"=count of messages) | False |
| Require_ack_response | Send "chunk"-option and wait for "ack" response from server. Enables at-least-once and receiving server can control rate of traffic. (Requires Fluentd v0.14.0+ server) | False |
| Compress | Set to 'gzip' to enable gzip compression. Incompatible with `Time_as_Integer=True` and tags set dynamically using the [Rewrite Tag](../filters/rewrite-tag.md) filter. Requires Fluentd server v0.14.7 or later. | _none_ |
| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` |

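With a default of `2`, the `forward` output already uses dedicated worker threads; a hedged sketch of setting the value explicitly (host and port are illustrative):

```
[OUTPUT]
    Name    forward
    Match   *
    Host    127.0.0.1
    Port    24224
    Workers 2
```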
## Secure Forward Mode Configuration Parameters

1 change: 1 addition & 0 deletions pipeline/outputs/gelf.md
@@ -22,6 +22,7 @@ According to [GELF Payload Specification](https://go2docs.graylog.org/5-0/gettin
| Gelf_Level_Key | Key to be used as the log level. Its value must be in [standard syslog levels](https://en.wikipedia.org/wiki/Syslog#Severity_level) (between 0 and 7). (_Optional in GELF_) | level |
| Packet_Size | If transport protocol is `udp`, you can set the size of packets to be sent. | 1420 |
| Compress | If transport protocol is `udp`, you can set this if you want your UDP packets to be compressed. | true |
| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

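A hedged sketch of a `gelf` output with the new option (the Graylog host and `Workers 1` are illustrative; `Mode` and `Gelf_Short_Message_Key` are existing plugin options):

```
[OUTPUT]
    Name                   gelf
    Match                  *
    Host                   graylog.example.com
    Port                   12201
    Mode                   udp
    Gelf_Short_Message_Key log
    Workers                1
```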
### TLS / SSL

2 changes: 1 addition & 1 deletion pipeline/outputs/http.md
@@ -33,7 +33,7 @@ The **http** output plugin allows to flush your records into a HTTP endpoint. Fo
| gelf\_level\_key | Specify the key to use for the `level` in _gelf_ format | |
| body\_key | Specify the key to use as the body of the request (must prefix with "$"). The key must contain either a binary or raw string, and the content type can be specified using headers\_key (which must be passed whenever body\_key is present). When this option is present, each msgpack record will create a separate request. | |
| headers\_key | Specify the key to use as the headers of the request (must prefix with "$"). The key must contain a map, which will have the contents merged on the request headers. This can be used for many purposes, such as specifying the content-type of the data contained in body\_key. | |
| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` |

### TLS / SSL

