This is the Data Prepper Elasticsearch sink plugin, which sends records to an Elasticsearch cluster via a REST client. You can use the sink to send data to Amazon Elasticsearch Service or Open Distro for Elasticsearch (ODFE).
The Elasticsearch sink should be configured as part of the Data Prepper pipeline YAML file.
pipeline:
  ...
  sink:
    elasticsearch:
      hosts: ["https://localhost:9200"]
      cert: path/to/cert
      username: YOUR_USERNAME_HERE
      password: YOUR_PASSWORD_HERE
      trace_analytics_raw: true
      dlq_file: /your/local/dlq-file
      bulk_size: 4
With trace_analytics_raw set to true, the elasticsearch sink will reserve otel-v1-apm-span-* as the index pattern and otel-v1-apm-span as the index alias for record ingestion.
pipeline:
  ...
  sink:
    elasticsearch:
      hosts: ["https://localhost:9200"]
      cert: path/to/cert
      username: YOUR_USERNAME_HERE
      password: YOUR_PASSWORD_HERE
      trace_analytics_service_map: true
      dlq_file: /your/local/dlq-file
      bulk_size: 4
With trace_analytics_service_map set to true, the elasticsearch sink will reserve otel-v1-apm-service-map as the index for record ingestion.
The elasticsearch sink can also be configured for an Amazon Elasticsearch Service domain. See security for details.
pipeline:
  ...
  sink:
    elasticsearch:
      hosts: ["https://your-amazon-elasticsearch-service-endpoint"]
      aws_sigv4: true
      cert: path/to/cert
      insecure: false
      trace_analytics_service_map: true
      bulk_size: 4
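The Amazon Elasticsearch Service example above relies on the default region and the default credentials. As a minimal sketch, the variant below sets the aws_region and aws_sts_role_arn options described in the configuration list, so that requests are signed for a domain in us-west-2 using an assumed role. The endpoint, the account ID and the role name are placeholders chosen for illustration, not values from this project.

```yaml
pipeline:
  # source and prepper settings omitted
  sink:
    elasticsearch:
      # Placeholder endpoint; replace with your own domain endpoint.
      hosts: ["https://your-amazon-elasticsearch-service-endpoint"]
      aws_sigv4: true
      # Region of the domain; us-east-1 is used when this is omitted.
      aws_region: us-west-2
      # Hypothetical role ARN; the default credentials are used when omitted.
      aws_sts_role_arn: "arn:aws:iam::123456789012:role/your-sink-role"
      trace_analytics_raw: true
      bulk_size: 4
```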
- hosts: A list of IP addresses of Elasticsearch nodes.
- cert (optional): A PEM-encoded CA certificate. Accepts both .pem and .crt files. This enables the client to trust the CA that has signed the certificate that ODFE is using. Default is null.
- aws_sigv4: A boolean flag to sign the HTTP request with AWS credentials. Only applies to Amazon Elasticsearch Service. See security for details. Defaults to false.
- aws_region: A String representing the region of the Amazon Elasticsearch Service domain, e.g. us-west-2. Only applies to Amazon Elasticsearch Service. Defaults to us-east-1.
- aws_sts_role_arn: An IAM role ARN that the sink plugin will assume to sign requests to Amazon Elasticsearch Service. If not provided, the plugin will use the default credentials.
- insecure: A boolean flag to turn off SSL certificate verification. If set to true, CA certificate verification will be turned off and insecure HTTP requests will be sent. Defaults to false.
- username (optional): A String containing the username of an internal user of the ODFE cluster. Default is null.
- password (optional): A String containing the password of an internal user of the ODFE cluster. Default is null.
- trace_analytics_raw (optional): A boolean flag indicating the APM trace analytics raw span data type, e.g.
  {
    "traceId":"bQ/2NNEmtuwsGAOR5ntCNw==",
    "spanId":"mnO/qUT5ye4=",
    "name":"io.opentelemetry.auto.servlet-3.0",
    "kind":"SERVER",
    "status":{},
    "startTime":"2020-08-20T05:40:46.041011600Z",
    "endTime":"2020-08-20T05:40:46.089556800Z",
    ...
  }
  Default value is false. Set it to true for raw span trace analytics. Set it to false for service map trace analytics.
- trace_analytics_service_map (optional): A boolean flag indicating the APM trace analytics service map data type, e.g.
  {
    "hashId": "aQ/2NNEmtuwsGAOR5ntCNwk=",
    "serviceName": "Payment",
    "kind": "Client",
    "target": {
      "domain": "Purchase",
      "resource": "Buy"
    },
    "destination": {
      "domain": "Purchase",
      "resource": "Buy"
    },
    "traceGroupName": "MakePayement.auto"
  }
  Default value is false. Set it to true for service map trace analytics. Set it to false for raw span trace analytics.
- index: A String used as the index name for a custom data type. Applicable and required only if both trace_analytics_raw and trace_analytics_service_map are set to false.
- template_file (optional): A JSON file path to be read as the index template for custom data ingestion. The JSON file content should be the JSON value of the "template" key in the JSON content of the Elasticsearch index templates API, e.g. otel-v1-apm-span-index-template.json.
- dlq_file (optional): A String containing the absolute file path for DLQ failed output records. Defaults to null. If not provided, failed records will be written into the default data-prepper log file (logs/Data-Prepper.log).
- bulk_size (optional): A long specifying the size of bulk requests in MB. Defaults to 5 MB. If set to less than 0, all the records received from the upstream prepper at a time will be sent as a single bulk request. If a single record turns out to be larger than the set bulk size, it will be sent as a bulk request of a single document.
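The options above also cover custom data ingestion, where neither trace analytics flag is set. The following is a minimal sketch of such a configuration, assuming a hypothetical index name (my-custom-index) and placeholder file paths; only the option names come from the list above.

```yaml
pipeline:
  # source and prepper settings omitted
  sink:
    elasticsearch:
      hosts: ["https://localhost:9200"]
      cert: path/to/cert
      username: YOUR_USERNAME_HERE
      password: YOUR_PASSWORD_HERE
      # Both flags default to false; shown here only to make the mode explicit.
      trace_analytics_raw: false
      trace_analytics_service_map: false
      # Hypothetical index name; required when both flags above are false.
      index: my-custom-index
      # Placeholder path to a JSON index template for the custom index.
      template_file: /your/local/index-template.json
      # Failed records go here instead of the default data-prepper log file.
      dlq_file: /your/local/dlq-file
      # Bulk request size in MB.
      bulk_size: 4
```

Per the bulk_size description, replacing 4 with a negative value such as -1 would send all records received from the upstream prepper at a time as a single bulk request.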
Besides the common metrics in AbstractSink, the elasticsearch sink introduces the following custom metrics.
- bulkRequestLatency: measures the latency of sending each bulk request, including retries.
- bulkRequestErrors: measures the number of errors encountered in sending bulk requests.
- documentsSuccess: measures the number of documents successfully sent to ES by bulk requests, including retries.
- documentsSuccessFirstAttempt: measures the number of documents successfully sent to ES by bulk requests on the first attempt.
- documentErrors: measures the number of documents that failed to be sent by bulk requests.
This plugin is compatible with Java 8. See