One multi-value metrics aggregation is the percentiles
aggregation (see the Elasticsearch documentation).
This aggregation is useful to show a distribution of values that do not follow a normal distribution. It's useful when the distribution follows a power law distribution. A practical example is finding the slowest response times of a web service, in particular to find response times above 95%, 99% etc. For this type of distribution the normal statistics do not offer useful insights, an average response time of 50ms
may hide the fact that there are outliers above 2.000ms
.
Let's build a new index with the following mapping that could be used to store logs from a web service.
✅ Create a new index named percentiles_aggs
with the following mapping.
curl -X PUT 'http://localhost:9200/percentiles_aggs' -H 'Content-Type: application/json' -d '{
"mappings": {
"properties": {
"response_time_in_ms": {
"type": "integer"
},
"status_code": {
"type": "keyword"
},
"url": {
"type": "text"
}
}
}
}'
The mapping contains the following fields:
url
using the field type textstatus_code
using field type keywordresponse_time_in_ms
using field type integer
Let's add a few documents that represent web service responses (same as in the stats aggregation example).
✅ Bulk upload documents to index percentiles_aggs
curl -H 'Content-Type: application/x-ndjson' -X POST 'http://localhost:9200/percentiles_aggs/_bulk' -d '
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 50 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 25 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 30 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 100 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 5 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 15 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 18 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 25 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/redirect", "status_code": 302, "response_time_in_ms": 25 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/2", "status_code": 201, "response_time_in_ms": 25 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/2", "status_code": 201, "response_time_in_ms": 35 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/2", "status_code": 201, "response_time_in_ms": 12 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/2", "status_code": 500 }
{"index":{"_index":"percentiles_aggs"}}
{"url": "http://example.com/2", "status_code": 500 }
'
See the Elasticsearch documentation on percentiles
aggregation.
✅ Build a percentiles
aggregation query on field response_time_in_ms
.
A solution
The following query uses a percentiles
aggregation named response_times
.
curl -X POST 'http://localhost:9200/stats_aggs/_search?pretty' -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"response_times": {
"percentiles": {
"field": "response_time_in_ms"
}
}
}
}'
The output looks like this:
{
...
"aggregations" : {
"response_times" : {
"values" : {
"1.0" : 5.0,
"5.0" : 5.700000000000001,
"25.0" : 16.5,
"50.0" : 25.0,
"75.0" : 32.5,
"95.0" : 94.99999999999993,
"99.0" : 100.0
}
}
}
}
How do you interpret these values?
The aggregation supports the field
percents
to provide a list of float values the distribution is grouped into, e.g.[25, 50, 95, 99, 99.9]
.