The stats metrics aggregation is a simple aggregation that computes a few statistics over numeric values. These values can be either extracted from numeric fields or generated by a script. See the Elasticsearch documentation on the stats metric aggregation.
The following stats are returned:
- min - minimum value
- max - maximum value
- sum - sum of all values
- count - number of extracted values
- avg - average (mean) value
Let's build a new index with a mapping that could be used to store logs from a web service.
✅ Create a new index named stats_aggs with the following mapping.
curl -X PUT 'http://localhost:9200/stats_aggs' -H 'Content-Type: application/json' -d '{
"mappings": {
"properties": {
"response_time_in_ms": {
"type": "integer"
},
"status_code": {
"type": "keyword"
},
"url": {
"type": "text"
}
}
}
}'
For the purpose of the examples, let's assume this index stores status_code, url and response_time_in_ms.
The mapping contains the following fields:
- url, using the field type text
- status_code, using the field type keyword
- response_time_in_ms, using the field type integer
Let's add a few documents that represent web service responses.
✅ Bulk upload documents to index stats_aggs
curl -H 'Content-Type: application/x-ndjson' -X POST 'http://localhost:9200/stats_aggs/_bulk' -d '
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 50 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 25 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 30 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 100 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 5 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 15 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 18 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/1", "status_code": 200, "response_time_in_ms": 25 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/redirect", "status_code": 302, "response_time_in_ms": 25 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/2", "status_code": 201, "response_time_in_ms": 25 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/2", "status_code": 201, "response_time_in_ms": 35 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/2", "status_code": 201, "response_time_in_ms": 12 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/2", "status_code": 500 }
{"index":{"_index":"stats_aggs"}}
{"url": "http://example.com/2", "status_code": 500 }
'
✅ Build a stats aggregation query on field response_time_in_ms.
A solution
The following query uses a stats aggregation named response_stats.
curl -X POST 'http://localhost:9200/stats_aggs/_search?pretty' -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"response_stats": {
"stats": {
"field": "response_time_in_ms"
}
}
}
}'
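As a cross-check, the expected values can be computed locally from the sample documents. The sketch below is not part of Elasticsearch: local_stats is a hypothetical helper that reads one number per line and prints the same five statistics. Two of the uploaded documents (status code 500) have no response_time_in_ms, so the aggregation should only see 12 values.

```shell
# Hypothetical helper (not an Elasticsearch tool): reads one number per line
# from stdin and prints count, min, max, sum and avg.
local_stats() {
  awk '
    { v = $1 + 0 }                       # force numeric interpretation
    NR == 1 { min = v; max = v }
    { if (v < min) min = v; if (v > max) max = v; sum += v }
    END {
      if (NR > 0)
        printf "count=%d min=%d max=%d sum=%d avg=%.2f\n", NR, min, max, sum, sum / NR
    }
  '
}

# The 12 response_time_in_ms values from the bulk upload above.
printf '%s\n' 50 25 30 100 5 15 18 25 25 25 35 12 | local_stats
# prints: count=12 min=5 max=100 sum=365 avg=30.42
```

If the count in the Elasticsearch response differs from 12, the documents may not have been indexed or refreshed yet.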
What are good use cases for the stats aggregation?
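One possible use case, given as a sketch rather than a definitive answer, is summarising response times per status code to spot slow classes of responses. The example assumes the stats_aggs index from above. The helper names stats_by_status_query and run_stats_by_status and the aggregation name by_status are made up for illustration. The query nests a stats aggregation inside a terms aggregation on status_code.

```shell
# Hypothetical helper: prints the request body for a per-status-code summary.
stats_by_status_query() {
  cat <<'EOF'
{
  "size": 0,
  "aggs": {
    "by_status": {
      "terms": { "field": "status_code" },
      "aggs": {
        "response_stats": {
          "stats": { "field": "response_time_in_ms" }
        }
      }
    }
  }
}
EOF
}

# Hypothetical helper: sends the query to the local node used in this tutorial.
run_stats_by_status() {
  curl -s -X POST 'http://localhost:9200/stats_aggs/_search?pretty' \
    -H 'Content-Type: application/json' \
    -d "$(stats_by_status_query)"
}

# Usage, with Elasticsearch running on localhost:9200:
#   run_stats_by_status
```

With the sample data, the bucket for status code 500 should report a count of 0, because those documents carry no response_time_in_ms.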
Similarly, there is an extended stats aggregation; see the Elasticsearch documentation on the extended stats aggregation. It works the same way but returns additional statistics for the numeric field.
✅ Build the same aggregation using extended_stats on field response_time_in_ms.
A solution
The following query uses an extended_stats aggregation named response_stats.
curl -X POST 'http://localhost:9200/stats_aggs/_search?pretty' -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"response_stats": {
"extended_stats": {
"field": "response_time_in_ms"
}
}
}
}'
The response contains further statistical information, such as the sum of squares, variance and standard deviation.
Depending on the Elasticsearch version, the extended_stats aggregation can return different fields (ES 7.8 vs 7.9), so check the documentation for the version you are running.
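To make the extra fields concrete, the sketch below computes the sum of squares, variance and standard deviation locally for the same 12 sample values. It uses the population formulas (dividing by the number of values); that choice is an assumption, since newer Elasticsearch versions may report both population and sampling variants. local_extended_stats is a hypothetical helper, not an Elasticsearch tool.

```shell
# Hypothetical helper (not an Elasticsearch tool): reads one number per line
# from stdin and prints sum of squares, population variance and standard deviation.
local_extended_stats() {
  awk '
    { v = $1 + 0; sum += v; sumsq += v * v }
    END {
      if (NR > 0) {
        mean = sum / NR
        var = sumsq / NR - mean * mean   # population variance
        if (var < 0) var = 0             # guard against floating point rounding
        printf "sum_of_squares=%d variance=%.2f std_deviation=%.2f\n", sumsq, var, sqrt(var)
      }
    }
  '
}

# The 12 response_time_in_ms values from the sample documents.
printf '%s\n' 50 25 30 100 5 15 18 25 25 25 35 12 | local_extended_stats
# prints: sum_of_squares=17843 variance=561.74 std_deviation=23.70
```

Comparing these numbers with the variance and std_deviation fields in the response is a quick way to see which formula your Elasticsearch version reports under which field name.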