Adding Prometheus metrics framework (#460)

* Test Prometheus client with sample metrics
* Updated documentation and additional labels for model metrics
* Updated inference unit tests to check for metrics; added OpenAPI docs
* Addressed review comments
* Added Postman regression tests for metrics
* Updated connector types
* Bumped serving-sdk dependency
* Dropped metrics plugins support for now
* Fixed erroneous test cases
* Added flag to enable or disable the Prometheus metrics API
* Added missing file
* Doc changes and fixed typo
* Updated configuration.md: added a section for the new metrics-related params
* Updated metrics_api.md
* Updated MetricAggregator.java: fixed constant name

Co-authored-by: dhaniram-kshirsagar <[email protected]>
Co-authored-by: dhaniram kshirsagar <[email protected]>
1 parent 4a67a43 · commit b89d1ca · 41 changed files with 907 additions and 144 deletions
metrics_api.md (new file, 66 additions)
# Metrics API

The Metrics API listens on port 8082 and is accessible only from localhost by default. To change the default setting, see [TorchServe Configuration](configuration.md). The default metrics endpoint returns metrics in Prometheus format. You can query the metrics with curl requests, or point a [Prometheus server](#prometheus-server) at the endpoint and use [Grafana](#grafana) for dashboards.

The Metrics API is enabled by default. It can be disabled by setting `enable_metrics_api=false` in the TorchServe `config.properties` file. For details, refer to the [TorchServe configuration](configuration.md) docs.
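For example, disabling the API is a one-line change. A sketch of the relevant `config.properties` entry (other settings omitted):

```properties
# config.properties: disable the Prometheus metrics endpoint on port 8082
enable_metrics_api=false
```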
```console
curl http://127.0.0.1:8082/metrics

# HELP ts_inference_latency_microseconds Cumulative inference duration in microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noopversioned",model_version="1.11",} 1990.348
ts_inference_latency_microseconds{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noop",model_version="default",} 2032.411
# HELP ts_inference_requests_total Total number of inference requests.
# TYPE ts_inference_requests_total counter
ts_inference_requests_total{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noopversioned",model_version="1.11",} 1.0
ts_inference_requests_total{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noop",model_version="default",} 1.0
# HELP ts_queue_latency_microseconds Cumulative queue duration in microseconds
# TYPE ts_queue_latency_microseconds counter
ts_queue_latency_microseconds{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noopversioned",model_version="1.11",} 364.884
ts_queue_latency_microseconds{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noop",model_version="default",} 82.349
```

```console
curl "http://127.0.0.1:8082/metrics?name[]=ts_inference_latency_microseconds&name[]=ts_queue_latency_microseconds" --globoff

# HELP ts_inference_latency_microseconds Cumulative inference duration in microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noopversioned",model_version="1.11",} 1990.348
ts_inference_latency_microseconds{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noop",model_version="default",} 2032.411
# HELP ts_queue_latency_microseconds Cumulative queue duration in microseconds
# TYPE ts_queue_latency_microseconds counter
ts_queue_latency_microseconds{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noopversioned",model_version="1.11",} 364.884
ts_queue_latency_microseconds{uuid="d5f84dfb-fae8-4f92-b217-2f385ca7470b",model_name="noop",model_version="default",} 82.349
```
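The output above follows the Prometheus text exposition format (version 0.0.4): a `# HELP` line, a `# TYPE` line, then one sample per label set. As an illustration only — not TorchServe code, and using plain JDK classes rather than the Prometheus client library — a counter such as `ts_inference_requests_total` could be rendered by hand like this:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: hand-renders one counter in the Prometheus text
// exposition format (v0.0.4), mirroring what the /metrics endpoint returns.
public class CounterTextFormat {
    public static String render(
            String name, String help, Map<String, String> labels, double value) {
        StringBuilder sb = new StringBuilder();
        sb.append("# HELP ").append(name).append(' ').append(help).append('\n');
        sb.append("# TYPE ").append(name).append(" counter\n");
        sb.append(name).append('{');
        for (Map.Entry<String, String> e : labels.entrySet()) {
            // Each label is rendered as key="value", with a trailing comma,
            // matching the sample lines shown above.
            sb.append(e.getKey()).append("=\"").append(e.getValue()).append("\",");
        }
        sb.append("} ").append(value).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("model_name", "noop");
        labels.put("model_version", "default");
        System.out.print(
                render("ts_inference_requests_total",
                        "Total number of inference requests.", labels, 1.0));
    }
}
```

In practice the endpoint produces this text via the Prometheus Java client's `TextFormat.write004`, as shown in the request handler further down in this commit.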

#### Prometheus server

To view these metrics on a Prometheus server, download and install Prometheus using the instructions [here](https://prometheus.io/download/#prometheus). Create a minimal `prometheus.yml` config file as shown below, then run `./prometheus --config.file=prometheus.yml`.
```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'torchserve'
    static_configs:
      - targets: ['localhost:8082']  # TorchServe metrics endpoint
```
Navigate to http://localhost:9090/ in a browser to execute queries and create graphs.

<img width="1231" alt="PrometheusServer" src="https://user-images.githubusercontent.com/880376/86984450-806fc680-c143-11ea-9ae2-f2ef42f24f4c.png">

#### Grafana

Once the TorchServe and Prometheus servers are running, you can additionally [set up](https://prometheus.io/docs/visualization/grafana/) Grafana, point it at the Prometheus server, and navigate to http://localhost:3000/ to create dashboards and graphs.

You can start Grafana with:

`sudo systemctl daemon-reload && sudo systemctl enable grafana-server && sudo systemctl start grafana-server`

<img width="1220" alt="Screen Shot 2020-07-08 at 5 51 57 PM" src="https://user-images.githubusercontent.com/880376/86984550-c4fb6200-c143-11ea-9434-09d4d43dd6d4.png">
gradle.properties (1 addition)
```properties
org.gradle.daemon=true
org.gradle.jvmargs=-Xmx1024M
commons_cli_version=1.3.1
gson_version=2.8.5
prometheus_version=0.9.0
netty_version=4.1.50.Final
slf4j_api_version=1.7.25
slf4j_log4j12_version=1.7.25
testng_version=7.1.0
torchserve_sdk_version=0.0.3
```
frontend/server/src/main/java/org/pytorch/serve/http/PrometheusMetricsRequestHandler.java (new file, 69 additions)
```java
package org.pytorch.serve.http;

import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufOutputStream;
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.http.DefaultFullHttpResponse;
import io.netty.handler.codec.http.FullHttpRequest;
import io.netty.handler.codec.http.FullHttpResponse;
import io.netty.handler.codec.http.HttpHeaderNames;
import io.netty.handler.codec.http.HttpResponseStatus;
import io.netty.handler.codec.http.HttpVersion;
import io.netty.handler.codec.http.QueryStringDecoder;
import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.exporter.common.TextFormat;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import org.pytorch.serve.archive.ModelException;
import org.pytorch.serve.util.NettyUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PrometheusMetricsRequestHandler extends HttpRequestHandlerChain {

    private static final Logger logger =
            LoggerFactory.getLogger(PrometheusMetricsRequestHandler.class);

    /** Creates a new {@code PrometheusMetricsRequestHandler} instance. */
    public PrometheusMetricsRequestHandler() {
        // TODO: Add plugins manager support
    }

    @Override
    protected void handleRequest(
            ChannelHandlerContext ctx,
            FullHttpRequest req,
            QueryStringDecoder decoder,
            String[] segments)
            throws ModelException {
        if (segments.length >= 2 && "metrics".equals(segments[1])) {
            ByteBuf resBuf = Unpooled.directBuffer();
            // Optional name[] query parameters restrict which metric families are returned.
            List<String> params =
                    decoder.parameters().getOrDefault("name[]", Collections.emptyList());
            FullHttpResponse resp;
            try (OutputStream outputStream = new ByteBufOutputStream(resBuf);
                    Writer writer = new OutputStreamWriter(outputStream)) {
                TextFormat.write004(
                        writer,
                        CollectorRegistry.defaultRegistry.filteredMetricFamilySamples(
                                new HashSet<>(params)));
                resp =
                        new DefaultFullHttpResponse(
                                HttpVersion.HTTP_1_1, HttpResponseStatus.OK, resBuf);
            } catch (IOException e) {
                logger.error("Exception encountered while reporting metrics", e);
                throw new ModelException(e.getMessage(), e);
            }
            resp.headers().set(HttpHeaderNames.CONTENT_TYPE, TextFormat.CONTENT_TYPE_004);
            NettyUtils.sendHttpResponse(ctx, resp, true);
        } else {
            chain.handleRequest(ctx, req, decoder, segments);
        }
    }
}
```
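The handler above either serves `/metrics` itself or delegates to the next handler via `chain.handleRequest(...)`. A stripped-down, stdlib-only sketch of that chain-of-responsibility dispatch (hypothetical classes, not TorchServe's actual `HttpRequestHandlerChain`) might look like:

```java
// Illustrative sketch of chain-of-responsibility request dispatch:
// each handler either serves the request or passes it to the next handler.
abstract class HandlerChain {
    protected HandlerChain chain;

    HandlerChain setNext(HandlerChain next) {
        this.chain = next;
        return this;
    }

    abstract String handle(String[] segments);
}

class MetricsHandler extends HandlerChain {
    @Override
    String handle(String[] segments) {
        if (segments.length >= 2 && "metrics".equals(segments[1])) {
            return "200 metrics"; // this handler owns /metrics
        }
        return chain.handle(segments); // not ours; delegate down the chain
    }
}

class FallbackHandler extends HandlerChain {
    @Override
    String handle(String[] segments) {
        return "404"; // end of the chain: nothing matched
    }
}

public class ChainDemo {
    public static void main(String[] args) {
        HandlerChain root = new MetricsHandler().setNext(new FallbackHandler());
        System.out.println(root.handle(new String[] {"", "metrics"}));
        System.out.println(root.handle(new String[] {"", "models"}));
    }
}
```

This structure lets the Prometheus handler be inserted into an existing handler pipeline without the other handlers knowing about it.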
frontend/server/src/main/java/org/pytorch/serve/metrics/api/MetricAggregator.java (new file, 28 additions)
```java
package org.pytorch.serve.metrics.api;

import org.pytorch.serve.metrics.format.prometheous.PrometheusMetricManager;
import org.pytorch.serve.util.ConfigManager;

public final class MetricAggregator {

    private MetricAggregator() {}

    public static void handleInferenceMetric(final String modelName, final String modelVersion) {
        ConfigManager configMgr = ConfigManager.getInstance();
        if (configMgr.isMetricApiEnable()
                && configMgr.getMetricsFormat().equals(ConfigManager.METRIC_FORMAT_PROMETHEUS)) {
            PrometheusMetricManager.getInstance().incInferCount(modelName, modelVersion);
        }
    }

    public static void handleInferenceMetric(
            final String modelName, final String modelVersion, long timeInQueue, long inferTime) {
        ConfigManager configMgr = ConfigManager.getInstance();
        if (configMgr.isMetricApiEnable()
                && configMgr.getMetricsFormat().equals(ConfigManager.METRIC_FORMAT_PROMETHEUS)) {
            PrometheusMetricManager metrics = PrometheusMetricManager.getInstance();
            metrics.incInferLatency(inferTime, modelName, modelVersion);
            metrics.incQueueLatency(timeInQueue, modelName, modelVersion);
        }
    }
}
```
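`MetricAggregator` delegates the actual counting to `PrometheusMetricManager`, which must be safe to call from many worker threads at once. As a rough, hypothetical sketch of what such a manager has to do internally — not the real implementation, which uses the Prometheus Java client — a thread-safe per-model counter can be built from `ConcurrentHashMap` and `LongAdder`:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of a thread-safe metric store keyed by model name and
// version, illustrating the concurrency requirement on a metric manager.
public class SimpleMetricStore {
    private final ConcurrentMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    public void incInferCount(String modelName, String modelVersion) {
        // computeIfAbsent is atomic, so concurrent first increments for the
        // same model never lose updates; LongAdder scales under contention.
        counters.computeIfAbsent(modelName + ":" + modelVersion, k -> new LongAdder())
                .increment();
    }

    public long get(String modelName, String modelVersion) {
        LongAdder adder = counters.get(modelName + ":" + modelVersion);
        return adder == null ? 0L : adder.sum();
    }
}
```

The real Prometheus client offers the same guarantee through its `Counter` type, which is why `MetricAggregator` can simply call `incInferCount` from any request thread.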