test: Test and document histogram latency metrics #7694
Conversation
Force-pushed the … DLIS-7383-yinggeh-metrics-standardization-TTFT branch from d663da5 to e084067
Make sure to rebase on main, which has the compute capability fix merged.
qa/L0_metrics/ensemble_decoupled/async_execute_decouple/1/model.py
Force-pushed the … DLIS-7383-yinggeh-metrics-standardization-TTFT branch from 9f5b2ab to aa9ddc7
Nice work adding a more comprehensive test!
Only two things left (one way to check the bucket half is sketched just below this list):
- Verify that the number of responses returned by each inference matches the expected count.
- Verify the histogram bucket key/value pairs after each inference.
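A minimal sketch of the bucket check, based on parsing the Prometheus exposition text. The metric name comes from this PR, but the sample text, labels, and bucket boundaries are illustrative assumptions rather than the PR's actual fixtures:

```python
import re

# Sample exposition text as it might look after two inferences; labels
# and bucket boundaries here are assumptions for illustration.
SAMPLE = """
nv_inference_first_response_histogram_ms_bucket{model="m",version="1",le="100"} 1
nv_inference_first_response_histogram_ms_bucket{model="m",version="1",le="500"} 2
nv_inference_first_response_histogram_ms_bucket{model="m",version="1",le="+Inf"} 2
nv_inference_first_response_histogram_ms_count{model="m",version="1"} 2
nv_inference_first_response_histogram_ms_sum{model="m",version="1"} 250
"""

def histogram_buckets(text, metric):
    """Return {le: cumulative_count} parsed from Prometheus exposition text."""
    pattern = rf'{metric}_bucket{{[^}}]*le="([^"]+)"}} (\d+)'
    return {le: int(v) for le, v in re.findall(pattern, text)}

buckets = histogram_buckets(SAMPLE, "nv_inference_first_response_histogram_ms")
counts = list(buckets.values())
assert counts == sorted(counts)  # cumulative bucket counts never decrease
assert counts[-1] == 2           # the "+Inf" bucket must equal _count
```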
```python
# Histograms
def test_inf_histograms_decoupled_exist(self):
    metrics = self._get_metrics()
    for metric in INF_HISTOGRAM_DECOUPLED_PATTERNS:
        for suffix in ["_count", "_sum", ""]:
            self.assertIn(metric + suffix, metrics)
```
Nice work adding checks that the count, sum, and bucket metrics exist!
This does not check whether the value of each bucket is correct, but as you mentioned, an existing test already covers that: it verifies that the Prometheus histogram metrics function correctly, while the tests in histogram_metrics_test.py verify, via count and sum, that the numbers fed to Prometheus are correct. So the values are intentionally not re-verified here.
Since this test only checks that specific histograms exist, without updating the metrics, all values should be 0 (see the exposition example below). For tests of the Prometheus metrics APIs and their functionality, please refer to
https://github.com/triton-inference-server/core/blob/main/src/test/metrics_api_test.cc
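For reference, a never-observed Prometheus histogram still exports every series, each at zero. A small self-contained sketch; the bucket boundaries and labels are illustrative assumptions, not Triton's actual defaults:

```python
# Exposition text of an untouched histogram: every series exists and
# every value is 0. Boundaries and labels are assumptions for illustration.
EXPECTED_ZERO = """\
nv_inference_first_response_histogram_ms_bucket{model="m",version="1",le="100"} 0
nv_inference_first_response_histogram_ms_bucket{model="m",version="1",le="+Inf"} 0
nv_inference_first_response_histogram_ms_count{model="m",version="1"} 0
nv_inference_first_response_histogram_ms_sum{model="m",version="1"} 0
"""

# An existence-plus-zero check in the spirit of the test above.
for line in EXPECTED_ZERO.strip().splitlines():
    series, value = line.rsplit(" ", 1)
    assert float(value) == 0.0, f"{series} has been updated unexpectedly"
```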
# Prometheus histogram buckets are tested in metrics_api_test.cc::HistogramAPIHelper

Just a nit for clarification: the Metrics API tests (e.g. TRITONSERVER_MetricObserve) don't go through the same code path as the built-in latency metrics. So although there are custom-metrics tests exercising histograms, they aren't necessarily testing the built-in histogram latency metrics added here.
The built-in metrics today just use Prometheus APIs directly, plus some C++ helper functions around them. Ideally, we would unify these so that built-in metrics and custom metrics both use the TRITONSERVER_Metric layer in the same way, for easier test coverage and maintenance in the future.
All that being said, I think the current tests included here, with the checks around the _sum value, are fine for this PR. When adding other histogram latency metrics, it would be a good idea to either (a) add some bucket-related tests to these Python unit tests (one possible shape is sketched below) or (b) unify the internal metrics to use the same custom metrics APIs, or the internal C++ classes around them, for better re-use of the metrics_api_test.cc tests.
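As a concrete illustration of option (a), a hedged sketch of a bucket-level assertion that could be added to the Python unit tests. The latency value, labels, and sample text are assumptions; only the metric name comes from this PR:

```python
import re

def check_observation_landed(metrics_text, metric, latency_ms):
    """After a single observation of latency_ms, every bucket whose upper
    bound is >= latency_ms should report a cumulative count of 1."""
    pattern = rf'{metric}_bucket{{[^}}]*le="([^"]+)"}} (\d+)'
    for le, count in re.findall(pattern, metrics_text):
        bound = float("inf") if le == "+Inf" else float(le)
        expected = 1 if bound >= latency_ms else 0
        assert int(count) == expected, f"bucket le={le}: {count} != {expected}"

# Illustrative exposition text after one request whose first response
# arrived in ~250 ms (model name and boundaries are assumptions).
sample = (
    'nv_inference_first_response_histogram_ms_bucket{model="m",le="100"} 0\n'
    'nv_inference_first_response_histogram_ms_bucket{model="m",le="500"} 1\n'
    'nv_inference_first_response_histogram_ms_bucket{model="m",le="+Inf"} 1\n'
)
check_observation_landed(sample, "nv_inference_first_response_histogram_ms", 250)
```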
```bash
cp ../python_models/${decoupled_model_name}/model.py ${MODELDIR}/${decoupled_model_name}/1/
cp ../python_models/${decoupled_model_name}/config.pbtxt ${MODELDIR}/${decoupled_model_name}/

SERVER_ARGS="${BASE_SERVER_ARGS} --load-model=${decoupled_model_name}"
```
Nit for a future follow-up (not this PR): a lot of this could probably be condensed with a for loop.
Nice work! Only a minor comment on metrics.md for this PR, but LGTM otherwise.
What does the PR do?
The PR adds tests for the histogram metrics and documents the new nv_inference_first_response_histogram_ms metric, which can be enabled or disabled with --metrics-config histogram_latencies=<bool>. A quick way to confirm the metric is exposed is sketched below.
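A hedged sketch of such a check; port 8002 is Triton's default metrics port, and everything else is an illustrative assumption:

```python
# Confirm the new histogram is exported when the server was started with
# --metrics-config histogram_latencies=true (sketch; the endpoint and the
# flag value are assumptions beyond what this PR states).
import urllib.request

text = urllib.request.urlopen("http://localhost:8002/metrics").read().decode()
metric = "nv_inference_first_response_histogram_ms"
for suffix in ("_count", "_sum", "_bucket"):
    assert metric + suffix in text, f"{metric}{suffix} not exported"
```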
Checklist
- PR title follows the <commit_type>: <Title> format
Commit Type: test
Related PRs:
triton-inference-server/core#396
Where should the reviewer start?
Check the implementation PR first.
Test plan:
L0_metrics--base
L0_response_cache--base
CI Pipeline ID: 19614087
Background
Standardizing Large Model Server Metrics in Kubernetes