Collect telemetry metrics from Triton metrics endpoint #26
Nice job with this.
I want to go through some of it with you but this is a great step towards wrapping this up and getting this out to our users. Thanks for working so hard on this.
Excellent work, Harshini!
Just wait to merge until @nv-hwoo closes his comments.
Nice work!
Awesome work @lkomali! Thanks for adding such a detailed doc string 👍 And sorry for getting back so late 😅 All looks good, with just one small comment about the optional argument for TelemetryDataCollector in the run function.
* Collect telemetry metrics from Triton metrics endpoint
* Remove one of the print statements
* Fix comments
* Fix pre-commit errors
* Fix test errors
* Add unit tests and fix code
* Fix pre-commit error
* Fix codeql warnings
* Fix comments
This PR is a part of adding telemetry metrics to GenAI-perf.
Changes implemented in this PR:
This is how the metrics are stored.
For every metric, the values are stored as a list of lists.
The outer list represents the sequence of metric measurements over time.
Each inner list contains the metric values for each GPU at a particular time point.
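The layout described above can be illustrated with a minimal sketch. This is not the code from this PR: the class name `MetricStore`, its methods, and the metric name `gpu_power_usage` are hypothetical and chosen only to show the list-of-lists shape, where each appended inner list is one time point holding one value per GPU.

```python
from collections import defaultdict
from typing import Dict, List


class MetricStore:
    """Hypothetical container illustrating the list-of-lists layout."""

    def __init__(self) -> None:
        # metric name -> outer list over time, inner list over GPUs
        self.metrics: Dict[str, List[List[float]]] = defaultdict(list)

    def record(self, snapshot: Dict[str, List[float]]) -> None:
        """Append one measurement (one value per GPU) for each metric."""
        for name, per_gpu_values in snapshot.items():
            self.metrics[name].append(list(per_gpu_values))

    def series_for_gpu(self, name: str, gpu_index: int) -> List[float]:
        """Return the values of one metric for a single GPU over time."""
        return [sample[gpu_index] for sample in self.metrics[name]]


store = MetricStore()
# Two measurements taken at successive time points on a two-GPU machine.
store.record({"gpu_power_usage": [60.5, 61.0]})
store.record({"gpu_power_usage": [62.0, 63.5]})
```

With this shape, `store.metrics["gpu_power_usage"][t][g]` is the value for GPU `g` at time point `t`, and `series_for_gpu` slices out a single GPU's history.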