[elastic_agent] Add additional output metrics to dashboards #8834
- beat.stats.libbeat.output.events.failed
- beat.stats.libbeat.output.events.toomany
- beat.stats.libbeat.output.events.dropped
- beat.stats.libbeat.output.batches.split

All four of these are useful to see if your agent is having non-fatal errors when using the output, which can result in degraded performance.
ea690d4
to
9120e2f
Compare
Pinging @elastic/elastic-agent (Team:Elastic-Agent)
A few comments even though I approve:
- I find it a bit weird that they are all 0 if you are getting data in. Was it possible to test that the graphs are actually reading the stats correctly?
- Are these metrics from elastic_agent? You only modified the elastic_agent_metrics data stream, but there are other metrics data streams inside the elastic_agent package, like filebeat_metrics etc.
- I would add a filter to exclude (empty) from these. There should be an option to exclude null values, I believe?

Thanks for catching this. These changes should be applied to each individual Beat metrics data stream. The agent itself doesn't have output metrics.
See above, these should be on the Beat metrics data streams.
@leehinman thanks for these dashboards. Is there any chance we could also add the output latency metrics (histogram) that were recently added (elastic/beats#37258)? If not, I can create a new issue. Thanks.
Yep, I'll add it. When I started, the latency histogram wasn't done yet.
These are error conditions: dropped, failed, toomany and split. Under normal operation they should all be 0. If they go up and then return to 0, you have a transient error you should look at. If they stay up, you have a persistent problem to investigate. I'll see if I can induce some errors to get these populated.
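The reading of these panels described above can be sketched in a few lines of Python. This is only an illustration, not part of the PR: the function name `classify_error_series` and its labels are hypothetical, and it assumes the input is the sequence of per-interval values that a panel would plot for one of the four metrics, oldest first.

```python
from typing import Sequence


def classify_error_series(values: Sequence[float]) -> str:
    """Classify per-interval values of one output error metric.

    Hypothetical helper: `values` is assumed to hold what a dashboard
    panel would plot for the metric, ordered oldest to newest.
    """
    # No samples, or every sample is zero: normal operation.
    if not values or all(v == 0 for v in values):
        return "healthy"
    # Errors showed up at some point but the latest sample is back to zero.
    if values[-1] == 0:
        return "transient"
    # Errors are still present in the latest sample.
    return "persistent"


if __name__ == "__main__":
    print(classify_error_series([0, 0, 0, 0]))
    print(classify_error_series([0, 3, 5, 0]))
    print(classify_error_series([0, 2, 4, 4]))
```

A real check would likely want a minimum duration before calling something persistent; the sketch only looks at the most recent sample.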
So this brings up an interesting question. I cloned the existing "Total events rate /s" panel. Does this mean that the Total events rate doesn't include everything? They both use the filter
I didn't see a filter to exclude empty values in Lens, but I did find a visual option to use a straight line when there is no data, which looks more like what I would expect.
Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as
Hi! This PR has been stale for a while and we're going to close it as part of our cleanup procedure. We appreciate your contribution and would like to apologize if we have not been able to review it, due to the current heavy load of the team. Feel free to re-open this PR if you think it should stay open and is worth rebasing. Thank you for your contribution!
Proposed commit message
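Add dashboards for the following metrics to the [Elastic Agent] Agent metrics dashboard:
- beat.stats.libbeat.output.events.failed
- beat.stats.libbeat.output.events.toomany
- beat.stats.libbeat.output.events.dropped
- beat.stats.libbeat.output.batches.split

All four of these are useful to see if your agent is having non-fatal errors when using the output, which can result in degraded performance.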
Checklist
changelog.yml file.
Author's Checklist
How to test this PR locally
cd packages/elastic_agent
elastic-package stack up --version <8.11.2 or greater> -d
View the [Elastic Agent] Agent metrics dashboard.
Related issues
Screenshots