Is your feature request related to a problem? Please describe.
In our multi-tenant setup, teams frequently use high-cardinality Loki labels and do not follow labelling best practices.
Describe the solution you'd like
Label cardinality data is already exposed in logcli via something like logcli series '{}' --since=1h --analyze-labels, but my proposal is to expose this data under a metrics endpoint.
It allows:
Users to quickly view this data in a dashboard and adjust their labelling practices
A historic view on cardinality counts
Alerting on high cardinality counts, for either us (the platform team) or the users
I think providing some out-of-the-box functionality (rather than having users logcli their way to this data) would benefit both users and platform teams when monitoring usage.
Describe alternatives you've considered
Writing our own service that calls logcli, parses the output and exposes it as metrics. This isn't hard to do, but if there's an opportunity to contribute to the upstream project for others, why not?
Even just an option to emit the output of --analyze-labels as JSON would be an improvement, since it would be much easier to parse.
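To make the "own service" alternative a bit more concrete, here is a rough Python sketch of the aggregation step only. It assumes the service has already fetched a list of stream label sets as JSON (for example from Loki's HTTP series API rather than by parsing logcli text output, which is an assumption on my part). The function name and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set


def count_label_values(series: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Return the number of unique values seen for each label name."""
    values: Dict[str, Set[str]] = defaultdict(set)
    for label_set in series:
        for name, value in label_set.items():
            values[name].add(value)
    return {name: len(seen) for name, seen in values.items()}


# Hypothetical sample: one dict of labels per stream.
sample_series = [
    {"app": "api", "pod": "api-1"},
    {"app": "api", "pod": "api-2"},
    {"app": "web", "pod": "web-1"},
]
label_counts = count_label_values(sample_series)
```

With this sample, pod has more unique values than app, which is the kind of signal --analyze-labels surfaces today.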
Just to give some more thought on what I'm looking to do: create a counter loki_stream_label_value_count with label_name as a label and the number of unique values as the value. I haven't really looked at where it would make sense to do this in Loki itself, but if I were running it as a standalone service I'd probably just query in 5-minute intervals and aggregate there. My only concern is that this could itself contribute to the cardinality issue if there are many label names, which is why it might make more sense to just expose an endpoint, like Mimir does, and use some JSON-parsing datasource to visualise this data in Grafana.
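As a sketch of what the proposed metric could look like on the wire, the snippet below renders per-label counts in the Prometheus text exposition format. The metric and label names come from the comment above; the helper names and the HELP text are hypothetical. It declares the metric as a gauge rather than a counter, which is my assumption, since the number of unique values can go down between intervals.

```python
from typing import Mapping

METRIC_NAME = "loki_stream_label_value_count"


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline for a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(counts: Mapping[str, int]) -> str:
    """Render one sample per label name, sorted for stable output."""
    lines = [
        f"# HELP {METRIC_NAME} Number of unique values per stream label name.",
        f"# TYPE {METRIC_NAME} gauge",  # assumption: gauge, not counter
    ]
    for name in sorted(counts):
        escaped = escape_label_value(name)
        lines.append(f'{METRIC_NAME}{{label_name="{escaped}"}} {counts[name]}')
    return "\n".join(lines) + "\n"
```

Note that this produces one time series per label name, so the cardinality concern above still applies when tenants use many distinct label names.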