Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(prometheus) add wasmx metrics #13681

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

casimiro
Copy link
Contributor

@casimiro casimiro commented Sep 16, 2024

Summary

Checklist

  • The Pull Request has tests
  • A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
  • There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

Fix #[issue number]

@github-actions github-actions bot added core/proxy plugins/prometheus cherry-pick kong-ee schedule this PR for cherry-picking to kong/kong-ee core/wasm Everything relevant to [proxy-]wasm labels Sep 16, 2024
@team-gateway-bot team-gateway-bot added the author/community PRs from the open-source community (not Kong Inc) label Sep 16, 2024
@casimiro casimiro added the pr/wip A work in progress PR opened to receive feedback label Sep 16, 2024
@casimiro casimiro force-pushed the feat/prometheus-wasmx-metrics branch 3 times, most recently from cf4e4c3 to d241e68 Compare September 27, 2024 17:48
@casimiro casimiro force-pushed the feat/prometheus-wasmx-metrics branch 2 times, most recently from 9d62b9a to 37604f7 Compare October 9, 2024 21:51
@casimiro casimiro force-pushed the feat/prometheus-wasmx-metrics branch from 37604f7 to b60e5bb Compare October 9, 2024 22:25
@@ -402,6 +402,8 @@ local function rebuild_state(db, version, old_state)

for _, filter in ipairs(chain.filters) do
if filter.enabled then
_M.filters_by_name[filter.name].config = cjson_decode(filter.config) or filter.config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proxy-wasm spec for metrics doesn't provide a way for filter developers to define a metric associated with a set of labels.

The function proxy_define_metric expects only a metric type and a string representing the metric name.

As a result, filter developers include whatever label associated with a metric as part of its name, e.g. a_metric_label1="label_value"_label2="label_value".

There isn't however a consensus over how to represent labels in metric names.

The functionality of parsing labels proposed and implemented in the function parse_pw_key below, reflects the one available in Envoy and used by Coraza WAF Proxy-wasm filter.

It expects filter developers to provide a set of regex patterns matching labels in a metric's name, as part of the filter configuration:

    label_patterns = {
      { label = "service", pattern = "(_s_id=([0-9a-z%-]+))" },
      { label = "route", pattern = "(_r_id=([0-9a-z%-]+))" },
    }

Then, during serialization, those patterns are match with metric names to extract potential labels.

As such, we need to have access to the filter configuration when serializing Proxy-Wasm metrics.

@@ -1,11 +1,14 @@
local balancer = require "kong.runloop.balancer"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: we should probably try to be consistent with the require syntax, in this file it seems we are using round brackets around quotes all the time i.e. here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I noticed the inconsistencies in this module but I opted for not addressing them to keep the PR simple. I think it might be a good idea to address those inconsistencies and style issues collectively in a separate PR -- which would be easier to review.

@@ -517,6 +520,7 @@ local function metric_data(write_fn)
-- notify the function if prometheus plugin is enabled,
-- so that it can avoid exporting unnecessary metrics if not
prometheus:metric_data(write_fn, not IS_PROMETHEUS_ENABLED)
wasm.metric_data()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be conditional to something, i.e. kong.configuration.wasm being enabled? Or, alternatively, should the kong.plugins.prometheus.wasmx module's functions be made noop when wasm is disabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function metric_data() in kong.plugins.prometheus.wasmx returns early if support for wasm isn't enabled in the gateway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I left the check for wasm support within metric_data() to keep the changes in existing code to a minimum.

for metric_by_label in sorted_pairs(parsed) do
buf:put(serialize_metric(metric_by_label, buf))

i = i + 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: looks like i is only used here and declared outside, perhaps we could have sorted_iter return index and value and use the index here instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!


table_sort(sorted_keys)

return sorted_iter, { t = t, sorted_keys = sorted_keys, i = 1 }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the standard signature of an iterator function should take the index as the second parameter, after the context. Is there a reason for having i inside the context table here? (It works regardless, just wondering if it's just a style choice).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point.

It wasn't a deliberate choice, no. I think I misinterpreted that section of the book and oversaw the part describing the index of the iteration as a control variable and not part of the invariant state of the loop -- which is logical.

Thanks for mentioning it. I'll change the code accordingly.

local _M = {}


local FLUSH_EVERY = 100
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be configurable or is there a reason for always using 100 in particular?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100 was chosen rather arbitrarily. I haven't tested different values and picked the optimum one.

The reasoning was rather simplistic. By default ngx_wasm_module dedicates 5m of shared memory to metrics storage; which is enough to store ~ 30k counters/gauges.

Flushing every 100 metrics implicates that we write to the HTTP response 3k times -- which sounded like a good balance between IO serialization and memory consumption.

I think exposing FLUSH_EVERY as configurable parameter without giving users context and numbers to guide their decision would be more burdensome than helpful.

local label_k = str_sub(label_kv, 0, str_find(label_kv, "="))
local label_k_start, _ = str_find(key, label_k)

first_label = (label_k_start < first_label) and label_k_start or first_label
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this more of a first_label_start_index than a first_label?

end


local function parse_pw_key(key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling a bit with this function, seems like key is a string that starts with pw. with sections that are separated by dots, would it be possible to add an example of what is expected here as a code comment?

At first glance it feels like it could be simplified a bit either using regex or splitting by dots - maybe, but I have the feeling this was purposely written to maximise performance over readability?

@flrgh flrgh self-requested a review October 16, 2024 16:39
@@ -402,6 +402,8 @@ local function rebuild_state(db, version, old_state)

for _, filter in ipairs(chain.filters) do
if filter.enabled then
_M.filters_by_name[filter.name].config = cjson_decode(filter.config) or filter.config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may exist more than one filter by the same name (and likely with differing configuration).

Comment on lines +52 to +54
local filter_config = wasm.filters_by_name[filter_name].config or {}
local patterns = filter_config.pw_metrics
and filter_config.pw_metrics.label_patterns or {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel iffy about intermingling the filter's (http) configuration with the filter's metric information.

Would it be possible to use filter metadata (the ${filter}.meta.json file) for this? For instance, currently we're using filter metadata to allow developers to provide JSON schema for filters with JSON configurations. Metric data seems to me like a comparable use case:

{
  "config_schema": {
    "type": "object",
    "properties": { ... }
  },
  "metrics": {
    "label_patterns": [
      { "label": "service", "pattern": "(_s_id=([0-9a-z%-]+))" },
      { "label": "route",   "pattern": "(_r_id=([0-9a-z%-]+))" }
    ]
  }
}

I still need to spend some more time grokking the new metrics feature, but lemme know what you think in the meantime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is possible. I totally oversaw the metadata option. Thanks for bringing that up.
In fact, the filter metadata is a more appropriate place for storing something that will be common to all filter instances.

local ngx_say = ngx.say


local _M = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first impression I had from seeing this file was that this looks pretty tightly coupled to the implementation details of ngx_wasm_module's metrics storage (well, and also to Prometheus, I suppose). Is it not feasible to put this code into resty.wasmx.* with a more simplified API that Kong gateway can use to register/inspect/collect metric [meta]data?

@@ -24,9 +24,9 @@ ATC_ROUTER=ffd11db657115769bf94f0c4f915f98300bc26b6 # 1.6.2
SNAPPY=23b3286820105438c5dbb9bc22f1bb85c5812c8a # 1.2.0

KONG_MANAGER=nightly
NGX_WASM_MODULE=96b4e27e10c63b07ed40ea88a91c22f23981db35
NGX_WASM_MODULE=9b1b2c760f73827fc08ade3a936a89fa5473f8fa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an FYI I juuuust merged #13765, so you can drop the deps changes and rebase to pick them up from master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author/community PRs from the open-source community (not Kong Inc) build/bazel cherry-pick kong-ee schedule this PR for cherry-picking to kong/kong-ee core/proxy core/wasm Everything relevant to [proxy-]wasm plugins/prometheus pr/wip A work in progress PR opened to receive feedback size/XL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants