-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add index_types for OTEL logs and metrics #3148 #3929
Add index_types for OTEL logs and metrics #3148 #3929
Conversation
8b3fedb
to
f62f4ab
Compare
@juergen-walter , This is great, thanks for starting this draft! Have you compared this schema with the OpenSearch simple schema for observability? I also don't see some of the fields from the logs or metrics in here. Are you using dynamic mapping for those? We recently added support for composable index templates. There is a directly |
5eed787
to
628f865
Compare
@dlvenable happy to hear your encouraging feedback. I shared early WIP to let you know I started working on it, still a lot to be improved.
Initially I just copied the index templates for trace spans, adjustments for metrics and traces still to be done. For the open source contribution in this PR I would align with the Simple Schema for Observability, thank you for the reminder.
If you plan prepare or merge changes before this PR has been merged, it would be nice to ping me so I can update the PR |
@dlvenable I am not so sure about the Simple Scheme for Observability. We are providing an OpenTelemetry endpoint, that should support generic data. I would expect support for the OpenTelemetry semantic conventions as a whole: https://github.com/open-telemetry/semantic-conventions. I have run a small PoC, that transfers the YAML configuration from https://github.com/open-telemetry/semantic-conventions/tree/main/model into composable index templates. Each convention maps to one template. Certain assumptions must be made on those mappings. For the three signals traces, metrics and logs a template can be generated covering the base fields. All these templates can than be joined to form a full index pattern to be used. That allows to pick and choose, which conventions should be used. However, the full solution is not yet ready. |
91c2e8b
to
efda3e7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Index templates should be improved. Most of the string value fields should be keywords to allow aggregations. And most of them will not have values requiring type "text". Probably all float values should be doubles as most integers should be longs.
...repper-plugins/opensearch/src/main/resources/index-template/logs-otel-v1-index-template.json
Show resolved
Hide resolved
...repper-plugins/opensearch/src/main/resources/index-template/logs-otel-v1-index-template.json
Show resolved
Hide resolved
...repper-plugins/opensearch/src/main/resources/index-template/metrics-otel-index-template.json
Outdated
Show resolved
Hide resolved
...repper-plugins/opensearch/src/main/resources/index-template/metrics-otel-index-template.json
Outdated
Show resolved
Hide resolved
...repper-plugins/opensearch/src/main/resources/index-template/metrics-otel-index-template.json
Outdated
Show resolved
Hide resolved
...repper-plugins/opensearch/src/main/resources/index-template/metrics-otel-index-template.json
Outdated
Show resolved
Hide resolved
data-prepper-plugins/opensearch/src/main/resources/logs-policy-no-ism-template.json
Show resolved
Hide resolved
data-prepper-plugins/opensearch/src/main/resources/logs-policy-with-ism-template.json
Show resolved
Hide resolved
data-prepper-plugins/opensearch/src/main/resources/metrics-otel-v1-index-template.json
Show resolved
Hide resolved
data-prepper-plugins/opensearch/src/main/resources/metrics-policy-no-ism-template.json
Show resolved
Hide resolved
data-prepper-plugins/opensearch/src/main/resources/logs-otel-v1-index-template.json
Show resolved
Hide resolved
...repper-plugins/opensearch/src/main/resources/index-template/logs-otel-v1-index-template.json
Show resolved
Hide resolved
I closed all the related conversations and pointed to the Simple Schema for Observability mappings I aligned with. I consider the comments by @KarstenSchnitter to be a valuable feedback/review of the simple schema mappings but I would try to avoid having the respective discussion in this PR. |
Hi @dlvenable we would appreciate if we can include this into the next release, so we do not have to implement workarounds. |
Signed-off-by: Jürgen Walter <[email protected]>
Signed-off-by: Jürgen Walter <[email protected]>
Fixes testInstantiateSinkMetricsDefaultMetricSink Alertnative would have been to adjust the test Signed-off-by: Jürgen Walter <[email protected]>
Signed-off-by: Jürgen Walter <[email protected]>
Signed-off-by: Jürgen Walter <[email protected]>
Signed-off-by: Jürgen Walter <[email protected]>
Signed-off-by: Jürgen Walter <[email protected]>
b894f62
to
a1ae0d1
Compare
Signed-off-by: Jürgen Walter <[email protected]>
Signed-off-by: Jürgen Walter <[email protected]>
Signed-off-by: Jürgen Walter <[email protected]>
This reverts commit 3fd61af. Signed-off-by: Jürgen Walter <[email protected]>
4c5806d
to
bebe21a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @juergen-walter ! The build is now passing. I think one Java 11 build has failed, but this is likely a flaky test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I managed to review the index templates. There are only minor issues. Please have a look at the text
types within the instrumentation scope fields. My main question is, why the metrics and logs index templates are essentially doubled. Is there a way to have only one template for each signal?
"path": "observedTimestamp" | ||
}, | ||
"traceId": { | ||
"ignore_above": 256, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are actually hex encoded 16 bytes values, hence 32 characters. You can shorten the length to this value.
"ignore_above": 256, | |
"ignore_above": 32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"type": "keyword" | ||
}, | ||
"spanId": { | ||
"ignore_above": 256, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are actually hex encoded 8 bytes values, hence 16 characters. You can shorten the length to this value.
"ignore_above": 256, | |
"ignore_above": 16, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
}, | ||
"kind": { | ||
"type": "keyword", | ||
"ignore_above": 128 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are currently only 5 kinds: SUM
, GAUGE
, HISTOGRAM
, EXPONENTIAL_HISTOGRAM
and SUMMARY
. The longest has 21 characters. You could decrease this value severly.
"ignore_above": 128 | |
"ignore_above": 24 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
}, | ||
"aggregationTemporality": { | ||
"type": "keyword", | ||
"ignore_above": 128 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are currently only 2 possible values for this or an empty value: AGGREGATION_TEMPORALITY_DELTA
and AGGREGATION_TEMPORALITY_CUMULATIVE
"ignore_above": 128 | |
"ignore_above": 40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"type": "boolean" | ||
}, | ||
"startTime": { | ||
"type": "date" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If observedTimestamp
is data_nanos
, why is this not the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"count": { | ||
"type": "long" | ||
}, | ||
"exemplar": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this not a nested field? I would also have expected some kind of value for the exemplar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"exemplar": { | ||
"properties": { | ||
"time": { | ||
"type": "date" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this not a date_nanos
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"instrumentationScope": { | ||
"properties": { | ||
"name": { | ||
"type": "text", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, there is no benefit in having a text
for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
}, | ||
"version": { | ||
"type": "text", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, there is no benefit in having a text
for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related discussion about aligning OpenTelemetry metrics index templates with Data Pepper opensearch-project/opensearch-catalog#197 related to the open points mentioned by @KarstenSchnitter |
"instrumentationScope": { | ||
"properties": { | ||
"name": { | ||
"type": "text", | ||
"fields": { | ||
"keyword": { | ||
"type": "keyword", | ||
"ignore_above": 128 | ||
} | ||
} | ||
}, | ||
"version": { | ||
"type": "text", | ||
"fields": { | ||
"keyword": { | ||
"type": "keyword", | ||
"ignore_above": 256 | ||
} | ||
} | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"instrumentationScope": { | |
"properties": { | |
"name": { | |
"type": "text", | |
"fields": { | |
"keyword": { | |
"type": "keyword", | |
"ignore_above": 128 | |
} | |
} | |
}, | |
"version": { | |
"type": "text", | |
"fields": { | |
"keyword": { | |
"type": "keyword", | |
"ignore_above": 256 | |
} | |
} | |
}, | |
"instrumentationScope": { | |
"properties": { | |
"name": { | |
"type": "keyword", | |
"ignore_above": 256 | |
}, | |
"version": { | |
"type": "keyword", | |
"ignore_above": 256 | |
}, |
Signed-off-by: Jürgen Walter <[email protected]>
Signed-off-by: Jürgen Walter <[email protected]>
Add fields data prepper writes
Description
Add index_types for OTEL logs and metrics
Issues Resolved
Resolves #3148
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.