ECS can no longer map all components out of the box #2326
Comments
ECS / SemConv will keep growing, but I don't think every user should have to include all 2000+ fields by default. ECS is split into Core and Extended fields, and by now I think we should split it up further. What we did with the dynamic mappings is partially convert ECS from specifying each field to being based on patterns. For example For the full template you showed above, how is this generated? I guess the short-term fix is to increase the 2000 limit. Should we move fully to dynamic templates? My opinion is yes. But as mentioned above, I would like to keep the |
@eyalkoren @felixbarny For awareness. |
Are there any shortcomings to dynamic templates that overcome their advantages? |
I think the fix for this specific case (when you want to eagerly map all ECS fields) is to increase the field limit. The default is 1000 so I assume that you've already increased the default from 1000 to 2000. I've added a comment to elastic/elasticsearch#85146 (comment) to discuss the tradeoffs of mapping all fields vs only mapping fields that differ from the default mapping. |
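For readers following along, a minimal sketch of what raising the limit could look like as a component template body. The setting name `index.mapping.total_fields.limit` is the real Elasticsearch setting; the value 3000 and the helper name `field_limit_template` are assumptions made up for illustration, not something proposed in this thread.

```python
import json

# Real Elasticsearch index setting that caps the number of mapped fields.
FIELD_LIMIT_SETTING = "index.mapping.total_fields.limit"


def field_limit_template(limit: int) -> dict:
    """Build a component template body that only raises the field limit.

    Hypothetical helper: the name and shape are for illustration only.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return {"template": {"settings": {FIELD_LIMIT_SETTING: limit}}}


# 3000 is an arbitrary example value, not a recommendation.
body = field_limit_template(3000)
print(json.dumps(body, indent=2))
```

A body like this would be sent with a `PUT _component_template/<name>` request and composed into an index template next to the ECS mappings; it only moves the ceiling and does not reduce the number of eagerly mapped fields.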
I can raise the default field limit, but of course I'd rather not map all these fields to begin with.
I don't think we should; even the core/extended distinction is already rather ambiguous to users. Ideally we can tell our users that a data stream is ECS/SemConv 'ready' for any field defined in the specification. Whether 'extended' fields should be part of that specification to begin with is something I'll leave open for discussion at a later point.
This does in effect create a new segmentation, a datastream is not fully ECS/SemConv ready. Only an undocumented subset is. This is cognitive overhead we push to our users.
If we can get to a place where the specification encodes these conventions and we can generate the current I just hope we get to a place where the templates we ship are exhaustive and spec driven Exhaustive can take many forms:
Whether exhaustive means enforcing field types or not can be a follow up discussion too:
I personally feel strongly that we should do type checks/coercion and enforce structure, since this is explicitly part of the ECS spec. How we move forward with this is still open too: https://github.com/elastic/opentelemetry-dev/issues/23. Ideally we are lenient in what we accept, as per the logs-db efforts, but strict in how we return data. Again, Elasticsearch could take advantage of the ECS spec to know how to return ECS data structurally even with synthetic source enabled. |
Just to be explicit the |
@eyalkoren Can you comment on this? I thought we actually have tests in place in Elasticsearch to ensure this is not the case. |
We do have tests that ensure that every field that is defined in ECS is mapped correctly, assuming the field is sent in the correct JSON type.
Because we have tests that all ECS fields are mapped correctly, I think we can do that.
The
Fair ask but note that this also comes with tradeoffs, as noted in elastic/elasticsearch#85146 (comment). As |
For reference, said tests are in |
@felixbarny I think this is where we may have a real issue here to fix. Correct me if I am wrong, but our default |
Would this work? Will it have an impact on performance? We should definitely test if this works. |
The That's one of the reasons why I think that by default, it may be reasonable to expect shippers to send data in the correct data type. |
I want to stress that not enforcing e.g. folks can still send a In other words, we break the robustness principle (https://en.wikipedia.org/wiki/Robustness_principle) by being lenient both in what we accept and in what we emit. https://github.com/elastic/opentelemetry-dev/issues/23 This does not mean giving up on e.g. _synthetic source, but hopefully _synthetic source enabled indices can take advantage of knowing that they hold ECS data and return that data in accordance with the schema.
Rejecting documents is a nonstarter, agreed. Ignoring fields is one option, or rerouting them to There's a hidden superpower lost if we can no longer generate structs in several different languages so that we can read each other's events (security/observability/stack/etc.) with a certain degree of confidence because we adhere to a schema. |
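To make the "lenient in what we accept, strict in what we emit" idea concrete, here is a rough sketch of coercing known fields and setting aside the ones that cannot be coerced, instead of rejecting the whole document. Everything here is hypothetical: `SCHEMA` is a tiny stand-in and not the real ECS definition format, and `coerce_document` is not an Elasticsearch API.

```python
from typing import Any, Callable

# Hypothetical stand-in for a schema: field path -> function that coerces a value.
SCHEMA: dict[str, Callable[[Any], Any]] = {
    "source.port": int,
    "event.duration": int,
    "message": str,
}


def coerce_document(doc: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Coerce known fields to their schema type; collect the ones that fail.

    The document is never rejected. Fields that cannot be coerced are left
    out of the result and reported by name so they can be handled separately.
    """
    accepted: dict[str, Any] = {}
    ignored: list[str] = []
    for field, value in doc.items():
        caster = SCHEMA.get(field)
        if caster is None:
            # Fields outside the schema pass through untouched.
            accepted[field] = value
            continue
        try:
            accepted[field] = caster(value)
        except (TypeError, ValueError):
            ignored.append(field)
    return accepted, ignored


doc = {
    "source.port": "443",        # string that can be coerced to an integer
    "event.duration": "fast",    # cannot be coerced, so it is set aside
    "message": "ok",
    "labels.env": "prod",        # not in the schema
}
accepted, ignored = coerce_document(doc)
print(accepted)
print(ignored)
```

What happens to the ignored fields (dropped, flagged, or rerouted) is exactly the open question in this thread; the sketch only shows that the choice does not require rejecting the document.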
As of 8.11 applying all the ECS component templates is no longer possible.
Running this against Elasticsearch 8.11 will yield:
We should be able to grow with our users and allow them to use any ECS field ahead of mapping changes.
In newer Elasticsearch versions we ship with an ecs@mappings OOTB; however, this is not exhaustive and can lead to confusion for our users, see e.g. elastic/elasticsearch#85146 (comment).
How do we ensure the full ECS specification (and/or its merger with OpenTelemetry) is fully natively supported by Elasticsearch?
Should the components move to dynamic_templates for all fields?
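As a rough illustration of the pattern-based approach in the question above, the sketch below builds a small `dynamic_templates` mapping and resolves a field path against it. The keys `dynamic_templates`, `path_match`, `match_mapping_type`, and `mapping` are real Elasticsearch mapping syntax; the two rules and their names are made up for illustration and are not the contents of ecs@mappings. `resolve_type` is a hypothetical helper that only approximates Elasticsearch's matching with Python's `fnmatch`.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Illustrative rules only; not the templates shipped with Elasticsearch.
MAPPINGS = {
    "dynamic_templates": [
        {
            "example_ip": {
                "path_match": "*.ip",
                "match_mapping_type": "string",
                "mapping": {"type": "ip"},
            }
        },
        {
            "example_keyword": {
                "match_mapping_type": "string",
                "mapping": {"type": "keyword"},
            }
        },
    ]
}


def resolve_type(path: str, json_type: str) -> Optional[str]:
    """Return the mapped type of the first rule matching the field, if any.

    Rules are checked in order and the first match wins, which mirrors how
    dynamic templates are evaluated.
    """
    for entry in MAPPINGS["dynamic_templates"]:
        ((_, rule),) = entry.items()
        if rule.get("match_mapping_type", json_type) != json_type:
            continue
        if not fnmatchcase(path, rule.get("path_match", "*")):
            continue
        return rule["mapping"]["type"]
    return None  # no rule matched; default dynamic mapping would apply


print(resolve_type("source.ip", "string"))
print(resolve_type("host.name", "string"))
print(resolve_type("source.port", "long"))
```

Two rules cover every string field here, which is the appeal compared with listing each field explicitly. The tradeoff raised in the comments still applies: a rule keyed on `match_mapping_type` only fires when the shipper sends the value in the expected JSON type.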