Add a detailed explanation about the different field definition fields. (#3792)

* Add a detailed explanation about the different field definition fields.

Added a description of the different files in the `fields` directory of the integration:
- agent.yml
- base-fields.yml
- ecs.yml
- fields.yml

* Verified the use of the file:// reference instead of GIT@ in the _dev/build/build.yml file to point at the ECS reference file.

* Expanded explanation on mapping methods

Further documented the four ways ECS mappings can be defined in integrations.
- Using the `fields` directory in the integration.
- The import_mappings: true option in the `_dev/build/build.yml` file.
- Fleet pre-installed template `ecs@mappings` (>=8.13.0)
- Local ECS file

* Intermediary commit

* Integrated (no pun) feedback

Integrated the feedback and expanded the field section. Also replaced some nginx references with apache.

* Add link to fields section of general guidelines
Oddly authored May 8, 2024
1 parent eec0eb1 commit d8aad09
Showing 1 changed file with 123 additions and 15 deletions.
138 changes: 123 additions & 15 deletions docs/en/integrations/build-integration.asciidoc
@@ -242,7 +242,7 @@ Learn more in the {ref}/ingest.html[ingest pipeline reference].
****

Ingest pipelines are defined in the `elasticsearch/ingest_pipeline` directory.
They only apply to the parent data stream within which they live.
They only apply to the parent data stream within which they live. For our example, this would be the `apache.access` dataset.

For example, the https://github.com/elastic/integrations/tree/main/packages/apache[Apache integration]:

@@ -294,10 +294,78 @@ Each document is a collection of fields, each having its own data type. When map
To learn more, see {ref}/mapping.html[mapping].
****

Mappings are defined in the `fields` directory.
Like ingest pipelines, mappings only apply to the parent data stream.
The Apache integration has four different field definitions:
In the integration, the `fields` directory serves as the blueprint used to create component templates for the integration. The content from all files in this directory will be unified when the integration is built, so the mappings need to be unique per data stream dataset.

Like ingest pipelines, mappings only apply to the data stream dataset, for our example the `apache.access` dataset.
+
NOTE: The names of these files are conventions; any file name with a `.yml` extension will work.

Integrations have had significant enhancements in how ECS fields are defined. Below is a guide on which approach to use, based on the version of Elastic your integration will support.
+
. ECS mappings component template (>=8.13.0)
Integrations that *only* support version 8.13.0 and up can use the https://github.com/elastic/elasticsearch/blob/c2a3ec42632b0339387121efdef13f52c6c66848/x-pack/plugin/core/template-resources/src/main/resources/ecs%40mappings.json[ecs@mappings] component template installed by Fleet.
This makes explicitly declaring ECS fields unnecessary; the `ecs@mappings` component template in Elasticsearch will automatically detect and configure them.
However, should ECS fields be explicitly defined, they will overwrite the dynamic mapping provided by the `ecs@mappings` component template.
They can also be imported with an `external` declaration, as seen in the example below.
+
. Dynamic mappings imports (<8.13.0 & >=8.13.0)
Integrations supporting the Elastic stack below version 8.13.0 can still dynamically import ECS field mappings by defining `import_mappings: true` in the ECS section of the `_dev/build/build.yml` file in the root of the package directory.
This introduces a https://github.com/elastic/elastic-package/blob/f439b96a74c27c5adfc3e7810ad584204bfaf85d/internal/builder/_static/ecs_mappings.yaml[dynamic mapping] with most of the ECS definitions.
Using this method means that, just like the previous approach, ECS fields don't need to be defined in your integration; they are dynamically integrated into the package at build time.
Explicitly defined ECS fields can be used and will also overwrite this mechanism.

An example of the aforementioned `build.yml` file for this method:
+
[source,yaml]
----
dependencies:
  ecs:
    reference: git@v8.6.0
    import_mappings: true
----
+
. Explicit ECS mappings
As mentioned in the previous two approaches, ECS mappings can still be set explicitly and will overwrite the dynamic mappings.
This can be done in two ways:
- Using an `external: ecs` reference to import the definition of a specific field.
- Literally defining the ECS field.

The `external: ecs` definition instructs the `elastic-package` command line tool to refer to an external ECS reference to resolve specific fields. By default it looks at the https://raw.githubusercontent.com/elastic/ecs/v8.6.0/generated/ecs/ecs_nested.yml[ECS reference] file hosted on GitHub.
This external reference file is determined by a Git reference found in the `_dev/build/build.yml` file, in the root of the package directory.
The `build.yml` file set up for external references:
+
[source,yaml]
----
dependencies:
  ecs:
    reference: git@v8.6.0
----

Literal definition of an ECS field:
[source,yaml]
----
- name: cloud.account.id
  level: extended
  type: keyword
  ignore_above: 1024
  description: 'The cloud account or organ....'
  example: 43434343
----

. Local ECS reference file (air-gapped setup)
By changing the Git reference in `_dev/build/build.yml` to the path of the downloaded https://raw.githubusercontent.com/elastic/ecs/v8.6.0/generated/ecs/ecs_nested.yml[ECS reference] file, it is possible for the `elastic-package` command line tool to look for this file locally. Note that the path should be the full path to the reference file.
Doing this, our `build.yml` file looks like:
+
[source,yaml]
----
dependencies:
  ecs:
    reference: file:///home/user/integrations/packages/apache/ecs_nested.yml
----
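
The first approach only applies when the package restricts itself to 8.13.0 and later. As a minimal sketch, assuming the version constraint is declared under `conditions` in the package's `manifest.yml` (the exact layout can differ between package spec versions), this could look like:

[source,yaml]
----
# manifest.yml (package root). Illustrative fragment, not the full manifest.
conditions:
  kibana:
    # Assumed constraint: install only on 8.13.0 or later,
    # where Fleet provides the ecs@mappings component template.
    version: "^8.13.0"
----

With a lower minimum version, the `import_mappings: true` option or explicit definitions remain the safer choice.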

The `access` data stream dataset of the Apache integration has four different field definitions:
+
NOTE: The `apache` integration below has not yet been updated to use the dynamic ECS field definition and uses `external` references to define ECS fields in `ecs.yml`.
+
[source,text]
----
apache
@@ -306,23 +374,63 @@ apache
│ │ └───elasticsearch/ingest_pipeline
│ │ │ default.yml
│ │ └───fields
│ │ agent.yml <1>
│ │ base-fields.yml <2>
│ │ ecs.yml <3>
│ │ fields.yml <4>
│ │ agent.yml
│ │ base-fields.yml
│ │ ecs.yml
│ │ fields.yml
│ └───error
│ │ └───elasticsearch/ingest_pipeline
│ │ │ default.yml
│ │ └───fields
│ │ agent.yml
│ │ base-fields.yml
│ │ ecs.yml
│ │ fields.yml
│ └───status
----
<1> `agent.yml` fields the Elastic agent uses
<2> `base-fields.yml` never changes and is required for all integrations
<3> Defines the relevant ECS fields
<4> Custom Apache access log fields

=== agent.yml
The `agent.yml` file defines fields used by default processors.
Examples: `cloud.account.id`, `container.id`, `input.type`
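
As a rough illustration, the example fields named above could be declared as follows. This is a minimal sketch rather than a copy of the Apache package's `agent.yml`: the descriptions are placeholders, and the `ignore_above` values are assumptions modeled on the literal ECS definition shown earlier.

[source,yaml]
----
# Hypothetical excerpt of an agent.yml file.
- name: cloud.account.id
  type: keyword
  ignore_above: 1024        # assumed limit, as in the literal ECS example
  description: Cloud account or organization ID.   # placeholder text
- name: container.id
  type: keyword
  ignore_above: 1024        # assumed limit
  description: Unique container ID.                # placeholder text
- name: input.type
  type: keyword
  description: Type of the input that collected the event.  # placeholder text
----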

// Need more on mapping
=== base-fields.yml
In this file, the `data_stream` subfields `type`, `dataset` and `namespace` are defined as type `constant_keyword`; the values for these fields are added by the integration.
The `event.module` and `event.dataset` fields are defined with a fixed value specific to this integration:
- `event.module: apache`
- `event.dataset: apache.access`
Field `@timestamp` is defined here as type `date`.
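
Put together, the definitions described above could be sketched like this. The field names, types and fixed values come from the description; the `description` texts are placeholders, and using `constant_keyword` with a `value` key for the two `event.*` fields is an assumption about how the fixed values are expressed.

[source,yaml]
----
# Sketch of a base-fields.yml for the apache.access dataset.
- name: data_stream.type
  type: constant_keyword
  description: Data stream type.         # placeholder text
- name: data_stream.dataset
  type: constant_keyword
  description: Data stream dataset.      # placeholder text
- name: data_stream.namespace
  type: constant_keyword
  description: Data stream namespace.    # placeholder text
- name: event.module
  type: constant_keyword                 # assumed type for a fixed value
  description: Event module.             # placeholder text
  value: apache
- name: event.dataset
  type: constant_keyword                 # assumed type for a fixed value
  description: Event dataset.            # placeholder text
  value: apache.access
- name: '@timestamp'
  type: date
  description: Event timestamp.          # placeholder text
----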

=== ecs.yml
This file specifies every Elastic Common Schema (ECS) field used by the integration that is not defined in the `agent.yml` or `base-fields.yml` files. It uses `external: ecs` references.
For example:
+
[source,yaml]
----
- external: ecs
  name: client.ip
- external: ecs
  name: destination.domain
----

=== fields.yml
Here we define the fields that our integration needs and that are not found in ECS.
The example below defines field `apache.access.ssl.protocol` in the Apache integration.
+
[source,yaml]
----
- name: apache.access
  type: group
  fields:
    - name: ssl.protocol
      type: keyword
      description: |
        SSL protocol version.
----
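
The `group` entry above nests `ssl.protocol` under `apache.access`. For comparison, here is a sketch of the same field written with its full dotted name, in the style of the literal ECS definition shown earlier. It assumes that `elastic-package` treats a dotted name the same way as the nested form.

[source,yaml]
----
# Flat spelling of the same custom field (assumed to be equivalent).
- name: apache.access.ssl.protocol
  type: keyword
  description: SSL protocol version.
----

The grouped form tends to be easier to read once several fields share the `apache.access` prefix.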

// Maybe something on ECS too??

Learn more about fields in the https://www.elastic.co/guide/en/integrations-developer/current/general-guidelines.html#_document_all_fields[general guidelines].

[[create-dashboards]]
== Create and export dashboards

@@ -705,13 +813,13 @@ vars:
required: true <1>
show_user: true <2>
title: Access log paths <3>
description: Paths to the nginx access log file. <4>
description: Paths to the apache access log file. <4>
type: text <5>
multi: true <6>
hide_in_deployment_modes: <7>
- agentless
default:
- /var/log/nginx/access.log*
- /var/log/httpd/access.log*
----
<1> option is required
<2> don't hide the configuration option (collapsed menu)