[Elastic Agent] Add support for parsers configuration for Hints Autodiscover #3102

gizas · 2023-07-19T15:04:11Z

Describe the enhancement:

We need hint support for json and multiline parsers from annotations that defined from the users

Describe a specific use case for the enhancement or feature:

Agent configuration:

      providers:
        - type: kubernetes
          node: ${NODE_NAME}
          hints.enabled: true
          hints.default_config:
            type: filestream
            prospector.scanner.symlinks: true
            id: filestream-kubernetes-pod-${data.kubernetes.container.id}
            take_over: true
            paths:
            - /var/log/containers/*-${data.kubernetes.container.id}.log
            parsers:
            - container: ~

See above that type: filestream

The user will define in the pod:

annotations:
        co.elastic.hints/json.add_error_key: "true"
        co.elastic.hints/json.expand_keys: "true"
        co.elastic.hints/json.ignore_decoding_error: "true"
        co.elastic.hints/json.keys_under_root: "true"
        co.elastic.hints/json.message_key: "message"

And those will produce the following block:

parsers:
   - ndjson: 
         ignore_decoding_error: "true"
         expand_keys: "true"
         keys_under_root: "true"
         message_key: "message"

What is the definition of done?

To support the list of parsers https://www.elastic.co/guide/en/beats/filebeat/master/filebeat-input-filestream.html#_parsers
To provide examples of working parsers through annotations configuration and
Update the relevant documentation

gizas · 2023-08-03T08:28:09Z

@cmacknz , @rdner (because you worked in filestream input in the past) I cc you here in order to help us investigate how we can update the parsers configuration from a given user's input.

In beats we had the co.elastic.logs/json see https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover-hints.html#_co_elastic_logsjson and we could configure the json log input.

But now in agent and filestream input we would like to offer the same flexibility to our users and through co.elastic.hints/json to configure the respective ndjson parser of filstream block.

For processors (see eg. https://github.com/elastic/elastic-agent/pull/3107/files#diff-ed933ad9cf09eb9ca797c5c89b5e8991624cbe5301d88f996d9bd947f4e3ea16R455) we have the relative map exposed and we can configure/ update the new list. I am struggling to find a way to see how I can edit parsers config. Any help here really appreciated

cmacknz · 2023-08-03T13:58:09Z

This is more of an agent composable provider question than a filestream question.

The parsers section goes at the root of the input configuration just like processors, so you can probably just clone or reuse the code that adds processors to add parsers. It should work exactly the same way, the name and options are just different. You will have to modify the code you linked.

gizas · 2023-08-04T09:06:22Z

@cmacknz thanks for looking into this. I was having a look into this today and I see that:
Initially (going from bottom to up), you need to modify the AddOrUpdate function and this leads me then to transpiler . So I guess you mean to go up all this way :)

Wondering if there is another way to do so? But still can not find something

gizas · 2023-08-04T14:33:26Z

@cmacknz see my f7503fb initial effort. At least elastic image builts, but parsers are not passed at the moment to composable provider. If you can spot anything missing let me know.

gizas · 2023-08-22T14:49:51Z

@cmacknz one step at a time, I fixed some errors and now at least the functionality now is not broken, autodiscovery works, but the parsers not added :)

The autodisovery reads the parsers that I provide with following annotations:

annotations:
        co.elastic.hints/parsers.ndjson.add_error_key: 'false'
        co.elastic.hints/parsers.ndjson.message_key: "pas"

Then comm.AddOrUpdate is called with correct data:
f7503fb#diff-ed933ad9cf09eb9ca797c5c89b5e8991624cbe5301d88f996d9bd947f4e3ea16L504

This leads to /internal/pkg/compsable/controller.go is called and the dynamicProvider object is passed:
https://github.com/elastic/elastic-agent/blob/processor_parsers/internal/pkg/composable/controller.go#L397
This is the pod that is emmited finally: https://github.com/elastic/elastic-agent/blob/processor_parsers/internal/pkg/composable/providers/kubernetes/pod.go#L247
Then this leads to https://github.com/elastic/elastic-agent-autodiscover/blob/main/kubernetes/watcher.go but from this point on I am loosing track.

Summary is that the processors are being added from autodiscovery but not parsers.
Full output for reference

- data_stream.namespace: default
  id: hints-filestream-container-logs-62d546c0055ce651842f8ec7b3911439555a66aaf6b238f87e765917a90cef66-kubernetes-e36973fb-debd-4252-a402-f7b185ce286e.nginx
  name: hints-filestream-container-logs
  original_id: hints-filestream-container-logs-62d546c0055ce651842f8ec7b3911439555a66aaf6b238f87e765917a90cef66
  processors:
  - add_fields:
      fields:
        id: 62d546c0055ce651842f8ec7b3911439555a66aaf6b238f87e765917a90cef66
        image:
          name: nginx
        runtime: containerd
      target: container
  - add_fields:
      fields:
        container:
          name: nginx
        ....
  - add_fields:
      fields:
        cluster:
          name: kind
          url: kind-control-plane:6443
      target: orchestrator
  - add_fields:
      fields:
        name: myproject
      target: project
  - rename:
      fail_on_error: "false"
      fields:
        "0":
          from: message
        "1":
          to: pasole
  streams:
  - data_stream:
      dataset: kubernetes.container_logs
      type: logs
    parsers:
    - container:
        format: auto
        stream: all
    paths:
    - /var/log/containers/*62d546c0055ce651842f8ec7b3911439555a66aaf6b238f87e765917a90cef66.log
    prospector:
      scanner:
        symlinks: true
  type: filestream
  use_output: default
outputs:
  default:
    allow_older_versions: true
    hosts:
    - https://elasticsearch:9200
    password: changeme
    ssl:
      verification_mode: none
    type: elasticsearch
    username: elastic
providers:
  kubernetes:
    hints:
      default_container_logs: false
      enabled: true
    node: kind-control-plane
    scope: node

cc @ChrsMark for reference

cmacknz · 2023-08-22T20:46:23Z

I haven't spent much time in this part of the agent code unfortunately, so I don't see anything obvious.

ChrsMark · 2023-08-23T08:50:45Z

Hey folks, coming late to this discussion. A couple of thoughts:

Do we have any examples where parsers are used in other than the filstream input? I'm trying to understand [Elastic Agent] Add support for parsers configuration for Hints Autodiscover #3102 (comment) mostly, and to my knowledge parsers are natively supported in filestream input. Do I miss sth here that parsers are supported in other inputs as well?

ref: filestream inputs
Based on this I see it as a different case compared to the processors, and hence it doesn't make sense to pass parsers for all kind of inputs (metrics etc) maybe?

In general, I'm not sure if we really want to change the DynamicProviderComm interface at f7503fb#diff-e84202e025c0b38310aa13257a34c1c4304473500c894e45eabbfe8567254fd9R25 and additionally pass the parsers information there.

For processors (in hints) it was different since the processors were already part of the DynamicProviderComm interface and hence we could take advantage of it. That helped to tackle #735. However, I would avoid extending DynamicProviderComm with additional variables because this would take us down an unpredictable path.

In this, I would suggest taking a step back and discuss what we want to achieve here. If I'm not mistaken we just want to support parsers' configurations through hints. I also see how the parsers' case is similar to the processor's one since it hits the same issue: #735

So how about just supporting parsers only in hints templates that are based on the filestream input? This means that parsers would be defined as placeholders within the templates like all other hints and would be populated in the hintsMapping.
I know it's a compromise but I'm not sure how we could achieve this when we hit #735.

Happy to brainstorm more on this if needed.

gizas · 2023-08-24T09:23:26Z

Hey, I wanted to give it a try to see if it can work tbh. Until now I have not seen any other example of parsers (@cmacknz can correct me on this)

I have tested the decode_json_fields processor with the filestream and hints autodicovery. I have added a doc section here elastic/ingest-docs@2c72477

So I guess for now we can park the parsers support with hints until #735 is prioritised. @cmacknz do you have any plans for this?

I think @ChrsMark the #3161 is getting bigger priority now in order to allign autodiscovery with beats.

Happy to brainstorm more on this if needed.

I will try to have a look next weeks as well but any ideas are more than welcome

cmacknz · 2023-08-24T14:26:13Z

#735 isn't on our roadmap at all right now.

Filestream is also the only input that uses parsers today AFAIK.

MichaelKatsoulis · 2024-01-30T09:30:46Z

This issue will be closed for now as json decoding can be performed with decode_json_fields processor.
So support for parsers is not mandatory. It can be reviewed when/if #735 gets prioritised.

gizas added Team: Cloud Native Integrations Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team and removed Team: Cloud Native Integrations labels Jul 19, 2023

gizas mentioned this issue Jul 19, 2023

[Meta] Improving Hints Autodiscovery with Elastic Agent #3063

Open

10 tasks

bturquet assigned MichaelKatsoulis Oct 27, 2023

MichaelKatsoulis closed this as completed Jan 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Elastic Agent] Add support for parsers configuration for Hints Autodiscover #3102

[Elastic Agent] Add support for parsers configuration for Hints Autodiscover #3102

gizas commented Jul 19, 2023

gizas commented Aug 3, 2023 •

edited

Loading

cmacknz commented Aug 3, 2023

gizas commented Aug 4, 2023 •

edited

Loading

gizas commented Aug 4, 2023 •

edited

Loading

gizas commented Aug 22, 2023 •

edited

Loading

cmacknz commented Aug 22, 2023

ChrsMark commented Aug 23, 2023

gizas commented Aug 24, 2023

cmacknz commented Aug 24, 2023

MichaelKatsoulis commented Jan 30, 2024

[Elastic Agent] Add support for parsers configuration for Hints Autodiscover #3102

[Elastic Agent] Add support for parsers configuration for Hints Autodiscover #3102

Comments

gizas commented Jul 19, 2023

gizas commented Aug 3, 2023 • edited Loading

cmacknz commented Aug 3, 2023

gizas commented Aug 4, 2023 • edited Loading

gizas commented Aug 4, 2023 • edited Loading

gizas commented Aug 22, 2023 • edited Loading

cmacknz commented Aug 22, 2023

ChrsMark commented Aug 23, 2023

gizas commented Aug 24, 2023

cmacknz commented Aug 24, 2023

MichaelKatsoulis commented Jan 30, 2024

gizas commented Aug 3, 2023 •

edited

Loading

gizas commented Aug 4, 2023 •

edited

Loading

gizas commented Aug 4, 2023 •

edited

Loading

gizas commented Aug 22, 2023 •

edited

Loading