
Use a different log file for sensitive data #4129

Closed
belimawr wants to merge 8 commits into main from sensitive-logger-for-process

Conversation

belimawr
Contributor

@belimawr belimawr commented Jan 24, 2024

What does this PR do?

This PR introduces the 'sensitive logger': when gathering logs from any component, if a log entry contains the key/value pair log.type: sensitive, it is written to a separate file.
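For illustration only, here is a minimal sketch (not the code in this PR) of how this kind of routing can be done with zap, the library the Agent's logp logger is built on: a wrapper core inspects each entry's fields and, when it finds log.type: sensitive, writes the entry to a second file instead of the regular one. The splitCore type, file names, and levels below are hypothetical.

package main

import (
	"os"

	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
)

// splitCore sends entries tagged log.type=sensitive to one core and
// everything else to another. Hypothetical sketch, not the PR's code.
type splitCore struct {
	regular   zapcore.Core
	sensitive zapcore.Core
	fields    []zapcore.Field // fields accumulated through With()
}

func (c splitCore) Enabled(l zapcore.Level) bool {
	return c.regular.Enabled(l) || c.sensitive.Enabled(l)
}

func (c splitCore) With(fs []zapcore.Field) zapcore.Core {
	return splitCore{c.regular.With(fs), c.sensitive.With(fs), append(c.fields, fs...)}
}

func (c splitCore) Check(e zapcore.Entry, ce *zapcore.CheckedEntry) *zapcore.CheckedEntry {
	if c.Enabled(e.Level) {
		return ce.AddCore(e, c)
	}
	return ce
}

func (c splitCore) Write(e zapcore.Entry, fs []zapcore.Field) error {
	// Route to the sensitive core if any field (accumulated or per-entry)
	// carries log.type=sensitive; otherwise write to the regular core.
	for _, f := range append(c.fields, fs...) {
		if f.Key == "log.type" && f.String == "sensitive" {
			return c.sensitive.Write(e, fs)
		}
	}
	return c.regular.Write(e, fs)
}

func (c splitCore) Sync() error {
	err := c.regular.Sync()
	if err2 := c.sensitive.Sync(); err == nil {
		err = err2
	}
	return err
}

// newFileCore builds a JSON-encoding core that appends to path.
func newFileCore(path string) zapcore.Core {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o600)
	if err != nil {
		panic(err)
	}
	return zapcore.NewCore(
		zapcore.NewJSONEncoder(zap.NewProductionEncoderConfig()),
		zapcore.AddSync(f),
		zapcore.DebugLevel,
	)
}

func main() {
	logger := zap.New(splitCore{
		regular:   newFileCore("elastic-agent.ndjson"),           // hypothetical file name
		sensitive: newFileCore("elastic-agent-sensitive.ndjson"), // hypothetical file name
	})
	defer logger.Sync()

	logger.Info("normal entry")                                        // regular file
	logger.Warn("raw event dump", zap.String("log.type", "sensitive")) // sensitive file
}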

Why is it important?

It prevents raw events and sensitive data logged by components from being mixed with the normal logs and shipped to monitoring clusters.

Open questions

Does the sensitive logger need to be configurable?

I made the sensitive logger non-configurable, following the example of the logs written to data/elastic-agent-<hash>/logs/, because the sensitive logger also writes there.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Author's Checklist

  • Validate the sensitive log files are collected together with diagnostics

How to test this PR locally

  1. Package the Elastic-Agent
  2. Replace the Filebeat binary with the binary built from "Log raw events and errors containing events to a separate file" (beats#37475)
  3. Create /tmp/flog.log with a few lines; the data is not important
  4. Start the Elastic-Agent with the following configuration (adjust if needed)
outputs:
  default:
    type: elasticsearch
    hosts:
      - http://localhost:9200
    username: elastic
    password: changeme
    preset: balanced

inputs:
  - type: filestream
    id: your-input-id
    streams:
      - id: your-filestream-stream-id
        data_stream:
          dataset: generic
        paths:
          - /tmp/flog.log

agent.monitoring:
  enabled: false
  logs: false
  metrics: false
  pprof.enabled: false
  use_output: default
  http:
    enabled: false

agent.logging.to_stderr: true
agent.logging.metrics.enabled: false

The easiest way to create ingest failures is to close the write index of the data stream. To do that, go to Kibana -> Dev Tools.

To get the backing index for a datastream:

GET /_data_stream/logs-generic-default

This will return something like:

{
  "data_streams": [
    {
      "name": "logs-generic-default",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-logs-generic-default-2024.01.22-000001",
          "index_uuid": "0pq-XIYfSjuUQhTxlJKJjQ",
          "prefer_ilm": true,
          "ilm_policy": "logs",
          "managed_by": "Index Lifecycle Management"
        }
      ]
    }
  ]
}

Take note of the index_name .ds-logs-generic-default-2024.01.22-000001.
Close this index:

POST .ds-logs-generic-default-2024.01.22-000001/_close
  5. Add more data to the file /tmp/flog.log
  6. In the folder where you're running the Elastic-Agent, look for a log file in data/elastic-agent-<hash>/logs/sensitive; the file name is something like elastic-agent-sensitive-20240125.ndjson. You should see a log entry like this one:
{
  "log.level": "warn",
  "@timestamp": "2024-01-25T14:48:51.115+0100",
  "message": "Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Date(2024, time.January, 25, 14, 48, 46, 614819591, time.Local), Meta:{\"input_id\":\"your-input-id\",\"raw_index\":\"logs-generic-default\",\"stream_id\":\"your-filestream-stream-id\"}, Fields:{\"agent\":{\"ephemeral_id\":\"a06806a9-f18d-4ffa-bee1-debcc15f7cf5\",\"id\":\"0ff4eb46-71e1-4c49-a921-3b984b303c0f\",\"name\":\"millennium-falcon\",\"type\":\"filebeat\",\"version\":\"8.13.0\"},\"data_stream\":{\"dataset\":\"generic\",\"namespace\":\"default\",\"type\":\"logs\"},\"ecs\":{\"version\":\"8.0.0\"},\"elastic_agent\":{\"id\":\"0ff4eb46-71e1-4c49-a921-3b984b303c0f\",\"snapshot\":false,\"version\":\"8.13.0\"},\"event\":{\"dataset\":\"generic\"},\"host\":{\"architecture\":\"x86_64\",\"containerized\":false,\"hostname\":\"millennium-falcon\",\"id\":\"851f339d77174301b29e417ecb2ec6a8\",\"ip\":[\"42.42.42.42\",,\"ec8a:fc90:d347:6316:116e:8a27:f731:08ff\"],\"mac\":[\"95-A2-37-0D-71-73\",],\"name\":\"millennium-falcon\",\"os\":{\"build\":\"rolling\",\"family\":\"arch\",\"kernel\":\"6.7.0-arch3-1\",\"name\":\"Arch Linux\",\"platform\":\"arch\",\"type\":\"linux\",\"version\":\"\"}},\"input\":{\"type\":\"filestream\"},\"log\":{\"file\":{\"device_id\":\"34\",\"inode\":\"172876\",\"path\":\"/tmp/flog.log\"},\"offset\":1061765},\"message\":\"154.68.172.7 - ritchie3302 [25/Jan/2024:14:10:52 +0100] \\\"HEAD /supply-chains/metrics/platforms HTTP/1.1\\\" 502 13383\"}, Private:(*input_logfile.updateOp)(0xc000fc6d20), TimeSeries:false}, Flags:0x1, Cache:publisher.EventCache{m:mapstr.M(nil)}} (status=400): {\"type\":\"index_closed_exception\",\"reason\":\"closed\",\"index_uuid\":\"0pq-XIYfSjuUQhTxlJKJjQ\",\"index\":\".ds-logs-generic-default-2024.01.22-000001\"}, dropping event!",
  "component": {
    "binary": "filebeat",
    "dataset": "elastic_agent.filebeat",
    "id": "filestream-default",
    "type": "filestream"
  },
  "log": {
    "source": "filestream-default"
  },
  "log.origin": {
    "file.line": 461,
    "file.name": "elasticsearch/client.go",
    "function": "github.com/elastic/beats/v7/libbeat/outputs/elasticsearch.(*Client).bulkCollectPublishFails"
  },
  "log.type": "sensitive",
  "ecs.version": "1.6.0",
  "log.logger": "elasticsearch"
}

Note the "log.type": "sensitive" field, and that this log entry is not present in the other log files or in the logs that go to stdout/stderr.


@belimawr belimawr added the Team:Elastic-Agent label Jan 24, 2024
@belimawr belimawr self-assigned this Jan 24, 2024
Contributor

mergify bot commented Jan 24, 2024

This pull request does not have a backport label. Could you fix it @belimawr? 🙏
To fix up this pull request, you need to add the backport labels for the needed branches, such as:

  • backport-v\d.\d.\d is the label to automatically backport to the 8.\d branch; \d is a digit

NOTE: backport-skip has been added to this pull request.

Contributor

mergify bot commented Jan 25, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b sensitive-logger-for-process upstream/sensitive-logger-for-process
git merge upstream/main
git push upstream sensitive-logger-for-process

@belimawr belimawr force-pushed the sensitive-logger-for-process branch 2 times, most recently from 00d1eb4 to 4be50d6 on January 25, 2024 at 16:43
@belimawr belimawr changed the title from [WIP] Add sensitive logger to Use a different log file for sensitive data Jan 25, 2024
@belimawr belimawr added the enhancement label Jan 25, 2024
Contributor

mergify bot commented Feb 12, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b sensitive-logger-for-process upstream/sensitive-logger-for-process
git merge upstream/main
git push upstream sensitive-logger-for-process

@belimawr
Contributor Author

Closing this as the behaviour has changed and it was easier to start from a new branch. The new PR: #4549

@belimawr belimawr closed this Apr 12, 2024
Labels: backport-skip, enhancement, Team:Elastic-Agent