No System logs are generated for Linux agent, when agent is installed with --unprivileged flag. #4112

Closed
amolnater-qasource opened this issue Jan 22, 2024 · 13 comments
Labels
  • bug: Something isn't working
  • impact:high: Short-term priority; add to current release, or definitely next.
  • QA:Ready For Testing: Code is merged and ready for QA to validate
  • Team:Elastic-Agent-Control-Plane: Label for the Agent Control Plane team

Comments

@amolnater-qasource

Kibana Build details:

VERSION: 8.13.0-SNAPSHOT
BUILD: 70749
COMMIT: a0f4897f7c04069faf2a86dbda1dabea78c161c1
Artifact Link: https://snapshots.elastic.co/8.13.0-l534sdis/downloads/beats/elastic-agent/elastic-agent-8.13.0-SNAPSHOT-linux-x86_64.tar.gz

Host OS: Linux- SLES15, Ubuntu 22

Preconditions:

  1. 8.13.0 Snapshot linux agent should be installed.
  2. Linux Agents should be installed using below command:
    sudo ./elastic-agent install --url=<url> --enrollment-token=<token> --unprivileged

Steps to reproduce:

  1. Navigate to Data Streams tab.
  2. Select Type: logs, System Integration, and required namespace filters.
  3. Observe no System integration logs for linux agent installed with --unprivileged flag.

What's working fine:

  • Issue is not reproducible on Linux- SLES15, Ubuntu 22 when agents are installed without --unprivileged flag.

Screen Recording:

Data.streams.-.Fleet.-.Elastic.-.Google.Chrome.2024-01-22.13-59-51.mp4

Logs:
elastic-agent-diagnostics-2024-01-22T08-42-48Z-00.zip
elastic-agent-diagnostics-2024-01-22T08-42-55Z-00.zip

Expected Result:
System logs should be generated for Linux agent, when agent is installed with --unprivileged flag.

Feature:
https://github.com/elastic/ingest-dev/issues/1766

@amolnater-qasource added the bug, Team:Elastic-Agent-Control-Plane, and impact:high labels Jan 22, 2024
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@amolnater-qasource
Author

@manishgupta-qasource Please review.

@manishgupta-qasource

Secondary review for this ticket is Done

@pierrehilbert
Contributor

@blakerouse could you please have a look here?
I can see some errors:
{"log.level":"debug","@timestamp":"2024-01-22T08:40:56.617Z","message":"Error fetching PID info for 1, skipping: FillPidMetrics: error fetching IO metrics for pid 1: error fetching IO metrics: open /proc/1/io: permission denied","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log":{"source":"system/metrics-default"},"log.origin":{"file.line":173,"file.name":"process/process.go","function":"github.com/elastic/elastic-agent-system-metrics/metric/system/process.(*Stats).pidIter"},"service.name":"metricbeat","ecs.version":"1.6.0","log.logger":"processes","ecs.version":"1.6.0"}
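
One way to see which paths trigger errors like this is to tally the "permission denied" entries in the agent's log files. A minimal sketch, assuming the logs are newline-delimited JSON like the line above; the function name count_denied_paths and the example log path are hypothetical:

```shell
#!/usr/bin/env bash
# Tally "open <path>: permission denied" errors found in a log file.
# Assumption: one JSON log entry per line, as in the excerpt above.
count_denied_paths() {
  local logfile="$1"
  # -o prints only the matching fragment; uniq -c counts repeats.
  grep -o 'open [^:"]*: permission denied' "$logfile" | sort | uniq -c | sort -rn
}

# Usage (hypothetical path):
#   count_denied_paths ./logs/elastic-agent.ndjson
```

The output lists each denied path once, prefixed by how many times it appeared, with the most frequent first.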

@cmacknz
Member

cmacknz commented Jan 22, 2024

This is not a bug; it is expected. We are likely also missing several metrics: that error looks like we can't read /proc/1, which would be the process information for PID 1, the init system. The system integration is probably going to have the most problems operating as non-root.

Copying a previous comment on this:

In the case of the system integration, on Linux we include reading the contents of /var/log/auth.log and /var/log/syslog by default but non-root users cannot read these on recent Ubuntu versions. The auth and system log datastreams are going to be empty.

testuser@valuable-gudgeon:/home/ubuntu$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.3 LTS
Release:	22.04
Codename:	jammy
testuser@valuable-gudgeon:/home/ubuntu$ ls -la /var/log/
total 1452
drwxrwxr-x   8 root      syslog             4096 Jan 11 14:34 .
drwxr-xr-x  13 root      root               4096 Dec 10 21:18 ..
-rw-r--r--   1 root      root                452 Dec 10 21:19 alternatives.log
drwxr-xr-x   2 root      root               4096 Dec 10 21:19 apt
-rw-r-----   1 syslog    adm               11988 Jan 11 14:41 auth.log
-rw-rw----   1 root      utmp                  0 Dec 10 21:17 btmp
-rw-r-----   1 root      adm                7516 Jan 11 14:34 cloud-init-output.log
-rw-r-----   1 syslog    adm              204731 Jan 11 14:34 cloud-init.log
drwxr-xr-x   2 root      root               4096 Aug  2 11:53 dist-upgrade
-rw-r-----   1 root      adm               35909 Jan 11 14:34 dmesg
-rw-r-----   1 root      adm               37307 Jan  9 15:27 dmesg.0
-rw-r--r--   1 root      root               7904 Dec 10 21:19 dpkg.log
drwxr-sr-x+  3 root      systemd-journal    4096 Jan  9 15:26 journal
-rw-r-----   1 syslog    adm               92532 Jan 11 14:39 kern.log
drwxr-xr-x   2 landscape landscape          4096 Jan  9 15:26 landscape
-rw-rw-r--   1 root      utmp             296592 Jan 11 14:40 lastlog
drwx------   2 root      root               4096 Jan  9 15:26 private
-rw-r-----   1 syslog    adm             1030960 Jan 11 14:47 syslog
drwxr-x---   2 root      adm                4096 Jan  9 15:26 unattended-upgrades
-rw-rw-r--   1 root      utmp               6400 Jan 11 14:34 wtmp
testuser@valuable-gudgeon:/home/ubuntu$ cat /var/log/syslog
cat: /var/log/syslog: Permission denied
testuser@valuable-gudgeon:/home/ubuntu$ cat /var/log/auth.log
cat: /var/log/auth.log: Permission denied

In this case the workaround would be to add the elastic-agent user to the adm group, as that is the group with access to read those log files. Something similar may work for /proc.

We need to decide if this needs to be an action taken by the agent or if it has to be a manual step from the user.
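
To confirm this on a given host, one can check, as the unprivileged user, whether each default log path is readable and which group owns it. A minimal sketch, assuming GNU stat (as on Ubuntu); the function name report_log_access is hypothetical, and the commented usermod line is the workaround described above, using the elastic-agent user name from this comment:

```shell
#!/usr/bin/env bash
# Report, for each path, whether the current user can read it and which
# group owns it. Assumes GNU coreutils `stat` (-c format option).
report_log_access() {
  local path
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "$path: not found"
    elif [ -r "$path" ]; then
      echo "$path: readable (group $(stat -c '%G' "$path"))"
    else
      echo "$path: not readable (group $(stat -c '%G' "$path"))"
    fi
  done
}

# Usage, run as the user the agent runs as:
#   report_log_access /var/log/syslog /var/log/auth.log
#
# Workaround from the comment above (a manual admin step; the agent
# service likely needs a restart to pick up the new group):
#   sudo usermod -aG adm elastic-agent
```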

@pierrehilbert
Contributor

This is what we mentioned in the past: testing will allow us to highlight what is not working as expected, and we will then have to decide which solution to pick for each failure:

  • Fix the issue (by giving more permissions to the user, changing the behavior, etc.)
  • Degrade the integration (for example only getting processes you are allowed to see and document how to get more)
  • Make it clear that this integration is not supported when Agent is run unprivileged

@cmacknz
Member

cmacknz commented Jan 22, 2024

We need to go through the system integration datastream by datastream on each OS and compare what it produces with --unprivileged to what it produces without it. There are more issues than just this one lurking here.
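
Such a comparison could start from two lists of data stream names, one collected from a privileged agent and one from an unprivileged agent. A minimal sketch, assuming each list is a plain-text file with one name per line; the function name and file names are hypothetical, and how the lists are exported is left open:

```shell
#!/usr/bin/env bash
# Print data streams present in the privileged list but absent from the
# unprivileged one. Both inputs: plain text, one data stream name per line.
missing_when_unprivileged() {
  local privileged="$1" unprivileged="$2"
  # -F fixed strings, -x whole-line match, -v invert, -f patterns from file:
  # keep lines of the privileged list that match no line of the other list.
  grep -Fxv -f "$unprivileged" "$privileged" | sort -u
}

# Usage (hypothetical file names):
#   missing_when_unprivileged privileged.txt unprivileged.txt
```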

@blakerouse
Contributor

@cmacknz is correct, this is expected. I do wonder whether metrics are collected for the PIDs that it can read; I would hope that metricbeat just continues on and reads the metrics for the PIDs it can.
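
The hoped-for behavior, skipping PIDs that cannot be read and continuing with the rest, can be illustrated from the shell. This is only a sketch of the idea, not Metricbeat's implementation; the function name count_io_access and its optional root-directory argument are hypothetical:

```shell
#!/usr/bin/env bash
# Walk the numeric entries under a proc-style directory and count how many
# per-process `io` files pass the current user's read-permission check,
# skipping the rest instead of stopping at the first failure.
count_io_access() {
  local root="${1:-/proc}" readable=0 skipped=0 dir
  for dir in "$root"/[0-9]*; do
    [ -d "$dir" ] || continue
    if [ -r "$dir/io" ]; then
      readable=$((readable + 1))
    else
      skipped=$((skipped + 1))   # e.g. /proc/1/io as a non-root user
    fi
  done
  echo "readable=$readable skipped=$skipped"
}

# Usage, run as the unprivileged user:
#   count_io_access
```

A non-zero readable count alongside a non-zero skipped count would suggest that partial collection is at least possible for that user.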

@nimarezainia
Contributor

I don't think agent should do anything to change group ownership. This is an admin decision. I see it as our job to educate them to do this, if this datastream is needed by them.

In order to do that what we need is:

  • Highlight the issue via degradation of the integration
  • Enhance the documentation to help the user navigate what they need to do (with a tool-tip or pointer from Fleet UI)

If you agree, can we use this issue for the first point?

@nimarezainia
Contributor

@amolnater-qasource thanks for this. Is this the only issue you see with the System Integration? Could we get the full list of issues for tracking purposes? Thanks.

@amolnater-qasource
Author

amolnater-qasource commented Jan 23, 2024

Hi @nimarezainia

The test plan is in progress and we haven't encountered issues for any other integrations as of now.

  • System Metrics ✅
  • Linux Integration ✅
  • Custom Logs ✅
  • Nginx ✅
  • Redis ✅
  • MySQL ✅
  • Kafka ✅
  • RabbitMQ ✅
  • Osquery Manager ✅
  • Elastic Defend ❌

Further, Elastic Defend is not supported as of now so the related feature testcases are also BLOCKED.

Screenshot:
image

We are executing the test plan at this link: Fleet Feature Regression Test plan

Please let us know if we are missing anything here.
Thanks!

@jlind23
Contributor

jlind23 commented Jan 23, 2024

Highlight the issue via degradation of the integration

@nimarezainia shall we just highlight that it is not going to fully work, or shall we detail exactly which data will not be collected?

@jlind23
Contributor

jlind23 commented Jun 5, 2024

Closing this as covered by elastic/beats#39733

@jlind23 closed this as not planned Jun 5, 2024
@amolnater-qasource added the QA:Ready For Testing label Jun 6, 2024
8 participants