
Extra Elastic Cloud agent policy agents after upgrading deployment in the cloud #3408

Closed
rdner opened this issue Mar 27, 2024 · 3 comments
Labels: bug (Something isn't working)

rdner (Member) commented Mar 27, 2024

Version: 7.17.19->8.13.0, 8.12.2->8.13.0

Steps to Reproduce:

  1. Create a deployment in the cloud
  2. Make sure you see an agent running the Elastic Cloud agent policy for the current version
  3. Upgrade the whole deployment to a newer version (I tried 7.17.19->8.13.0, 8.12.2->8.13.0)
  4. Go back to Management->Fleet: you'll now see 2 agents from the Elastic Cloud agent policy, one on the new version and one on the old version. The old agent goes offline within a few minutes.
[Screenshots: Fleet agents list showing two Elastic Cloud agent policy agents after the upgrade]

Expected behavior

Elastic Cloud agent policy agents are managed, so users cannot manually delete them. It seems we would accumulate an extra agent on the list with each deployment upgrade, and users would have no way to clean them up.

I'd expect that these agents are cleaned up automatically during the upgrade process.

rdner added the bug (Something isn't working) label Mar 27, 2024
jlind23 (Contributor) commented Mar 27, 2024

@rdner I might be wrong, but managed agents are Docker containers, so upgrading the cluster means deploying a new Docker image rather than upgrading the current one. That is most probably why you see two agents. After some time the offline agent will become inactive anyway and won't show up in the UI.
cc @kpollich

rdner (Member, Author) commented Mar 27, 2024

After some time this offline agent will become inactive anyway and won't show up in the UI.

@jlind23 even if that's true, I don't think it's good UX. There is no way for the user to clean these agents up, and they will keep piling up with each deployment upgrade.

kpollich (Member) commented Mar 27, 2024
Agreed that this is bad UX. For ephemeral agents specifically (e.g. Docker containers) where the ID changes, the old "unenrollment timeout" model worked better than the new "inactivity timeout" model.
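The difference between the two models can be sketched with a small simulation. This is a hypothetical illustration, not Fleet's actual implementation; the `Agent`, `visible_agents`, and `unenroll_expired` names are invented for the example. Each upgrade replaces the container, so a fresh agent ID enrolls and the old one goes silent: under an inactivity timeout the stale agents are merely hidden from the UI and keep accumulating, while under an unenroll timeout they would be removed outright.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    agent_id: str
    last_checkin: float  # seconds since an arbitrary epoch

def visible_agents(agents, now, inactivity_timeout):
    # Inactivity-timeout model (illustrative): agents past the timeout
    # are hidden from the UI but remain enrolled indefinitely.
    return [a for a in agents if now - a.last_checkin < inactivity_timeout]

def unenroll_expired(agents, now, unenroll_timeout):
    # Unenroll-timeout model (illustrative): agents past the timeout
    # are removed from the enrolled list outright.
    return [a for a in agents if now - a.last_checkin < unenroll_timeout]

# Three deployment upgrades, one hour apart: each enrolls a fresh agent ID
# and the previous container's agent stops checking in.
agents = [Agent(f"agent-{i}", last_checkin=i * 3600.0) for i in range(3)]
now = 2 * 3600.0  # agent-2 just enrolled and is still checking in

print(len(visible_agents(agents, now, 1800.0)))    # 1 agent visible...
print(len(agents))                                 # ...but 3 still enrolled
print(len(unenroll_expired(agents, now, 1800.0)))  # unenroll model keeps 1
```

The simulation shows why the enrolled list grows by one per upgrade under the inactivity model even though only the live agent is ever shown.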

@nimarezainia created elastic/kibana#179399 recently, which captures the need to support forced unenrollment of inactive agents for specific policies. We could make this unenrollment behavior very aggressive for the managed cloud policy if we pursued it. I'm going to close this in favor of the other issue.

kpollich closed this as not planned Mar 27, 2024