Return 503 Service Unavailable when unable to authenticate with Elasticsearch instead of 401 #3929

Closed
cmacknz opened this issue Sep 20, 2024 · 1 comment · Fixed by #3935
Labels
Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@cmacknz
Member

cmacknz commented Sep 20, 2024

When we attempt to validate an agent's API key and cannot communicate with ES because Fleet Server's API key/service token is temporarily unavailable, we return a 401 to Elastic Agents. This seems correct, but if it happens for 7 check-ins in a row, it will cause the agent to automatically unenroll.

We should evaluate handling "cannot auth with ES" separately from "explicitly told the API key was invalid", and returning a 503 when Fleet Server itself cannot auth with ES, to avoid unintentional mass unenrollments.

The relevant code block is below:

// Authenticate will return the SecurityInfo associated with the APIKey (retrieved from Elasticsearch).
// Note: Prefer the bulk wrapper on this API
func (k APIKey) Authenticate(ctx context.Context, es *elasticsearch.Client) (*SecurityInfo, error) {
	token := fmt.Sprintf("%s%s", authPrefix, k.Token())
	req := esapi.SecurityAuthenticateRequest{
		Header: map[string][]string{AuthKey: []string{token}},
	}
	res, err := req.Do(ctx, es)
	if err != nil {
		return nil, fmt.Errorf("apikey auth request %s: %w", k.ID, err)
	}
	if res.Body != nil {
		defer res.Body.Close()
	}
	if res.IsError() {
		returnError := ErrUnauthorized
		if res.StatusCode == 429 {
			returnError = ErrElasticsearchAuthLimit
		}
		return nil, fmt.Errorf("%w: %w", returnError, fmt.Errorf("apikey auth response %s: %s", k.ID, res.String()))
	}

@blakerouse
Contributor

The actual lines in that block that are wrong are here:

https://github.com/elastic/fleet-server/blob/main/internal/pkg/apikey/auth.go#L54-L60
