The initial delay is 1s, ramping up to 10s. A user has reported that delayed enrollment can DDoS their network infrastructure when many agent VM images start before Fleet Server is available and ready to accept connections.
We should make the following changes:
A random delay must be added before the first connection attempt so that the agents do not all make their initial request concurrently. The Fleet Gateway uses 500ms for this jitter duration.
	Duration: 1 * time.Second,        // time between successful calls
	Jitter:   500 * time.Millisecond, // used as a jitter for duration
	Backoff: backoffSettings{ // time after a failed call
		Init: 60 * time.Second,
		Max:  10 * time.Minute,
	},
}
The maximum backoff duration when using delayed enrollment should be increased. The Fleet Gateway uses 10 minutes as the maximum period for checkin requests.
Possibly, the delayed enrollment and fleet gateway checkins should use the same backoff algorithm since they are both critical operations that reach out to Fleet Server indefinitely.
#4727 made us retry indefinitely when delayed enrollment is used. This works, and does include an exponential backoff, but with a very short duration.
The backoff for delayed enrollment is implemented in
elastic-agent/internal/pkg/agent/cmd/enroll_cmd.go
Line 277 in 345f2ae
elastic-agent/internal/pkg/agent/cmd/enroll_cmd.go
Line 532 in 345f2ae
elastic-agent/internal/pkg/agent/cmd/enroll_cmd.go
Lines 57 to 58 in be179d8
The Fleet Gateway defaults quoted above are defined in elastic-agent/internal/pkg/agent/application/gateway/fleet/fleet_gateway.go, lines 39 to 46 in 345f2ae.