
[KEDA][AzureEventHub] App not scaling to zero #1225

Closed
1 of 3 tasks
goncalo-oliveira opened this issue Jul 10, 2024 · 6 comments
Labels: Backlog (Issue has been validated and logged in our backlog for future work)

Comments

goncalo-oliveira commented Jul 10, 2024

This issue is a:

  • [x] bug report -> please search issues before submitting
  • [ ] documentation issue or request
  • [ ] regression (a behavior that used to work and stopped in a new release)

Issue description

I have set up a container app with 0 to 4 replicas, using an azure-eventhub scale rule with the following settings:

- type: azure-eventhub
  metadata:
    activationUnprocessedEventThreshold: 10
    checkpointStrategy: blobMetadata
    blobContainer: ...
    connectionFromEnv: ...
    consumerGroup: ...
    eventHubNameFromEnv: ...
    storageConnectionFromEnv: ...
    unprocessedEventThreshold: 64
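
For reference, a rule like this sits inside the container app's scale block roughly as follows. This is only a sketch: the rule name, env-var names, blob container, and consumer group below are placeholders, not my actual (redacted) values.

    properties:
      template:
        scale:
          minReplicas: 0
          maxReplicas: 4
          rules:
            - name: eventhub-rule                # placeholder name
              custom:
                type: azure-eventhub
                metadata:
                  activationUnprocessedEventThreshold: "10"
                  checkpointStrategy: blobMetadata
                  blobContainer: checkpoints                    # placeholder
                  connectionFromEnv: EVENTHUB_CONNECTION        # placeholder env var
                  consumerGroup: $Default                       # placeholder
                  eventHubNameFromEnv: EVENTHUB_NAME            # placeholder env var
                  storageConnectionFromEnv: STORAGE_CONNECTION  # placeholder env var
                  unprocessedEventThreshold: "64"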

I did have an earlier issue, which I commented on in #972, where the app didn't scale down at all, but that was down to the missing checkpointStrategy, which has since been set. It now scales up and down to 1 replica, depending on the load. However, it does not scale down to zero when there are no events to process.

I thought it might be intended behaviour, but from the documentation it seems that the app should scale to zero and activate at a threshold:

activationUnprocessedEventThreshold - Target value for activating the scaler. Learn more about activation here.
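
As I understand the documented KEDA semantics (this is my reading, not output from my app), the activation threshold gates the 0 <-> 1 transition, while the regular threshold drives scaling above 1. A quick worked example with the values above:

    unprocessed events = 150
    activation: 150 > 10                  -> scaler active, app scales 0 -> 1
    desired replicas = ceil(150 / 64) = 3 -> within maxReplicas: 4

    unprocessed events = 0
    activation: 0 <= 10                   -> scaler inactive, app may scale 1 -> 0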

Steps to reproduce

  1. Set up a container app with minReplicas: 0 and a positive maxReplicas value (a CLI sketch follows this list)
  2. Set up the scale rule described above
  3. Leave the app for a period with no event intake
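
A minimal way to set the replica bounds from step 1 via the CLI; the app and resource group names are placeholders:

    az containerapp update \
      --name my-consumer-app \
      --resource-group my-rg \
      --min-replicas 0 \
      --max-replicas 4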

Expected behavior

Without any events being sent to the Event Hub, the expectation was for the app to scale to zero. Then, when the number of unprocessed events hits activationUnprocessedEventThreshold, the app should scale back to 1.

Actual behavior

Without any events being sent to the Event Hub, the number of replicas is kept at 1.

Screenshots

(two screenshot images omitted)
microsoft-github-policy-service bot added the Needs: triage 🔍 (Pending a first pass to read, tag, and assign) label Jul 10, 2024
simonjj (Collaborator) commented Jul 10, 2024

@goncalo-oliveira thank you for reaching out. Does this application have external ingress enabled, and is it also serving HTTP requests?

simonjj removed the Needs: triage 🔍 (Pending a first pass to read, tag, and assign) label Jul 10, 2024
goncalo-oliveira (Author) commented Jul 10, 2024

Hi @simonjj, thanks for replying. Yes, as a matter of fact, the app does have ingress enabled so that Prometheus metrics can be scraped by another app, although the ingress traffic is limited to the container app environment. Right... as I'm writing this... the scraping is actually keeping the app alive...
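
In case it helps anyone else: an ingress-enabled app is also subject to HTTP-based scaling, so periodic requests such as metrics scrapes can hold it at one replica even when the event hub is empty. An explicit HTTP rule looks roughly like this (sketch only; the name and threshold are illustrative):

    rules:
      - name: http-rule              # illustrative name
        http:
          metadata:
            concurrentRequests: "10"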

goncalo-oliveira (Author) commented

Just to confirm: I've removed the ingress from the app and it did scale down to zero after some time, as expected. However, the app did not scale back up when new events were added, which was unexpected. Reactivating the ingress woke the app again and it scaled up to 2 replicas, possibly because of the number of events waiting; after some time, once the number of events normalized, it scaled down to 1 (all as expected).

Am I missing something here? I do have activationUnprocessedEventThreshold: 10 set in the configuration, so the expected outcome was for the app to activate itself when events were waiting.

It's not the end of the world, since for my particular use case in production I never expect the number of events to be zero, but it's still unexpected behaviour.

anthonychu (Member) commented

@goncalo-oliveira Can you please check your system logs to see if there are any scaling/KEDA-related error messages? It's possible that the Event Hubs scale rule isn't correctly configured and the only thing scaling the app is the HTTP rule.
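
One quick way to pull those is via the CLI; the app and resource group names here are placeholders:

    az containerapp logs show \
      --name my-consumer-app \
      --resource-group my-rg \
      --type system \
      --follow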

goncalo-oliveira (Author) commented

I guess I'm proving myself wrong and things are working as expected, which is a good thing. I can't say why it didn't work at first; maybe I did have something not properly configured. Here's what the system logs show:

{"Msg":"Deactivated apps/v1.Deployment k8se-apps/app-***** from 1 to 0","Reason":"KEDAScaleTargetDeactivated","EventSource":"KEDA","Count":273}
{"Msg":"Container \u0027consumer\u0027 was terminated with exit code \u0027\u0027 and reason \u0027ManuallyStopped\u0027","Reason":"ContainerTerminated","EventSource":"ContainerAppController","Count":1}
{"Msg":"Scaled apps/v1.Deployment k8se-apps/app-***** from 0 to 1, triggered by throughput","Reason":"KEDAScaleTargetActivated","EventSource":"KEDA","Count":1}
{"Msg":"Replica \u0027app-*****-76c4877f9f-cf7sc\u0027 has been scheduled to run on a node.","Reason":"AssigningReplica","EventSource":"ContainerAppController","Count":0}
{"Msg":"Pulling image \u0027********\u0027","Reason":"PullingImage","EventSource":"ContainerAppController","Count":1}
{"Msg":"Successfully pulled image \u0027********\\u0027 in 1.9588958s","Reason":"PulledImage","EventSource":"ContainerAppController","Count":1}
{"Msg":"Created container \u0027consumer\u0027","Reason":"ContainerCreated","EventSource":"ContainerAppController","Count":1}
{"Msg":"Started container \u0027consumer\u0027","Reason":"ContainerStarted","EventSource":"ContainerAppController","Count":1}

Thank you both, @simonjj and @anthonychu.

simonjj added the Backlog (Issue has been validated and logged in our backlog for future work) label and removed the Needs: Attention 👋 label Jul 17, 2024
simonjj (Collaborator) commented Jul 17, 2024

Thanks for clarifying and getting back to us. We will add an item to improve the KEDA error messaging.

simonjj closed this as completed Jul 17, 2024