AWX live job output stops updating and/or gets disconnected #15342

Open
5 of 11 tasks
parkerfath opened this issue Jul 8, 2024 · 1 comment

Comments

@parkerfath

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that AWX is open source software provided for free and that I might not receive a timely response.
  • I am NOT reporting a (potential) security vulnerability. (These should be emailed to [email protected] instead.)

Bug Summary

In longer job runs (roughly 10-20 minutes or more, against about 200 hosts), the log output in the “Output” tab of a running job sometimes stops updating. The pod logs for the automation-job pod show that the job is still running and logging, but the AWX UI does not display the new log lines. This makes it seem like the job is stuck.

Often the missing log data loads into the output pane once the job completes, but while the job is running the output stays stuck.

I first saw this in AWX 22.3.0 running on Google Kubernetes Engine (GKE). I upgraded to 24.6.1 this week and still see the issue.

Note: this doesn't happen every time; my rough estimate is 20-30% of these longer-running jobs.

See also https://forum.ansible.com/t/awx-live-job-output-stops-updating-gets-disconnected/2936/1

AWX version

24.6.1

Select the relevant components

  • UI
  • UI (tech preview)
  • API
  • Docs
  • Collection
  • CLI
  • Other

Installation method

kubernetes

Modifications

no

Ansible version

No response

Operating system

No response

Web browser

Chrome

Steps to reproduce

  1. Create an inventory with many hosts and/or a job template that runs for more than 10 minutes (I'm not sure which of these conditions is necessary).
  2. Launch the job, open its output, and click "Follow".
  3. Wait for output.
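For step 1, a large inventory can be generated rather than typed by hand. The following is only a sketch: the host names (`fake-host-NNN`), the group name `repro`, and the function `build_inventory` are made up for illustration, and `ansible_connection=local` is assumed so that no real machines are needed. Whether the problem reproduces with local-connection hosts is not confirmed by this report.

```python
def build_inventory(n_hosts: int = 200, group: str = "repro") -> str:
    """Return an INI-format Ansible inventory with n_hosts placeholder hosts.

    Host and group names are hypothetical. Each host uses
    ansible_connection=local, so tasks run on the execution node itself
    instead of connecting to a real machine over SSH.
    """
    lines = [f"[{group}]"]
    for i in range(1, n_hosts + 1):
        lines.append(f"fake-host-{i:03d} ansible_connection=local")
    return "\n".join(lines) + "\n"
```

Writing the result of `build_inventory()` to a file gives an inventory of about 200 hosts, matching the scale described above. A job template with a long pause or loop per host would then be needed to reach the 10+ minute runtime.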

Expected results

Output updates in real time as the job proceeds.

Actual results

The UI stops updating, but the pod logs for automation-job show that the job is still running. The UI seems to eventually "reconnect" and resume updating, but there are gaps of 10+ minutes with no updates, during which I need an external tool to check the Kubernetes pod logs. Refreshing the page with the browser's refresh button does not help; the output stays stuck at exactly the same spot.
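One way to narrow this down without watching pod logs is to poll the AWX REST API and record how many job events the backend has stored while the UI appears frozen. The sketch below assumes an OAuth2 token sent as a `Bearer` header and relies on the `count` field of the paginated `/api/v2/jobs/<id>/job_events/` response. The function names, the 30-second interval, and the stall threshold are illustrative choices, not anything taken from AWX itself.

```python
import json
import time
import urllib.request
from typing import List, Tuple


def fetch_event_count(base_url: str, token: str, job_id: int) -> int:
    """Return the number of job events the API currently reports for a job."""
    url = f"{base_url.rstrip('/')}/api/v2/jobs/{job_id}/job_events/?page_size=1"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return int(json.load(resp)["count"])


def find_stalls(samples: List[Tuple[float, int]], min_gap: float) -> List[Tuple[float, float]]:
    """Return (start, end) intervals in which the count did not grow for min_gap seconds.

    samples is a time-ordered list of (timestamp, event_count) pairs.
    min_gap should be larger than the polling interval.
    """
    stalls: List[Tuple[float, float]] = []
    if not samples:
        return stalls
    last_change_t, last_count = samples[0]
    for t, count in samples[1:]:
        if count > last_count:
            if t - last_change_t >= min_gap:
                stalls.append((last_change_t, t))
            last_change_t, last_count = t, count
    # A stall that is still ongoing at the last sample is reported as well.
    final_t = samples[-1][0]
    if final_t - last_change_t >= min_gap:
        stalls.append((last_change_t, final_t))
    return stalls


def poll(base_url: str, token: str, job_id: int, rounds: int, interval: float = 30.0):
    """Collect (timestamp, event_count) samples; stops after `rounds` polls."""
    samples = []
    for _ in range(rounds):
        samples.append((time.time(), fetch_event_count(base_url, token, job_id)))
        time.sleep(interval)
    return samples
```

For example, `find_stalls(poll(url, token, job_id, rounds=40), min_gap=300)` would list any periods of five minutes or more in which the API reported no new events. If the count keeps growing while the Output tab is frozen, that may point at the path that pushes events to the browser; if the count itself stalls while the pod is still logging, the events are probably not reaching the database in time. Either result is only a hint, not a diagnosis.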

Additional information

No response

@thedoubl3j
Member

thedoubl3j commented Jul 10, 2024

@parkerfath We are aware of the Django Channels issue with bringing log data back to the UI. Thanks for reporting; we will keep this open for aggregation and will ping you to test a new feature or fix once we have one.
