In AWX, we receive events from ansible-runner and save them to the database. When applicable, each event is linked to its corresponding Host record. This creates a problem because a database relational field requires the primary key of the related object, and ansible-runner only provides the host name.
Current solution
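Right now we build a host_map variable locally when we write the inventory file (before we start the job). See https://github.com/ansible/awx/blob/d89cad0d9edd2baaa01f668d8ed12eca62ee1a48/awx/main/tasks/jobs.py#L318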
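As a rough illustration of the current approach (a minimal sketch, not the actual AWX code), the mapping amounts to a dictionary from host name to primary key that has to stay in memory while events are processed. The function names and the event layout (a "host" key inside event_data) are assumptions made for this example.

```python
def build_host_map(hosts):
    """Build a host name -> primary key mapping.

    `hosts` is an iterable of (primary_key, name) pairs, standing in for
    whatever the inventory query returns (assumption for this sketch).
    """
    return {name: pk for pk, name in hosts}


def resolve_host_id(event, host_map):
    """Look up the primary key for the host named in an event, if any."""
    host_name = event.get("event_data", {}).get("host")
    # Events with no host, or with an unknown host name, resolve to None.
    return host_map.get(host_name)


# The map is built once, before the job starts, and must outlive every event.
host_map = build_host_map([(1, "web1"), (2, "db1")])
event = {"event": "runner_on_ok", "event_data": {"host": "db1"}}
host_id = resolve_host_id(event, host_map)
```

With ~50,000 hosts, this dictionary holds one entry per host for the whole duration of the job.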
Why is this non-ideal? Because the host_map variable is potentially very large, as we expect inventories of ~50,000 hosts in real-world situations. More importantly, this variable must persist for as long as events are generated (until the end of the job).
Why now? Because a number of desired architectural changes require that we run the ansible-runner process step independently of the other steps (like transmit). This means that we want to avoid keeping large, long-lived variables in memory as we consume these events.
Proposed solution
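The host id is not particularly challenging to find, and we already set it automatically on every host in the inventory using the remote_host_id variable. See https://github.com/ansible/awx/blob/d89cad0d9edd2baaa01f668d8ed12eca62ee1a48/awx/main/models/inventory.py#L373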
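To make this concrete, here is a hedged sketch of an inventory in Ansible's script-style JSON layout (hosts plus _meta.hostvars) where each host carries its database id as remote_host_id. The build_inventory helper and its input shape are hypothetical and only illustrate the idea; they are not the code linked above.

```python
import json


def build_inventory(hosts):
    """Return a script-style inventory dict with the host id in hostvars.

    `hosts` is an iterable of (primary_key, name, variables) tuples
    (an assumption made for this sketch).
    """
    inventory = {"all": {"hosts": []}, "_meta": {"hostvars": {}}}
    for pk, name, variables in hosts:
        inventory["all"]["hosts"].append(name)
        hostvars = dict(variables)  # copy so the caller's dict is not mutated
        hostvars["remote_host_id"] = pk
        inventory["_meta"]["hostvars"][name] = hostvars
    return inventory


inventory = build_inventory([(1, "web1", {"ansible_host": "10.0.0.1"})])
inventory_json = json.dumps(inventory, indent=2)
```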
This proposal is to add a configuration option that copies a named host variable into the event_data. Then everywhere the callback obtains the host name, it also obtains that variable (if it exists) from the host variables. This will make it much easier to replay job events starting from some non-zero line number, which is needed to recover from restarts without losing jobs.
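The proposed behavior could look roughly like the sketch below. It is illustrative only: FakeHost is a stand-in for the Ansible host object (only get_name() and a vars dictionary are used), and the option name host_var_name and the helper host_event_data are hypothetical, not existing ansible-runner settings or functions.

```python
class FakeHost:
    """Minimal stand-in for the host object the callback receives."""

    def __init__(self, name, variables):
        self._name = name
        self.vars = variables

    def get_name(self):
        return self._name


def host_event_data(host, host_var_name=None):
    """Build the host portion of event_data.

    If `host_var_name` is configured and the host defines that variable,
    its value is copied into the event data next to the host name.
    """
    event_data = {"host": host.get_name()}
    if host_var_name is not None and host_var_name in host.vars:
        event_data[host_var_name] = host.vars[host_var_name]
    return event_data


host = FakeHost("web1", {"remote_host_id": 42})
data = host_event_data(host, host_var_name="remote_host_id")
```

Under this assumption, the consumer could read the id straight from each event, for example data.get("remote_host_id"), so no host_map needs to be kept in memory and events can be processed from any starting line.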
If you look in the callback, there are many places where we add the host name to the event data using result._host.get_name() (see ansible-runner/src/ansible_runner/display_callback/callback/awx_display.py, line 642 in 85b3403).