Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition when deleting schedules #15259

Merged
merged 2 commits into from
Jun 10, 2024

Conversation

fosterseth
Copy link
Member

@fosterseth fosterseth commented Jun 6, 2024

SUMMARY

issue #9548

If more than one schedule for a unified job template is removed at once, a race condition can arise.

example scenario: delete schedules with ids 7 and 8

  • unified job template next_schedule is currently 7
  • on delete of schedule 7, update_computed_fields will try to set next_schedule to 8
  • but while this logic is occurring, another transaction is deleting 8

This leads to a db IntegrityError

The solution here is to call select_for_update() on the next schedule, so that 8 cannot be deleted until the transaction for deleting 7 is completed.

ISSUE TYPE
  • Bug, Docs Fix or other nominal change
COMPONENT NAME
  • API

If more than one schedule for a unified job template
is removed at once, a race condition can arise.

example scenario:  delete schedules with ids 7 and 8
- unified job template next_schedule is currently 7
- on delete of schedule 7, update_computed_fields will try to set
next_schedule to 8
- but while this logic is occurring, another transaction
is deleting 8

This leads to a db IntegrityError

The solution here is to call select_for_update() on the
next schedule, so that 8 cannot be deleted until
the transaction for deleting 7 is completed.

Signed-off-by: Seth Foster <[email protected]>
Copy link
Member

@AlanCoding AlanCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two issues:

  1. Don't select_for_update until after next_schedule is set. Because update_computed_fields is expected to be called frequently without regard for the performance issues this will cause, which would be intense with your changes as-is.
  2. If you do this in auto-commit mode it throws error

TransactionManagementError: select_for_update cannot be used outside of a transaction.

What you wrote will work for requests but is highly likely to break random code that runs as tasks / commands. I don't know how you will solve this, but you'll figure it out. Just paste some manual verification because it can't be verified in our (transactional) tests.

@fosterseth
Copy link
Member Author

fosterseth commented Jun 7, 2024

@AlanCoding I think I addressed your concerns, let me know

select_for_update works on object managers, not objects, so we have to re-query here
self.next_schedule = related_schedules.select_for_update().first()

This is after the logic of checking whether we need to save next_schedule, so should be infrequently done anyways.

in autocommit mode, works just fine (for this test, I added a log statement to highlight that I'm in autocommit mode)

In [3]: src.update_computed_fields()
2024-06-07 16:11:43,678 WARNING  [-] awx.main.models.unified_jobs autocommit
2024-06-07 16:11:43,715 DEBUG    [-] awx.main.models.inventory Going to update inventory computed fields, pk=2
2024-06-07 16:11:43,723 DEBUG    [-] awx.main.models.inventory Finished updating inventory computed fields, pk=2, in 0.008 seconds

@fosterseth fosterseth requested a review from AlanCoding June 7, 2024 16:20
@AlanCoding
Copy link
Member

select_for_update works on object managers, not objects, so we have to re-query here

Yeah, expected.

Copy link
Member

@AlanCoding AlanCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't verified as fix for the issue, but I expect it to work.

@fosterseth fosterseth merged commit d65ea2a into ansible:devel Jun 10, 2024
21 checks passed
djyasin pushed a commit to djyasin/awx that referenced this pull request Sep 16, 2024
If more than one schedule for a unified job template
is removed at once, a race condition can arise.

example scenario:  delete schedules with ids 7 and 8
- unified job template next_schedule is currently 7
- on delete of schedule 7, update_computed_fields will try to set
next_schedule to 8
- but while this logic is occurring, another transaction
is deleting 8

This leads to a db IntegrityError

The solution here is to call select_for_update() on the
next schedule, so that 8 cannot be deleted until
the transaction for deleting 7 is completed.

Signed-off-by: Seth Foster <[email protected]>
djyasin pushed a commit to djyasin/awx that referenced this pull request Nov 11, 2024
If more than one schedule for a unified job template
is removed at once, a race condition can arise.

example scenario:  delete schedules with ids 7 and 8
- unified job template next_schedule is currently 7
- on delete of schedule 7, update_computed_fields will try to set
next_schedule to 8
- but while this logic is occurring, another transaction
is deleting 8

This leads to a db IntegrityError

The solution here is to call select_for_update() on the
next schedule, so that 8 cannot be deleted until
the transaction for deleting 7 is completed.

Signed-off-by: Seth Foster <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants