Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updates_do_not_block_continue_as_new #584

Merged
merged 6 commits into from
Jan 23, 2025
Merged

updates_do_not_block_continue_as_new #584

merged 6 commits into from
Jan 23, 2025

Conversation

dandavison
Copy link
Contributor

Adds a new feature that confirms the behavior when a client sends an update to a workflow that is continuing-as-new. Specifically, the test guarantees that the update is admitted while the WFT is in-flight. This confirms that:

  1. Updates do not block continue as new (unlike signals)
  2. An update that is admitted while the workflow is continuing as new lands on the post-CAN workflow run.

Only implemented for Python so far; let's confirm the approach before extending implementation to other languages.

@dandavison dandavison force-pushed the can-update branch 3 times, most recently from 3f05068 to 95386da Compare January 21, 2025 00:05
async def _check_result() -> None:
# See docstring at top of file.
# Cause an update to be admitted while the first WFT is in progress
first_run_wft_is_in_progress.wait()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a thread-blocking call in asyncio. Can you use https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor? Would also request on the in-workflow wait but that makes a bit less sense since blocking the workflow event loop is the goal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a thread-blocking call in asyncio.

Yes that's deliberate, could you read the docstring at the top of the file and let me know if you agree with the strategy?

Copy link
Contributor Author

@dandavison dandavison Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither run_in_executor nor wait_condition can be used here since the purpose is to block the event loop in order to hold up WFT processing.

EDIT: Ignore, on re-reading I see you were explicitly not referring to the thread-blocking in workflow code.

Copy link
Member

@cretz cretz Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
first_run_wft_is_in_progress.wait()
asyncio.get_running_loop().run_in_executor(None, first_run_wft_is_in_progress.wait)

^ That is the safe way to make a thread-blocking call like this. You don't want to block the primary event loop (blocking the workflow's event loop is different). Blocking the primary event loop prevents all other asyncio systems in the process from running.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the code is doing this correctly already. This entire coroutine is run in a new event loop in a new thread. See line 93 below:

await asyncio.to_thread(lambda: asyncio.run(_check_result()))

Copy link
Member

@cretz cretz Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, then I think this can bring up a different issue: our client is not safe for running in separate event loops (because it is not safe for running in separate threads)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Modified to just do the thread-blocking synchronization itself in a new thread, using asyncio.to_thread:

await asyncio.to_thread(first_run_wft_is_in_progress.wait)

update_has_been_admitted.set()
# The workflow will now CAN. Wait for the update result
update_run_id = await update_task
# The update should have been handled on the post-CAN run.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be worth two extra assertions: 1) that the update event is not in the history of the pre-CAN run, and 2) that the update is in the history of the post-CAN run

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, added these.

@dandavison dandavison force-pushed the can-update branch 2 times, most recently from b327842 to e00988f Compare January 23, 2025 13:42
@dandavison dandavison requested a review from cretz January 23, 2025 13:46
Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do fear with single-language features there is no incentive or even reminder to do these in other languages. I wonder if we should track the work anywhere.

@dandavison
Copy link
Contributor Author

I do fear with single-language features there is no incentive or even reminder to do these in other languages.

It's not clear that this should be a feature test at all, so didn't seem any point in doing it in multiple langauges. If the retries are done by Frontend / History gRPC interceptors / clients as they are supposed to be then, while it's fine to add it as an integration test in SDKs, it's unclear that there's a need for a feature test across SDKs.

I wonder if we should track the work anywhere.

One possibility would be to set up a nag of some sort that finds feature tests with less than the full complement of SDKs.

@dandavison dandavison merged commit 45af634 into main Jan 23, 2025
21 checks passed
@dandavison dandavison deleted the can-update branch January 23, 2025 22:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants