Workflow init decorator #645

dandavison · 2024-09-18T15:14:11Z

Closes #600
Supersedes #634

This PR makes it possible to define an __init__ method on a workflow class such that the workflow input is passed to __init__ (in exactly the same way as it is passed to the @workflow.run method, i.e. as one unnamed positional argument per workflow input payload).

Users make use of this feature by using a decorator. Here's a simple example:

@workflow.defn
class WorkflowRunSeesWorkflowInitWorkflow:
    @workflow.init
    def __init__(self, arg: str) -> None:
        self.value = arg

    @workflow.run
    async def run(self, arg: str):
        return f"hello, {self.value}"

However, that example does not show the real purpose of this feature, which is to allow update and signal handlers to access the workflow input, even if they happen to execute in the first workflow task (in which case they will start executing before the main workflow method). Previously, this would have required the handlers to use workflow.wait_condition. See the test test_update_in_first_wft_sees_workflow_init added in this PR.

This PR supersedes #634. That PR attempted to introduce this feature without introducing a decorator, i.e., by inspection of the method signature. However, we have decided that that approach could lead to surprising behavior, especially in situations involving workflow class inheritance.

cretz · 2024-09-23T14:23:24Z

temporalio/worker/_workflow_instance.py

+    ) -> None:
+        self._random.seed(job.randomness_seed)
+
+    def _make_workflow_input(


Is there value in separating out this single-use method? I don't mind if just for clarity, just confirming there is not another reason

Yes, I strongly object to code styles that fail to break out helper functions. It doesn't matter at all to me whether they are used once or multiple times at this point in the codebase. Some reasons (could write much more...):

Abstraction: if there are (say) 10 or more lines within a function that create their own independent subset of local function state with the sole aim of computing a value, then it is much easier to read and reason about if that is broken out as a helper function. There is no need for the reader to understand the computation if they don't want to; it's one of the main reasons functions exist. As an extra benefit, the helper function is an opportunity to choose a good descriptive name for the unit of functionality. It also gives a target for unit testing.

Even if it were desirable as an alternative to a helper functions (I don't think it is) Python doesn't have block-level scope; it's not possible to determine how the local state interacts with other parts of the function, without reading all the implementation detail closely and mentally tracking scope / lifetime of each variable.

Python has a strong and long tradition of being writing as a collection of small functions; I think this is due to its long history of having no static typing; even moreso without static typing, functions have to be small to give readers and writers a chance of avoiding bugs.

https://martinfowler.com/bliki/FunctionLength.html is somewhat similar to my view.

Reasons against:

Some people prefer reading large amounts of implementation detail in a single sequential pass and don't mind paying for that by tracking very complex function state in their brains, state that can trivially be decomposed into independent modules.

All good, no strong preference. For the same reason people have a hard time reading many lines in a method, they have a hard time reading many methods in a class. I just want to make sure we strike a good balance and not have a ton of small, single-use methods littering the class (as opposed to normal reusable methods).

Yeah I think it's the great often forgotten-about religious war in programming :)

For the same reason people have a hard time reading many lines in a method, they have a hard time reading many methods in a class.

I don't agree that there's an equivalence, but I agree that the scope of helper functions has to be tastefully chosen, and that the author has to be very good at choosing function names. With those two things in hand, helper functions basically only make it easier to read. The point is that you can choose how deep into implementation detail you wish to go, whereas without helper functions you cannot choose.

For myself, I struggle hopping around amongst disparately placed logic compared to scrolling (but still don't want to scroll too much of course)

Right, so the question is: why do you feel the need to hop to the implementation? A helper function has a return type, and a well-chosen name. That should be all that you need to understand the implementation at the call site: you don't need to see the implementation detail of the helper function. Is the issue that you don't trust methods to not mutate instance state? If so, that implies that you'd be happier with helper functions (with immutable parameters), just not so much with methods (in Python).

But the issue of mutating state is where helper functions bring a big advantage: if you include unrolled implementation logic at the call site then in most languages it's very hard to tell whether the implementation is mutating local state, whereas a helper function makes this clearer or even guaranteed (i.e. in Python, pass immutable arguments; in other languages pass non-writable references, etc).

In general, IMO, complex logic should be written as a sequence of function calls. For example, this should be preferred over an unrolled version, even if the functions are called once only.

artifact_globs = cli.options.artifact_globs or ["*"] remote_artifacts = _list_remote_artifacts(repos, artifact_globs) artifacts_to_download = _get_artifacts_not_in_db(db, remote_artifacts) downloaded_artifacts = _download_zip_artifacts(artifacts_to_download) rows = _parse_xml_from_zip_artifacts(downloaded_artifacts) rows = _fetch_pr_info(async_iterator_to_list(rows)) db.insert_rows(rows)

cretz · 2024-09-23T14:27:19Z

temporalio/workflow.py

+
+    Note that the ``@workflow.run`` method is always called as
+    ``my_workflow.my_run_method(**workflow_input_args)``. Therefore, if you use
+    the ``@workflow.init`` decorator, the parameter list of your  __init__ and


the parameter list of your init and @workflow.run methods will usually be identical.

I think we should require/enforce the params are identical if this decorator is present and the run method is present

OK. Yes, I was tentatively thinking of not having a contract without any static (decorator-time) constraints:

If you use the decorator, then it is your responsibility to ensure that your init signature is such that it will accept the arguments passed at workflow-run time. As long as that is true, the parameter signatures of the two methods can be whatever Python allows.

But I can't think of a clear counter-example where the stricter contract would make desirable code invalid, whereas enforcing identical params at decorator-eval time has the benefit of catching incorrect code at import time, and also that it can be relaxed in the future if we want to (whereas the looser contract cannot easily be made stricter).

I do think we want to allow names to differ, e.g. for this scenario where an underscore (perhaps required by a linter or styleguide) is used to mark an unused parameter.

@workflow.defn class MyWorkflow: @workflow.init def __init__(self, arg: str) -> None: self.value = arg @workflow.run async def run(self, _arg: str) -> str: return self.value

d8b5dff

cretz

LGTM. Would recommend updating the README (ideally we would get rid of README and only have docs, but that's a separate project I think).

dandavison added 10 commits September 17, 2024 16:00

Refactor to support workflow init

b677bda

Implement workflow __init__ support

0b33593

Handle non-slash form __init__(self, *args, **kwargs)

a68911e

Simplify and fix implementation

3573c0b

Add (fatally) failing test case

5fda934

Defer decision until workflow-start time

4f7a0e0

s/set/used/

e9e2af4

decorator

a739d50

Update test

16395b4

Tweak docstring

84958a6

dandavison requested a review from a team as a code owner September 18, 2024 15:14

dandavison mentioned this pull request Sep 18, 2024

Workflow init #634

Closed

Change update method inheritance for mypy

1f81d84

cretz reviewed Sep 23, 2024

View reviewed changes

Require identical parameters

d8b5dff

cretz approved these changes Sep 23, 2024

View reviewed changes

dandavison added 5 commits September 23, 2024 17:03

Update docstring

0557b28

Add unit test of _parameters_identical_up_to_naming

0c10888

Document @workflow.init

95f0630

Test that @workflow.init may only be used on __init__

d87d7b1

Lint

fcf2d7b

dandavison merged commit 0995ae0 into main Sep 24, 2024
12 checks passed

dandavison deleted the workflow-init-decorator branch September 24, 2024 16:46

dandavison mentioned this pull request Sep 25, 2024

Document workflow.init for Python temporalio/documentation#3108

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Workflow init decorator #645

Workflow init decorator #645

dandavison commented Sep 18, 2024 •

edited

Loading

cretz Sep 23, 2024

dandavison Sep 23, 2024 •

edited

Loading

cretz Sep 23, 2024

dandavison Sep 23, 2024 •

edited

Loading

cretz Sep 24, 2024 •

edited

Loading

dandavison Sep 24, 2024 •

edited

Loading

cretz Sep 23, 2024

dandavison Sep 23, 2024 •

edited

Loading

cretz left a comment

Workflow init decorator #645

Workflow init decorator #645

Conversation

dandavison commented Sep 18, 2024 • edited Loading

cretz Sep 23, 2024

Choose a reason for hiding this comment

dandavison Sep 23, 2024 • edited Loading

Choose a reason for hiding this comment

cretz Sep 23, 2024

Choose a reason for hiding this comment

dandavison Sep 23, 2024 • edited Loading

Choose a reason for hiding this comment

cretz Sep 24, 2024 • edited Loading

Choose a reason for hiding this comment

dandavison Sep 24, 2024 • edited Loading

Choose a reason for hiding this comment

cretz Sep 23, 2024

Choose a reason for hiding this comment

dandavison Sep 23, 2024 • edited Loading

Choose a reason for hiding this comment

cretz left a comment

Choose a reason for hiding this comment

dandavison commented Sep 18, 2024 •

edited

Loading

dandavison Sep 23, 2024 •

edited

Loading

dandavison Sep 23, 2024 •

edited

Loading

cretz Sep 24, 2024 •

edited

Loading

dandavison Sep 24, 2024 •

edited

Loading

dandavison Sep 23, 2024 •

edited

Loading