-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: workflow builtin actions honoring on_success handlers #3070
feat: workflow builtin actions honoring on_success handlers #3070
Conversation
Codecov ReportAttention: Patch coverage is Additional details and impacted files📢 Thoughts on this report? Let us know! |
Robot Results
|
3b151c3
to
d3ee4ca
Compare
4ec89b9
to
e7591bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have checked and runned flake-finder, looks ok
/// Trigger a built-in operation | ||
/// | ||
/// ```toml | ||
/// action = "<builtin-operation-name>" | ||
/// on_exec = "<state>" | ||
/// ``` | ||
BuiltInAction(OperationName, ExecHandlers), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Afterthought, I see that as over engineering.
The motivation was to be explicit:
- With
action = "builtin"
both the operation and the step are given by the context.- the operation is the operation of the workflow itself
- the step is derived from the current state of the command (i.e scheduled or executing)
- With
action = "<builtin-operation-name>"
and `action = "await " both the operation and the step are explicitaction = "<builtin-operation-name>"
triggers the scheduled step of the given operation- `action = "await " triggers the execution step
But this raises other issues:
- It might then be confusing to be able to invoke an operation either directly (with
action = software_update
) or using a sub-workflow (withoperation = software_update
). These are meaningful and even useful differences, but this comes with complications notably to guide users. - What if the user try to invoke the builtin behavior of operation A in a workflow for B? This could work but would requires extra work to handle the command payload, making even greater the confusion between builtin actions and sub operations.
- => In practice, having the operation implied by the workflow is a good thing, simple and effective.
Hence, I'm considering to revert this change. i.e. to only keep the action = "builtin"
case.
Another appealing alternative is the following:
- Keep the
action = "builtin"
as of now (honoring the user-provided handlers) - Introduce two slight variants, to make the action clearer as it's not obvious of what is done by the
builtin
action depends on the state name (scheduled
orexecuting
). - I propose:
action = "execute"
for thescheduled
step (which can then be named differently). - and
action = "await"
for theexecuting
step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see one additional problem with the approach tying builtin
action to their respective states, that might make some future enhancements harder. It's problematic for certain operations like config_update
, where the executing
state does multiple things as follows
- Send the
executing
status update - Download the config file
- Apply the updated config file
Now, with the proposed change, we can add additional logic before or after these 3 steps, but nothing in between. For e.g: some customer might want to validate the downloaded artefact before it is applied. So, I'm actually in favour of providing multiple smaller granular actions
like download
, file-copy
, etc which can be tied to any state as the user wants. Having some re-usable granular built-in actions would enable those to be used from any state in any workflow as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here a new proposal #3014 (comment).
I will address it in a different PR so we will be in position to compare and choose.
17a57ff
to
de47052
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit out of my depth on the design front (I should probably keep track of custom workflow developments more closely), but the Rust side overall looks alright, I just have one question regarding the interaction between await-...
and executing status.
/// Rewrite a command state before pushing it to a builtin operation actor | ||
/// | ||
/// Depending the action is to trigger or await the operation, | ||
/// set the status to schedule or executing. | ||
/// | ||
/// Return the command state unchanged if there is no appropriate substitute. | ||
pub fn adapt_builtin_request(&self, command_state: GenericCommandState) -> GenericCommandState { | ||
match self { | ||
OperationAction::BuiltInAction(_, _) => { | ||
command_state.update(GenericStateUpdate::scheduled()) | ||
} | ||
OperationAction::AwaitBuiltInAction(_, _) => { | ||
command_state.update(GenericStateUpdate::executing()) | ||
} | ||
_ => command_state, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: it's nonobvious how setting a scheduled state leads to an operation being spawned, execution state leads to it being awaited, and whether or not the state being set is for current command or a different one (sub-command)
At the beginning I was a bit confused why <builtin-operation-name>
moves to scheduled and await-<builtin-operation-name>
moves to scheduling, but, as the comment says, if idea is for builtin-*
to trigger an operation and await-builtin-*
to await the result, then it's not clear to me how this is eventually achieved.
And particularly, when we're doing a sub-workflow execution, e.g. an example from the documentation:
[trigger_config_update]
operation = "config_update"
input.tedgeUrl = "http://127.0.0.1:8000/tedge/file-transfer/example/config_update/mosquitto-1234"
input.type = "mosquitto"
on_exec = "waiting_for_config_update"
[waiting_for_config_update]
action = "await-operation-completion"
timeout_second = 600
on_timeout = "timeout_config_update"
on_success = "successful_config_update"
on_error = { status = "failed", reason = "fail to update the config"}
it's relatively clear that action = "await-...
below is waiting for the entire config update to complete, but when applied to the lower level of states, I have trouble understanding why that's necessary for the executing state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's relatively clear that action = "await-... below is waiting for the entire config update to complete, but when applied to the lower level of states, I have trouble understanding why that's necessary for the executing state.
It seems like there is an implicit assumption that scheduling
state is handled by other agents trivially and immediately, so we model it as moving immediately to executing
state, and we also assume that executing
is a potentially long-running process that we need to await the completion of, but the question is, is this distinction necessary.
In any case, I have now a bit better idea of the flow after looking at adapt_builtin_request
and adapt_builtin_response
methods:
[scheduled]
action = "software_update"
on_exec = "executing"
[executing]
action = "await_software_update"
on_success = "postprocess"
on_error = "rollback"
- workflow is in
scheduled
state, builtin operation actor is notified of that - builtin operation actor moves to
scheduled
state - builtin operation actor moves to
executing
state, tells workflow to runon_exec
handler - workflow moves to
executing
state, builtin operation actor is notified of that - builtin operation actor moves to
successful
orfailed
state, tells workflow to runon_success
oron_error
handlers - workflow moes to
successful
orfailed
state - workflow is terminated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: it's nonobvious how setting a scheduled state leads to an operation being spawned, execution state leads to it being awaited, and whether or not the state being set is for current command or a different one (sub-command)
In both case, the workflow engine delegates the action to the actor registered for that builtin operation.
The response is sent back by the operation specific actor to the workflow actor for further processing.
This commit improves method name as they were not really clear,
It seems like there is an implicit assumption that
scheduling
state is handled by other agents trivially and immediately, so we model it as moving immediately toexecuting
state, and we also assume thatexecuting
is a potentially long-running process that we need to await the completion of, but the question is, is this distinction necessary.
There is indeed an assumption that all the builtin operator actors follow the same workflow, transitioning from scheduled to executing and then either to successful or failed. However, there is no assumption on the timing. Technically, in both cases (scheduled and executing), the workflow engine awaits for the transition to occur.
Is this distinction necessary? From a workflow perspective, yes, as this makes explicit the transitions (todo - doing - done) and clarifies error management. This is notably a key point for the restart operation, where the device reboots between the two states. Internally, for the builtin actors, no. This is a legacy of when these actors where directly exposed to MQTT notifying themselves all the state transitions.
In any case, I have now a bit better idea of the flow after looking at
adapt_builtin_request
andadapt_builtin_response
methods:
[scheduled] action = "software_update" on_exec = "executing" [executing] action = "await_software_update" on_success = "postprocess" on_error = "rollback"
- workflow is in
scheduled
state, builtin operation actor is notified of that- builtin operation actor moves to
scheduled
state- builtin operation actor moves to
executing
state, tells workflow to runon_exec
handler- workflow moves to
executing
state, builtin operation actor is notified of that- builtin operation actor moves to
successful
orfailed
state, tells workflow to runon_success
oron_error
handlers- workflow moes to
successful
orfailed
state- workflow is terminated
Your understanding is correct.
This commit introduces no changes to the user. This is only a preparation step to properly use the exit handlers provided by the user for a builtin command. - Distinguish `BuiltIn` from `AwaitBuiltIn` actions - Make the name of the operation explicit for `BuiltIn` and `AwaitBuiltIn` actions - Equip the `BuiltIn` action with an `on_exec` handler. - Equip the ``AwaitBuiltIn` action with `on_success` and `on_error` handlers. Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
The new name is more appropriate, now that these handlers are used not only for background scripts but also to trigger a sub operation or a builtin action. Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
cdfdde9
to
e891f6c
Compare
Signed-off-by: Didier Wenzek <[email protected]>
e891f6c
to
08cd735
Compare
Closing as superseded by #3105 |
Proposed changes
Improve the workflow builtin actions with explicit operation names and the ability to define specific state on success and on error, as well as for the executing state.
BuiltIn
fromAwaitBuiltIn
actionsBuiltIn
andAwaitBuiltIn
actionsBuiltIn
action with anon_exec
handler.AwaitBuiltIn
action withon_success
andon_error
handlers.builtin
keywordbuiltin
keyword can still be used for backward compatibilitybuiltin:<op>
can be triggered on any state.Types of changes
Paste Link to the issue
Checklist
cargo fmt
as mentioned in CODING_GUIDELINEScargo clippy
as mentioned in CODING_GUIDELINESFurther comments