Workflow definitions are not backward compatible #3077

Open
didier-wenzek opened this issue Aug 21, 2024 · 2 comments

@didier-wenzek
Contributor

Describe the bug

Updating thin-edge on a device breaks any workflow whose definition contains deprecated syntax.

Notably, the restart action was deprecated with the introduction of sub-operation workflows,
and the rugpi firmware update workflow was changed accordingly:

[restart]
- action = "restart"
- on_exec = "restarting"  # Internal state used by the "restart" action
+ operation = "restart"
+ on_exec = "await_restart"
+
+ [await_restart]
+ action = "await-operation-completion"
on_success = "restarted"
on_error = { status = "failed_restart", reason = "Failed to restart device" }

As a result, updating tedge-agent on a device running a firmware older than this change breaks firmware updates.
Subsequent firmware update commands stay in the init state forever, because the updated agent ignores workflows written with the old syntax.

To Reproduce

  • Install a device with a rugpi firmware built before this commit
  • Check that the device is running tedge version 1.0.1
  • Update tedge to version 1.1.0 or higher.
  • Trigger a firmware update from Cumulocity
  • Check that a new firmware command has been created on the device
    • Using tedge mqtt sub te/device/main///cmd/firmware_update/#
  • Observe that the firmware update is stuck:
    • Not moving to the executing state in Cumulocity
    • The retained message of the command published on te/device/main///cmd/firmware_update/# is not updated.
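
The last check can be sketched as a small helper that looks at the payloads observed on the command topic and reports whether the command ever left the init state. This assumes the command payload is a JSON object with a `status` field; the helper names `command_status` and `is_stuck` are hypothetical.

```python
import json


def command_status(payload: str) -> str:
    """Extract the `status` field from a command payload (assumed to be JSON)."""
    return json.loads(payload).get("status", "unknown")


def is_stuck(payloads: list[str]) -> bool:
    """Return True if every observed payload still reports the `init` state."""
    return bool(payloads) and all(command_status(p) == "init" for p in payloads)


# Payloads as they might be captured from the command topic (illustrative values).
observed = ['{"status": "init"}', '{"status": "init"}']
print(is_stuck(observed))
```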

Expected behavior

The firmware command should proceed from init to scheduled, executing ... as defined by the /etc/tedge/operations/firmware_update.toml workflow.

Upgrading thin-edge must not invalidate previously valid operation workflows.
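
One way to picture such backward compatibility is a translation step that rewrites the deprecated form into the new one when a workflow is loaded. The sketch below mirrors the diff above on an already parsed workflow (a plain dict); it is not how thin-edge implements or plans to implement the fix, and the function name `upgrade_legacy_restart` and the `await_<state>` naming scheme are assumptions made for illustration.

```python
import copy


def upgrade_legacy_restart(workflow: dict) -> dict:
    """Rewrite states using the deprecated `action = "restart"` form.

    Each such state becomes an `operation = "restart"` state followed by a
    new `await_<state>` state, which inherits `on_success` and `on_error`.
    Other keys of the legacy state are dropped in this simplified sketch.
    """
    upgraded = copy.deepcopy(workflow)
    for name, state in workflow.items():
        if not isinstance(state, dict) or state.get("action") != "restart":
            continue
        await_name = f"await_{name}"  # hypothetical naming scheme
        if await_name in workflow:
            raise ValueError(f"state {await_name!r} already exists")
        await_state = {"action": "await-operation-completion"}
        for key in ("on_success", "on_error"):
            if key in state:
                await_state[key] = copy.deepcopy(state[key])
        upgraded[name] = {"operation": "restart", "on_exec": await_name}
        upgraded[await_name] = await_state
    return upgraded


legacy = {
    "restart": {
        "action": "restart",
        "on_exec": "restarting",
        "on_success": "restarted",
        "on_error": {"status": "failed_restart", "reason": "Failed to restart device"},
    }
}
print(upgrade_legacy_restart(legacy))
```

The input is left untouched, so the original definition stays available for logging or for falling back if the rewritten workflow is rejected.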


@reubenmiller
Contributor

Just a comment: when using a firmware image (e.g. built with Rugpi or Yocto), it is highly advised to update the thin-edge.io version only via a firmware operation, as the operation supports rolling back in case of any unexpected errors (such as invalid workflows). I'm fairly confident that upgrading devices from thin-edge.io 1.0.1 to 1.2.0 is supported, as the firmware_update.toml workflow is actually a symlink, which allows the syntax to change across firmware images (but I'll double check the update to be sure).

@didier-wenzek
Contributor Author

> I'm fairly confident that upgrading devices across thin-edge.io 1.0.1 -> 1.2.0 version is supported as the firmware_update.toml workflow is actually a symlink enabling the syntax to change across the firmware images (but I'll double check the update to be sure).

I'm confident of that too. The current issue has been observed in a case where thin-edge has been updated manually:

  1. Install firmware with thin-edge 1.0.1
  2. Upgrade thin-edge to 1.1.0 (this is the offending step)
  3. Upgrade to firmware with thin-edge 1.2.0.
