Feature Request: Allow multiple Canary steps #4172

Open
Samze opened this issue Jan 14, 2025 · 0 comments

Background

Canary deployments were added to CAPI in https://github.com/cloudfoundry/capi-release/releases/tag/1.188.0. This introduced the new deployment strategy canary with the following behaviour:

  1. An existing app is pushed and has 5 instances running v1.
  2. The app developer triggers a new canary deployment with v2 of the app: cf push myapp --strategy canary
  3. A new process is created with v2 of the app and a single canary instance is created.
  4. The deployment is paused.
  5. The app developer verifies the deployment (by checking the canary logs for example).
  6. The app developer continues the deployment (cf continue-deployment) or cancels it (cf cancel-deployment).
    1. If continued, the canary deployment is promoted and proceeds equivalent to a rolling deployment, replacing all instances with v2 of the app.
    2. If cancelled, the canary deployment is rolled back to the previous version.

For more general information on Canary Deployments see the Configuring canary app deployments docs.


What and Why

We would like to let developers define multiple canary steps. This would give them checkpoints at different points during the deployment, so that they can validate the rollout more thoroughly before final promotion. We propose to introduce a steps configuration.

API Changes

Configuring a Canary Deployment with steps

The Deployment Object

...
"strategy": "canary",
"options": {
  "canary": { # new
    "steps": [
      { "instance_weight": 1 },
      { "instance_weight": 20 },
      { "instance_weight": 45 },
      { "instance_weight": 80 },
      { "instance_weight": 100 }
    ]
  }
}

Notes:

  • Steps is an array of objects rather than an array of integers, for extensibility in the future (e.g. default pauses/timeouts for steps).
  • Steps is an explicit configuration under canary, because it cannot be applied to other deployment strategies (rolling).
  • The field is explicitly named instance_weight so that traffic weights can be added in the future.
  • Validations include
    • canary is optional but must be an object
    • steps is optional but must be an array of objects
    • steps can be an empty array
    • instance_weight must be an integer between 1 and 100 (inclusive)

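The validations listed above could be sketched roughly as follows. This is an illustrative Python sketch only, not the Cloud Controller implementation: the function name validate_canary_options and the error strings are hypothetical, and treating a step without instance_weight as invalid is an assumption the proposal does not state.

```python
from typing import Any


def validate_canary_options(options: dict[str, Any]) -> list[str]:
    """Return validation errors for a deployment's `options` object (hypothetical helper)."""
    if "canary" not in options:
        return []  # canary is optional
    canary = options["canary"]
    if not isinstance(canary, dict):
        return ["canary must be an object"]

    if "steps" not in canary:
        return []  # steps is optional
    steps = canary["steps"]
    if not isinstance(steps, list) or not all(isinstance(s, dict) for s in steps):
        return ["steps must be an array of objects"]

    errors: list[str] = []
    for index, step in enumerate(steps):  # an empty array is valid and yields no errors
        weight = step.get("instance_weight")  # assumption: a missing weight is invalid
        # bool is a subclass of int in Python, so it is rejected explicitly.
        is_integer = isinstance(weight, int) and not isinstance(weight, bool)
        if not is_integer or not 1 <= weight <= 100:
            errors.append(
                f"steps[{index}].instance_weight must be an integer between 1 and 100"
            )
    return errors
```

Returning a list of errors rather than raising on the first one lets a caller report every invalid step at once.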
For an app with 10 instances, this would result in the following rollout behaviour with 5 pauses.

Step #  Step %  Existing instances  Canary instances  % of traffic to canary
0       Pre     10                  0                 0%
1       1       10                  1                 9%
2       20      9                   2                 18%
3       45      7                   4                 36%
4       80      3                   8                 72%
5       100     0                   10                100%
6       Post    0                   10                100% (promoted)

Obtaining Canary Deployment status with steps

  "status": {
    "value": "ACTIVE",
    "reason": "PAUSED",
    "canary": { # new
      "steps": {
        "current": 1,
        "total": 4
      }
    }
  },

Notes:

  • New canary object to contain canary-specific status information, mirroring the options.canary object used to configure steps.
  • Explicit fields for current/total rather than a single string for machine readability.
  • When .status.reason is PAUSED, .status.canary.steps.current is the step that has completed and is waiting for continuation/cancellation.
  • When .status.reason is DEPLOYING, .status.canary.steps.current is the step that is currently being converged toward.
  • We might extend this in the future to contain more detailed information such as number of target canary instances.
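A client might interpret the proposed status fields along these lines. This is a minimal Python sketch: the function name describe_canary_status and the returned wording are hypothetical, and it reads PAUSED/DEPLOYING from the reason field, as in the example status object above.

```python
def describe_canary_status(status: dict) -> str:
    """Summarise the proposed `status.canary.steps` fields (hypothetical helper)."""
    steps = status.get("canary", {}).get("steps")
    if steps is None:
        return "no canary step information"

    current, total = steps["current"], steps["total"]
    reason = status.get("reason")
    if reason == "PAUSED":
        # current is the step that has completed and awaits a decision.
        return f"step {current}/{total} completed, waiting for continue or cancel"
    if reason == "DEPLOYING":
        # current is the step being converged toward.
        return f"converging toward step {current}/{total}"
    return f"step {current}/{total} ({reason})"
```

Keeping current and total as separate integers is what makes this kind of branching straightforward, compared with parsing a combined string such as "1/4".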

Other design details

Configuration of steps weights

Steps should be configured as a percentage of instances rather than an explicit instance count. This keeps the deployment definition independent of how many instances the app is scaled to.

Because percentage steps may not align exactly with whole instance counts, the behaviour should be deterministic and well documented.

We propose the following rule:

Cloud Controller will match a step percentage to the nearest non-zero instance count, and will round down when the value falls exactly halfway between two counts.
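The rounding rule can be sketched as below. This is an illustrative Python sketch rather than Cloud Controller code: the function name canary_instances_for_step is hypothetical, and it assumes the app has at least one instance.

```python
def canary_instances_for_step(instance_weight: int, total_instances: int) -> int:
    """Map a step percentage to a canary instance count (hypothetical helper).

    Rounds to the nearest whole instance, rounds exact halves down,
    and never returns fewer than one instance.
    """
    # Integer arithmetic avoids floating-point surprises at exact halves.
    # The exact (unrounded) target is scaled / 100.
    scaled = instance_weight * total_instances
    whole, remainder = divmod(scaled, 100)
    rounded = whole + 1 if remainder > 50 else whole  # remainder == 50 rounds down
    return max(1, rounded)
```

For an app with 10 instances, weights of 1, 20, 45, 80 and 100 map to 1, 2, 4, 8 and 10 canary instances: 1% is lifted from 0.1 to the one-instance minimum, and 45% sits exactly at 4.5, so it rounds down to 4.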

Additional canary instance

During the canary rollout between steps we will always have an additional instance (e.g. step 2 above has 9 existing and 2 new). This is to keep the behaviour consistent with the current single step implementation which deploys a new additional canary instance. In the future we could make this configurable so it could be set to 0 or another value.
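Combining the rounding rule with the additional instance gives the instance counts at each pause. The following Python sketch is illustrative only: rollout_plan and its surge parameter are hypothetical names, and dropping the extra instance once the canary reaches the full instance count is inferred from the 100% row of the table above.

```python
def rollout_plan(
    total_instances: int, weights: list[int], surge: int = 1
) -> list[tuple[int, int, int]]:
    """Return (weight, existing, canary) instance counts for each step (hypothetical helper)."""
    plan = []
    for weight in weights:
        # Nearest non-zero instance count, with exact halves rounded down.
        whole, remainder = divmod(weight * total_instances, 100)
        rounded = whole + 1 if remainder > 50 else whole
        canary = max(1, rounded)

        if canary < total_instances:
            # Mid-rollout: keep one additional instance beyond the app's scale.
            existing = total_instances + surge - canary
        else:
            # Canary fully scaled: no old instances remain (inferred from the table).
            existing = 0
        plan.append((weight, existing, canary))
    return plan
```

Calling rollout_plan(10, [1, 20, 45, 80, 100]) yields (1, 10, 1), (20, 9, 2), (45, 7, 4), (80, 3, 8) and (100, 0, 10), matching steps 1 to 5 of the table.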

Multiple steps round to the same instance

If multiple steps round to the same instance count, no scaling happens between them, but the deployment still pauses at each one. For example, with 10 instances, the following steps would scale the canary to 2 instances but still pause 4 times:

19, 20, 20, 21

What about traffic weighting?

In the future we should implement traffic-based canary routing, e.g. a 1% step would route 1% of traffic to the canary. This is common in other canary implementations. However, it is currently not possible to route traffic directly to canary instances: traffic is round-robined evenly across all routable instances of an application. Changing this would require a larger effort in collaboration with other CF components.
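Because traffic is spread evenly over routable instances, the share reaching the canary follows directly from the instance counts. A small Python sketch, assuming every instance is routable and that the percentage is truncated to a whole number (which matches the table earlier in this issue); the function name canary_traffic_percent is hypothetical:

```python
def canary_traffic_percent(existing: int, canary: int) -> int:
    """Approximate share of traffic reaching canary instances under even round-robin."""
    total = existing + canary
    if total == 0:
        return 0
    # Integer division truncates to a whole percent (an assumption, see above).
    return (100 * canary) // total
```

With 10 existing instances and 1 canary this gives 9, not 1, which is why an instance weight cannot be read as a traffic weight.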

Possible CLI changes

CLI

cf push my-app --strategy=canary --max-in-flight=2 --instance-steps=[1,20,40,80]

Notes

  • Hide the complexity of an array of objects from the user with a simpler array of integers.

cf app dora
Showing health and status for app dora in org org-1 / space space-1 as admin...
....
Active deployment with status PAUSED (since Tue 10 Dec 11:44:18 MST 2024)
strategy:        canary
max-in-flight:   1
canary-step:     1/5
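
The --instance-steps shorthand in the proposed cf push command could be expanded into the API's array of objects roughly as follows. This is a Python sketch for illustration only (the CF CLI itself is not written in Python); parse_instance_steps is a hypothetical helper, and parsing the flag value as a JSON array is an assumption about its syntax.

```python
import json


def parse_instance_steps(flag_value: str) -> dict:
    """Expand a value such as "[1,20,40,80]" into the proposed `options` object."""
    weights = json.loads(flag_value)  # assumption: the flag value is a JSON array
    if not isinstance(weights, list):
        raise ValueError("--instance-steps must be an array of integers")
    for weight in weights:
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(weight, bool) or not isinstance(weight, int):
            raise ValueError("--instance-steps must be an array of integers")
    return {"canary": {"steps": [{"instance_weight": w} for w in weights]}}
```

For example, parse_instance_steps("[1,20,40,80]") produces an options object with four steps, each carrying a single instance_weight field.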
@Samze Samze changed the title Allow multiple Canary steps Feature Request: Allow multiple Canary steps Jan 23, 2025