
Considerations

Luca Terracciano edited this page Dec 12, 2020 · 3 revisions

This page collects the design considerations we gathered while designing the system.

Why a Custom VPA and HPA

A Custom Resource Definition (CRD) extends the Kubernetes API, enabling new components and behaviours inside the cluster. With this in mind, we want to develop a set of components that handle the in-place resizing of pod resources and the horizontal scaling of pods based on user-provided metrics. At the time of writing, Kubernetes already has components that handle each kind of scaling separately, in particular:

  • HPA (Horizontal Pod Autoscaler): changes the number of replicas of a Kubernetes resource (Deployment, StatefulSet, etc.).
  • VPA (Vertical Pod Autoscaler): changes pod requests and limits.
  • Cluster Autoscaler: based on cluster load, adds and removes worker nodes from the cluster.

We decided to keep only the third and to replace the first two autoscalers with custom components because:

  • Pod Replicas Updater: relying on the existing HPA would add an extra, unnecessary step to the system, because we would have to compute a custom metric just to signal the HPA that it should add a replica.
  • Pod Resource Updater: the current VPA does not support in-place scaling and applies additional logic when computing its recommendations. We want to avoid both to make the autoscaler more reactive.
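
To illustrate the first point, the stock HPA derives the replica count from the ratio between the current and the target value of a metric. The sketch below applies that same documented rule directly to a user-provided metric, which is roughly what a custom replicas updater could do without first publishing a custom metric for the HPA. The function name, its parameters and the replica bounds are hypothetical and not taken from the project; the HPA's tolerance and stabilization logic are deliberately left out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Return the replica count suggested by the HPA-style ratio rule.

    All names and default bounds here are illustrative assumptions.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # desired = ceil(current_replicas * current_metric / target_metric)
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the allowed range so a metric spike cannot scale without bound.
    return max(min_replicas, min(max_replicas, raw))
```

For example, with 2 replicas, a measured value of 200 and a target of 100, the rule asks for 4 replicas.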

Why NodeSelectors

A necessary constraint, also used in COCOS, must be enforced for these components to work: the Kubernetes NodeSelector field has to be set so that replicas of the same pod do not run on the same node. Co-located replicas would cause race conditions on the node’s resources, which we want to avoid.
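
The constraint can be stated as an invariant: no node hosts more than one replica. A minimal sketch of a check for it follows, assuming the placements are already available as (pod name, node name) pairs; the function name and the input shape are hypothetical and not part of the system.

```python
from collections import defaultdict


def colocated_replicas(placements):
    """Return {node: [pods]} for every node hosting more than one replica.

    `placements` is an iterable of (pod_name, node_name) pairs; this input
    shape is an assumption made for illustration.
    """
    by_node = defaultdict(list)
    for pod, node in placements:
        by_node[node].append(pod)
    # Only nodes that violate the one-replica-per-node constraint are returned.
    return {node: pods for node, pods in by_node.items() if len(pods) > 1}
```

An empty result means the constraint holds for the given placements.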

Why QoS Guaranteed

This is the only title that might be misleading. Kubernetes assigns the Guaranteed QoS class to pods in which every container has equal requests and limits for CPU and memory, and this equality is required by the changes made by Vinaykul to implement the in-place pod resource update. Furthermore, most system components rely on the assumption that these two quantities coincide.
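
The assumption can be checked on a pod manifest before the components act on it. The following is a minimal sketch, assuming the manifest is available as a plain dictionary shaped like the Kubernetes pod spec. The function name is hypothetical, quantities are compared as strings (so "500m" and "0.5" would be treated as different, a simplification), and only regular containers are inspected.

```python
def is_guaranteed(pod):
    """Return True if every container has requests equal to limits for CPU and memory.

    Simplified sketch: string comparison of quantities, init containers ignored.
    """
    containers = pod.get("spec", {}).get("containers", [])
    if not containers:
        return False
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            if name not in limits:
                return False
            # Kubernetes defaults a missing request to the limit, so treat it as equal.
            if requests.get(name, limits[name]) != limits[name]:
                return False
    return True
```

A pod that sets a CPU limit of "1" with a CPU request of "500m" fails this check and would fall into the Burstable class instead.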
