ZooKeeper controller bug 1
It is a liveness bug fixed by #217.
If the controller crashes between creating the parent zk node and the child zk node, then after the restart it cannot create the child zk node, because `reconcile_zk_node` returns on any error encountered while creating the parent node.
A similar bug exists in the Pravega operator: if the controller crashes before creating the zk node and the developer meanwhile updates `.spec.replicas`, then after the restart the controller cannot update the stateful set because the zk node does not exist. We reported the bug; the issue id is 569.
Note that #217 also fixes two other safety bugs (as documented in the PR), though we do not yet have a machine-checkable proof of the safety property.
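The crash scenario above can be illustrated with a minimal sketch. It assumes the error hit after the restart is an "already exists" error on the parent node; `FakeZk`, `ZkError`, `ensure_node`, and `reconcile_zk_node_sketch` are hypothetical names used only for illustration and are not the controller's code or a real ZooKeeper client API.

```rust
use std::collections::HashSet;

/// Errors a hypothetical ZooKeeper client might return on create.
#[derive(Debug, PartialEq)]
enum ZkError {
    NodeExists,
    NoParent,
}

/// In-memory stand-in for a ZooKeeper ensemble (assumption, not a real client).
#[derive(Default)]
struct FakeZk {
    nodes: HashSet<String>,
}

impl FakeZk {
    fn create(&mut self, path: &str) -> Result<(), ZkError> {
        // A node can only be created under an existing parent.
        if let Some(idx) = path.rfind('/') {
            let parent = &path[..idx];
            if !parent.is_empty() && !self.nodes.contains(parent) {
                return Err(ZkError::NoParent);
            }
        }
        // Creating a node that is already present is an error.
        if !self.nodes.insert(path.to_string()) {
            return Err(ZkError::NodeExists);
        }
        Ok(())
    }
}

/// Treats "already exists" as success so that a retry after a crash can proceed.
fn ensure_node(zk: &mut FakeZk, path: &str) -> Result<(), ZkError> {
    match zk.create(path) {
        Ok(()) | Err(ZkError::NodeExists) => Ok(()),
        Err(e) => Err(e),
    }
}

/// Hypothetical reconcile step: ensure the parent exists, then the child.
fn reconcile_zk_node_sketch(zk: &mut FakeZk, parent: &str, child: &str) -> Result<(), ZkError> {
    ensure_node(zk, parent)?;
    ensure_node(zk, &format!("{parent}/{child}"))
}
```

With this shape, a run that starts with the parent node already present (as after a crash between the two creations) still goes on to create the child, instead of returning at the parent step.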
ZooKeeper controller bug 2
It is a liveness bug fixed by #282.
If the user changes `.spec.labels` or `.spec.annotations` in the cr spec, the controller also tries to update `.spec.selector` and `.spec.volume_claim_templates` of the stateful set. These two fields are immutable, so the update fails, and the mutable fields that were supposed to be updated are never updated.
RabbitMQ controller bug 1
It is a safety bug found when developing #211 and fixed in the same PR.
A downscale could still happen if we relied only on the validation rule: the user deletes the current deployment and creates a new one with a smaller `.spec.replicas`, which does not violate the validation rule. The stateful set created from the old cr may not yet have been deleted by the garbage collector when the controller tries to update it with `.spec.replicas` from the new cr. Because that value is smaller than the stateful set's current replicas, the update causes a downscale.
RabbitMQ controller bug 2
It is a liveness bug fixed by #335.
Previously, given a rabbitmq cr `foo`, the controller created a client service called `foo` and a headless service called `foo-nodes`. The problem arises when the user creates another rabbitmq cr called `foo-nodes`, whose client service is also named `foo-nodes` and so collides with the headless service of the first cr. The reconcile core for the two rabbitmq crs then competes over the same service, leading to oscillation and liveness violations.
The same bug also exists in the official rabbitmq operator. We reported it; the issue id is 1464.
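The name collision can be reproduced with a few lines of string handling. In the sketch below, `old_service_names` follows the naming scheme described above, while `suffixed_service_names` is only one hypothetical way to avoid the collision (a distinct suffix for each service kind) and is not necessarily what #335 does; all function names are made up for illustration.

```rust
use std::collections::HashSet;

/// Old scheme as described above: client service `<cr>`, headless service `<cr>-nodes`.
fn old_service_names(cr: &str) -> [String; 2] {
    [cr.to_string(), format!("{cr}-nodes")]
}

/// Hypothetical alternative: every service kind gets its own suffix.
/// Two names ending in different suffixes can never be equal, and two names
/// with the same suffix are equal only if the cr names are equal.
fn suffixed_service_names(cr: &str) -> [String; 2] {
    [format!("{cr}-client"), format!("{cr}-nodes")]
}

/// Returns true if any service name appears in both sets.
fn names_collide(a: &[String], b: &[String]) -> bool {
    let seen: HashSet<&String> = a.iter().collect();
    b.iter().any(|name| seen.contains(name))
}
```

Under the old scheme, `names_collide(&old_service_names("foo"), &old_service_names("foo-nodes"))` is true because both crs claim a service named `foo-nodes`; with the suffixed variant the same pair of crs yields four distinct names.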