
Auto-start of #1

Open

deitch opened this issue Jul 25, 2017 · 8 comments

Comments

@deitch

deitch commented Jul 25, 2017

This is a nice clean example of redis cluster running in k8s. The one challenge is the cluster initialization and adding/removing nodes.

Is there any clean, self-managing (i.e. autonomous) way to do it? Since you are using a StatefulSet, you know the names of the (initial) pods will be redis-cluster-0, redis-cluster-1, etc. You could probably even use two StatefulSets if you wanted guarantees as to which pods are master vs slave.

Is there no way to have redis-cluster-0 automatically come up and initialize a cluster with redis-cluster-1 and redis-cluster-2 (or, for that matter, just with itself), and have redis-cluster-1 and redis-cluster-2 self-register with redis-cluster-0? Same for expansion.

In a kube environment, having to do kubectl exec ... is not optimal (or recommended).

I am kind of surprised that redis doesn't have any option in the config file for saying, "here are your initial cluster peers".

You kind of wish an InitContainer could do this, but init containers complete before the pod's containers run, so that will not help. Perhaps some extension of the image, so that it spins up a child process? Or a sidecar container (although that would be a one-off task and shouldn't be restarted, whereas restart policies are pod-level, not container-level)? Or a post-start lifecycle hook?
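
For example, a sidecar (or post-start hook) in every non-zero pod could do something like this. Purely a sketch: redis-py, the redis-cluster-<n> pod names, the headless-service DNS name, and the port are my assumptions, and CLUSTER MEET only handles membership, not slot assignment:

import socket
import time

import redis  # redis-py (an assumption; any client that can send CLUSTER MEET works)

SEED_HOST = "redis-cluster-0.redis-cluster"  # hypothetical headless-service DNS name
MY_HOST = socket.gethostname()               # e.g. "redis-cluster-3"

def wait_for(host, port=6379, timeout=120):
    # Poll the seed node until it answers PING or the timeout expires.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if redis.Redis(host=host, port=port, socket_timeout=2).ping():
                return True
        except redis.RedisError:
            time.sleep(2)
    return False

if MY_HOST != "redis-cluster-0":
    # Non-seed pods: wait for the seed, then ask the local node to MEET it.
    wait_for(SEED_HOST)
    local = redis.Redis(host="localhost", port=6379)
    local.execute_command("CLUSTER", "MEET", socket.gethostbyname(SEED_HOST), 6379)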

@sanderploegsma
Owner

That is a good point. Indeed, redis cluster management is a bit tedious. If a self-managing option existed, I don't think you would even need a StatefulSet...

So if I understand you correctly, you want to do something like the following (let's assume a 1:1 master-slave ratio):

if podNr == 0 {
  set up new cluster
} else if podNr % 2 == 0 {
  join cluster
} else {
  join cluster as slave
}
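
Concretely, the pod number could just be parsed from the StatefulSet hostname. A rough sketch, assuming the redis-cluster-<ordinal> naming and leaving the actual Redis commands as comments:

import socket

hostname = socket.gethostname()            # e.g. "redis-cluster-4"
pod_nr = int(hostname.rsplit("-", 1)[-1])  # StatefulSet ordinal

if pod_nr == 0:
    print("set up new cluster")        # claim all 16384 hash slots on this node
elif pod_nr % 2 == 0:
    print("join cluster")              # CLUSTER MEET an existing node
else:
    print("join cluster as slave")     # CLUSTER MEET, then CLUSTER REPLICATE a master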

I like the idea, but it has some complications:

  • We need to balance the keys across masters. This can be done when a new master is added, but I'm not sure how to automate balancing if a pod is removed (e.g. scaling down)
  • As I said, this assumes a specific master-slave ratio. The downside is that you cannot easily change this after deploying, you'd have to start from scratch or make the pods much more complex.

That being said, I wrote this how-to with the assumption that you do not scale your redis cluster up and down multiple times a day. Compared to spinning up new VMs and configuring them by hand, having to perform a handful of copy-and-paste commands to scale up your cluster whenever you need to seems like a fair compromise, don't you think?

@deitch
Author

deitch commented Jul 26, 2017

redis cluster management is a bit tedious

Yeah. I had not run redis in k8s before (lots of kube, lots of redis, never together), so I had never gone down the full automation path. The assumption that a human will, at least once, manually create the cluster and add/remove nodes is not exactly cloud-friendly.

So if I understand you correctly, you want to do something like the following (let's assume a 1:1 master-slave ratio):

Actually, I would like to go a step beyond that. FYI, this is how I automate etcd setup for kube (and zookeeper for Kafka):

findAllNodesInMyCluster()
if (no other nodes) {
  iAmFirstMaster_Initialize()
} else {
  findOtherMasters_Join()
}

We can weave master-vs-slave logic in here, or have a separate StatefulSet for slaves.
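
In Kubernetes terms, that discovery step could just resolve the headless service. A rough sketch of what I mean; the service name and port are assumptions, and by default the DNS records only include pods that are ready:

import socket

SERVICE = "redis-cluster"  # hypothetical headless service in front of the StatefulSet
MY_IP = socket.gethostbyname(socket.gethostname())

def find_all_nodes_in_my_cluster():
    # Resolve the headless service to the IPs of all (ready) peer pods.
    try:
        infos = socket.getaddrinfo(SERVICE, 6379, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})

peers = [ip for ip in find_all_nodes_in_my_cluster() if ip != MY_IP]
if not peers:
    print("no other nodes: initialize a new cluster")   # iAmFirstMaster_Initialize()
else:
    print("joining existing cluster via", peers[0])     # findOtherMasters_Join()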

For the complications:

  • "We need to balance the keys across masters" - well, the whole removal automation thing is non-trivial. I would be more than happy if we can start with a number of masters, or even a fixed number, as long as no human intervention is required.
    (
  • "As I said, this assumes a specific master-slave ratio." Agreed. For now, though, I am happy with fixed ratios, even fixed numbers of masters and slaves, then can improve.

From my perspective, the biggest stumbling block to getting it fully automated is the initial cluster setup and join.

Compared to spinning up new VMs and configuring them by hand, having to perform a handful of copy-and-paste commands to scale up your cluster whenever you need to seems like a fair compromise, don't you think?

As long as I can do them via CI (kube config files) and not logging in, sure.

I think I am going to fork your repo and make some suggestions. Good idea? You have an MIT license, so I figure you don't mind?

@sanderploegsma
Owner

I don't have much time to look into your comment now, sorry. But:

I think I am going to fork your repo and make some suggestions. Good idea? You have an MIT license, so I figure you don't mind?

Of course! That's why it's open source after all. I'm curious what you'll come up with 😉

@deitch
Author

deitch commented Aug 1, 2017

Ugh, I am coming up against all sorts of limitations. See my SO question here.

Basically:

  • When a master goes down and comes back, unless there is an external persistent storage volume, the master will not rejoin the cluster, and the other nodes won't know it. The cluster is broken.
  • When a master goes down and comes back, unless there is an external persistent storage volume, the master comes back with no data, so its slave loses all data too.
  • In a completely new startup, if A comes online before B (which can happen), A will try to start a cluster without B and fail.

The last one is solvable with timeouts (wait for all other nodes), but that is fragile.
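
To be explicit about what I mean by the timeout workaround, here is a sketch; the service name, port, and expected node count are made up for illustration:

import socket
import time

def wait_for_peers(service="redis-cluster", expected=6, timeout=300):
    # Block until `expected` pod IPs resolve for the headless service, or give up.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            infos = socket.getaddrinfo(service, 6379, proto=socket.IPPROTO_TCP)
            ips = {info[4][0] for info in infos}
            if len(ips) >= expected:
                return ips
        except socket.gaierror:
            pass
        time.sleep(5)
    raise TimeoutError("not all peers appeared within %ds" % timeout)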

There is a "right" solution to this, which is the Kafka model (or Consul's, but with sharding):

  • Each node starts up in cluster mode
  • Each node is told the name of at least one peer, connects to it, and joins the cluster, even an existing one
  • Each node detects the existence or failure of all others and adjusts accordingly (resharding)
  • Each node contains its own shard data and replicas for some other nodes

Essentially, there are only masters; they are self-adjusting, self-healing, and self-joining. But Redis is built in a different way entirely. Sigh.

@sanderploegsma
Owner

Right, that's kind of what I encountered as well. Essentially, all of your points are the main reason I used StatefulSets in the first place. I admire your courage in trying to find a fully automated solution, though 😉.

@deitch
Author

deitch commented Aug 1, 2017

Yeah, but even StatefulSets don't solve it. They make the hostname consistent, and if you map external persistent volumes (which I prefer not to) they are consistently mounted, but the fundamental protocol and cluster problems remain.

Basically, Redis is a great KV technology that was built pre-cloud.

@KeithTt

KeithTt commented Apr 18, 2018

@sanderploegsma Does this project support k8s 1.8+?

@sanderploegsma
Owner

@KeithTt not sure why you're hijacking the thread, but yeah, it should.
