By default, KUDO Kafka resource configuration is set to the minimum recommended values for production usage. But users can and should tune the configurations based on the workload requirements of their Kafka cluster.
$ kubectl kudo install kafka --instance=my-kafka-name \
-p ZOOKEEPER_URI=zk-zookeeper-0.zk-hs:2181,zk-zookeeper-1.zk-hs:2181,zk-zookeeper-2.zk-hs:2181 \
-p ZOOKEEPER_PATH=/custom-path \
-p BROKER_CPUS=3000m \
-p BROKER_COUNT=5 \
-p BROKER_MEM=4096m \
-p DISK_SIZE=20Gi \
-p MIN_INSYNC_REPLICAS=3 \
-p NUM_IO_THREADS=10 \
-p NUM_NETWORK_THREADS=5
By default, the Kafka operator will use livenessProbe
of type tcpSocket
to check the broker port. This is a simple port based health check.
For using a more robust health check based on broker functionality you can set the parameter LIVENESS_METHOD
to FUNCTIONAL
.
This check is a producer-consumer check based on a custom heartbeat topic which you can set using the parameter LIVENESS_TOPIC_PREFIX
.
$ kubectl kudo install kafka --instance=my-kafka-name -p LIVENESS_METHOD=FUNCTIONAL -p LIVENESS_TOPIC_PREFIX=MyHealthCheckTopic
Health checks can be enabled when using Kerberos with KUDO Kafka.
When using FUNCTIONAL
method then additional principals needs to be created. Assuming livenessProbe
as the principal name the principals will be:
livenessProbe/my-kafka-name-kafka-0.kafka-svc.kudo-kafka.svc.cluster.local@LOCAL
livenessProbe/my-kafka-name-kafka-1.kafka-svc.kudo-kafka.svc.cluster.local@LOCAL
livenessProbe/my-kafka-name-kafka-2.kafka-svc.kudo-kafka.svc.cluster.local@LOCAL
You need to create one principal per broker for each individual livenessProbe.
By default, the Kafka operator will use the default storage class of the Kubernetes cluster.
To deploy Kafka using a different storage class, you can use the parameter STORAGE_CLASS
$ kubectl kudo install kafka --instance=my-kafka-name -p STORAGE_CLASS=<STORAGE_CLASS_NAME>
$ kubectl kudo install kafka --instance=my-kafka-name -p PERSISTENT_STORAGE=false
Having PERSISTENT_STORAGE
value false
means that any data or logs inside the brokers will be lost after a pod restart or rescheduling.
Deploying without persistent storage isn't recommended for production usage.
By default, the Kafka cluster will have the JMX Exporter enabled. You can check more information around how KUDO Kafka exposes metrics in monitoring.
KUDO Kafka requires a running ZooKeeper cluster to perform its own internal accounting and persist cluster topology. And connectivity to the Zookeeper. You can install the KUDO Zookeeper or you can use any other Zookeeper cluster you're running inside or outside kubernetes to use with KUDO Kafka.
You can configure KUDO Kafka to use Zookeeper using the parameter ZOOKEEPER_URI
Let's see this with an example:
$ kubectl kudo install kafka --instance=my-kafka-name \
-p ZOOKEEPER_URI=zk-zookeeper-0.zk-hs:2181,zk-zookeeper-1.zk-hs:2181,zk-zookeeper-2.zk-hs:2181 \
In the above example KUDO Kafka cluster will connect to the Zookeeper cluster available via following DNS names zk-zookeeper-0.zk-hs:2181,zk-zookeeper-1.zk-hs:2181,zk-zookeeper-2.zk-hs:2181
The Kafka cluster topology is persisted in a Zookeeper node. Every node in Zookeeper's namespace is identified by a path.
Two Kafka clusters can share the same Zookeeper but sharing the same path can lead to a corrupt state for both Kafka clusters. To avoid this to happen, KUDO Kafka persists its cluster topology in zk node with same name as KUDO Kafka instance.
Let's for example see what will the zk path look like in the above example where we just configured ZOOKEEPER_URI
:
$ kubectl kudo install kafka --instance=my-kafka-name \
-p ZOOKEEPER_URI=zk-zookeeper-0.zk-hs:2181,zk-zookeeper-1.zk-hs:2181,zk-zookeeper-2.zk-hs:2181 \
In the above example KUDO Kafka will connect the Zookeeper present in zk-zookeeper-0.zk-hs:2181,zk-zookeeper-1.zk-hs:2181,zk-zookeeper-2.zk-hs:2181
and
will create a Zookeeper node in path /my-kafka-cluster
.
You can also decide to chose a custom Zookeeper node path, using the KUDO Kafka parameter ZOOKEEPER_PATH
Let's see this with an example:
$ kubectl kudo install kafka --instance=my-kafka-name \
-p ZOOKEEPER_URI=zk-zookeeper-0.zk-hs:2181,zk-zookeeper-1.zk-hs:2181,zk-zookeeper-2.zk-hs:2181 \
-p ZOOKEEPER_PATH=/custom-path
In the above example KUDO Kafka will connect the Zookeeper present in zk-zookeeper-0.zk-hs:2181,zk-zookeeper-1.zk-hs:2181,zk-zookeeper-2.zk-hs:2181
and
will create a Zookeeper node in path /custom-path
.
The Dockerfile used to build the KUDO Kafka operator is hosted in the kudo-kafka-operator repo