lyx edited this page Jan 17, 2025 · 1 revision

Preface

MinIO is a high-performance, distributed object storage system that runs on standard hardware, which makes it cost-effective and broadly applicable. It is designed for high-performance private cloud environments and uses a simple, efficient architecture to deliver full object storage functionality. Its use cases range from traditional secondary storage, disaster recovery, and archiving to newer areas such as machine learning, big data, private cloud, and hybrid cloud.

Thanks to MinIO's full compatibility with the S3 API, you can deploy an AutoMQ cluster in a private data center to obtain a Kafka-compatible streaming system that offers better cost efficiency, extreme elasticity, and single-digit millisecond latency. This article will guide you on how to deploy an AutoMQ cluster on top of your MinIO in a private data center.

Prerequisites

  • A functional MinIO environment. If you do not have an available MinIO environment, you can refer to its official installation guide for setup.

  • Prepare 5 hosts for deploying the AutoMQ cluster. It is recommended to select Linux amd64 hosts with 2 cores and 16GB of RAM, and to prepare two virtual storage volumes. An example is as follows:

    | Role | IP | Node ID | System Volume | Data Volume |
    | --- | --- | --- | --- | --- |
    | CONTROLLER | 192.168.0.1 | 0 | EBS 20GB | EBS 20GB |
    | CONTROLLER | 192.168.0.2 | 1 | EBS 20GB | EBS 20GB |
    | CONTROLLER | 192.168.0.3 | 2 | EBS 20GB | EBS 20GB |
    | BROKER | 192.168.0.4 | 3 | EBS 20GB | EBS 20GB |
    | BROKER | 192.168.0.5 | 4 | EBS 20GB | EBS 20GB |

    Tips:

    • Ensure that these machines are on the same subnet and can communicate with each other.
    • In non-production environments, you can deploy only one Controller, which also serves as a Broker by default.
  • Download the latest stable binary package for installing AutoMQ from AutoMQ GitHub Releases.

  • Create two custom-named object storage buckets on MinIO: automq-data and automq-ops.

    1. You can configure the required AWS CLI Access Key and Secret Key by setting environment variables.
    
    export AWS_ACCESS_KEY_ID=X1J0E1EC3KZMQUZCVHED
    export AWS_SECRET_ACCESS_KEY=Hihmu8nIDN1F7wshByig0dwQ235a0WAeUvAEiWSD
    
    
    2. Use the AWS CLI to create the two S3 buckets, replacing the endpoint below with the address of your MinIO server.
    
    aws s3api create-bucket --bucket automq-data --endpoint-url=http://127.0.0.1:80
    aws s3api create-bucket --bucket automq-ops --endpoint-url=http://127.0.0.1:80
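
The bucket-creation step above can be made safe to re-run. The following is a minimal sketch, not an official AutoMQ script: it assumes the AWS CLI is installed and the access keys are already exported, and `ensure_bucket` and `MINIO_ENDPOINT` are hypothetical names introduced only for this example.

```shell
# Assumption: MINIO_ENDPOINT holds your MinIO S3 API address. The default below
# is the example endpoint used later in this guide; replace it with your own.
MINIO_ENDPOINT="${MINIO_ENDPOINT:-http://10.1.0.240:9000}"

# Create a bucket only if a head-bucket probe says it does not exist yet.
ensure_bucket() {
  bucket="$1"
  if aws s3api head-bucket --bucket "$bucket" --endpoint-url "$MINIO_ENDPOINT" >/dev/null 2>&1; then
    echo "exists: $bucket"
  else
    aws s3api create-bucket --bucket "$bucket" --endpoint-url "$MINIO_ENDPOINT" >/dev/null &&
      echo "created: $bucket"
  fi
}

# Usage:
#   ensure_bucket automq-data
#   ensure_bucket automq-ops
```

Note that `head-bucket` also fails when the credentials are wrong, in which case the subsequent `create-bucket` call reports the underlying error.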
    
    

Install and Start the AutoMQ Cluster

Step 1: Generate S3 URL

AutoMQ provides the automq-kafka-admin.sh tool for quickly starting AutoMQ. Simply provide the S3 URL containing the required S3 endpoint and authentication information, and you can start AutoMQ with one click, without manually generating cluster IDs or performing storage formatting.


### Command Line Usage Example
bin/automq-kafka-admin.sh generate-s3-url \
--s3-access-key=xxx \
--s3-secret-key=yyy \
--s3-region=cn-northwest-1 \
--s3-endpoint=s3.cn-northwest-1.amazonaws.com.cn \
--s3-data-bucket=automq-data \
--s3-ops-bucket=automq-ops

When using MinIO, use the following parameter values to generate the S3 URL.

| Parameter Name | Default Value in This Example | Description |
| --- | --- | --- |
| --s3-access-key | minioadmin | Environment variable MINIO_ROOT_USER |
| --s3-secret-key | minio-secret-key-CHANGE-ME | Environment variable MINIO_ROOT_PASSWORD |
| --s3-region | us-west-2 | MinIO ignores this parameter, so it can be set to any value, such as us-west-2 |
| --s3-endpoint | http://10.1.0.240:9000 | You can obtain the endpoint by running the command sudo systemctl status minio.service |
| --s3-data-bucket | automq-data | - |
| --s3-ops-bucket | automq-ops | - |
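
Put together, the MinIO values above yield a command like the one printed by the sketch below. It is a dry run that only prints the command line; the flags come from the usage example in this step, while `minio_s3url_command` and the `MINIO_ENDPOINT` variable are hypothetical names introduced for illustration.

```shell
# Dry-run sketch: prints the generate-s3-url command filled in with the example
# MinIO values from this guide. Nothing is executed against MinIO or AutoMQ.
minio_s3url_command() {
  echo "bin/automq-kafka-admin.sh generate-s3-url" \
    "--s3-access-key=${MINIO_ROOT_USER:-minioadmin}" \
    "--s3-secret-key=${MINIO_ROOT_PASSWORD:-minio-secret-key-CHANGE-ME}" \
    "--s3-region=us-west-2" \
    "--s3-endpoint=${MINIO_ENDPOINT:-http://10.1.0.240:9000}" \
    "--s3-data-bucket=automq-data" \
    "--s3-ops-bucket=automq-ops"
}

minio_s3url_command
```

Review the printed line, then run it from the directory where the AutoMQ package was unpacked.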

Output Results

Running this command automatically performs the following steps:

  1. Probe the core features of S3 using the provided accessKey and secretKey to verify the compatibility between AutoMQ and S3.

  2. Generate the s3url based on the identity information and access point information.

  3. Obtain the startup command for AutoMQ using the s3url. In the command, replace --controller-list and --broker-list with the actual CONTROLLER and BROKER that need to be deployed.

An example of the execution result is as follows:


############ Ping S3 ########################

[ OK ] Write s3 object
[ OK ] Read s3 object
[ OK ] Delete s3 object
[ OK ] Write s3 object
[ OK ] Upload s3 multipart object
[ OK ] Read s3 multipart object
[ OK ] Delete s3 object
############ String of S3url ################

Your s3url is:

s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=xxx&s3-secret-key=yyy&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA


############ Usage of S3url ################
To start AutoMQ, generate the start commandline using s3url.
bin/automq-kafka-admin.sh generate-start-command \
--s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" \
--controller-list="192.168.0.1:9093;192.168.0.2:9093;192.168.0.3:9093"  \
--broker-list="192.168.0.4:9092;192.168.0.5:9092"

TIPS: Please replace the controller-list and broker-list with your actual IP addresses.
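
If you save the tool output to a file, the S3 URL can be pulled out for reuse in later steps. This is a small sketch that assumes the output layout shown above, where the URL sits on its own line starting with `s3://`; `extract_s3url` and the file name in the usage comment are hypothetical.

```shell
# Print the first line that starts with "s3://" from the text on stdin.
# The quoted --s3-url="s3://..." line is skipped because it starts with "--".
extract_s3url() {
  grep -m1 '^s3://'
}

# Usage (s3url-output.txt is a hypothetical file holding the saved tool output):
#   S3_URL=$(extract_s3url < s3url-output.txt)
```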

Step 2: Generate the Startup Command List

In the command generated in the previous step, replace --controller-list and --broker-list with your own host information: the IP addresses of the 3 CONTROLLERs and 2 BROKERs listed in the Prerequisites, using the default ports 9093 (CONTROLLER) and 9092 (BROKER).


bin/automq-kafka-admin.sh generate-start-command \
--s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" \
--controller-list="192.168.0.1:9093;192.168.0.2:9093;192.168.0.3:9093"  \
--broker-list="192.168.0.4:9092;192.168.0.5:9092"

Parameter Description

| Parameter Name | Required | Description |
| --- | --- | --- |
| --s3-url | Yes | Generated by the bin/automq-kafka-admin.sh generate-s3-url command line tool; includes authentication, cluster ID, and other information. |
| --controller-list | Yes | The IP and port list of the CONTROLLER hosts. At least one address is required. The format is IP1:PORT1;IP2:PORT2;IP3:PORT3 |
| --broker-list | Yes | The IP and port list of the BROKER hosts. At least one address is required. The format is IP1:PORT1;IP2:PORT2;IP3:PORT3 |
| --controller-only-mode | No | Determines whether the CONTROLLER node only assumes the CONTROLLER role. Defaults to false, meaning the deployed CONTROLLER node also acts as a BROKER. |
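
The semicolon-separated list format can be assembled from plain host lists. The helper below is a minimal sketch; `join_hosts` is a hypothetical name, and the IPs and ports in the usage comment are the example values from this guide.

```shell
# Join hosts into "IP1:PORT;IP2:PORT;..." as expected by
# --controller-list and --broker-list.
join_hosts() {
  port="$1"
  shift
  list=""
  for host in "$@"; do
    # Prefix a semicolon only when the list already has an entry.
    list="${list:+$list;}$host:$port"
  done
  echo "$list"
}

# Usage:
#   --controller-list="$(join_hosts 9093 192.168.0.1 192.168.0.2 192.168.0.3)"
#   --broker-list="$(join_hosts 9092 192.168.0.4 192.168.0.5)"
```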

Output Result

After executing the command, a command for starting AutoMQ will be generated.


############ Start Commandline ##############
To start an AutoMQ Kafka server, please navigate to the directory where your AutoMQ tgz file is located and run the following command.

Before running the command, make sure that Java 17 is installed on your host. You can verify the Java version by executing 'java -version'.

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=0 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.1:9092,CONTROLLER://192.168.0.1:9093 --override advertised.listeners=PLAINTEXT://192.168.0.1:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=1 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.2:9092,CONTROLLER://192.168.0.2:9093 --override advertised.listeners=PLAINTEXT://192.168.0.2:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=2 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.3:9092,CONTROLLER://192.168.0.3:9093 --override advertised.listeners=PLAINTEXT://192.168.0.3:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker --override node.id=3 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.4:9092 --override advertised.listeners=PLAINTEXT://192.168.0.4:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker --override node.id=4 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.5:9092 --override advertised.listeners=PLAINTEXT://192.168.0.5:9092


TIPS: Start controllers first and then the brokers.

node.id is automatically generated starting from 0.
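
The generated output above asks for Java 17 and suggests checking with `java -version`. A scripted version of that check might look like the sketch below. Both function names are hypothetical, and the check accepts exactly major version 17; relax the comparison if your AutoMQ release also supports newer Java versions.

```shell
# Print the major Java version. `java -version` writes to stderr, and the
# version string is the first double-quoted field, e.g. "17.0.9".
java_major_version() {
  java -version 2>&1 | awk -F'"' '/version/ { split($2, v, "."); print v[1]; exit }'
}

# Succeed only when the major version is 17.
require_java_17() {
  [ "$(java_major_version)" = "17" ]
}

# Usage:
#   require_java_17 || echo "Java 17 is required" >&2
```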

Step 3: Start AutoMQ

To start the cluster, sequentially execute the command list from the previous step on the designated CONTROLLER or BROKER hosts. For example, to start the first CONTROLLER process on 192.168.0.1, execute the first command from the generated startup command list.


bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=0 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.1:9092,CONTROLLER://192.168.0.1:9093 --override advertised.listeners=PLAINTEXT://192.168.0.1:9092
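
Because controllers should be started before brokers, it can help to wait until a controller port accepts connections before moving on to the next host. The following sketch assumes `nc` (netcat) with the `-z` option is installed; `wait_for_port` is a hypothetical helper, not part of AutoMQ.

```shell
# Poll host:port once per second until it accepts a TCP connection
# or the retry budget (default 30 attempts) runs out.
wait_for_port() {
  host="$1"
  port="$2"
  retries="${3:-30}"
  while [ "$retries" -gt 0 ]; do
    if nc -z "$host" "$port" >/dev/null 2>&1; then
      echo "up: $host:$port"
      return 0
    fi
    retries=$((retries - 1))
    sleep 1
  done
  echo "timeout: $host:$port"
  return 1
}

# Usage: wait_for_port 192.168.0.1 9093 && echo "controller 0 is reachable"
```

An open port only shows that the process is listening; it does not prove that the KRaft quorum has formed.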

Parameter Description

When using the startup command, unspecified parameters will adopt Apache Kafka's default configurations. For parameters newly added by AutoMQ, AutoMQ's default values will be used. To override the default configurations, you can add additional --override key=value parameters at the end of the command.

| Parameter Name | Required | Description |
| --- | --- | --- |
| s3-url | Yes | Generated by the bin/automq-kafka-admin.sh generate-s3-url command line tool; contains authentication, cluster ID, and other information. |
| process.roles | Yes | Options are controller or broker. If a host serves as both CONTROLLER and BROKER, set the value to broker,controller. |
| node.id | Yes | An integer that identifies a BROKER or CONTROLLER. It must be unique within the Kafka cluster. |
| controller.quorum.voters | Yes | The hosts participating in KRaft elections, given as node ID, IP, and port. For example: 0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 |
| listeners | Yes | The IP and port to listen on. |
| advertised.listeners | Yes | The access addresses that the BROKER provides to clients. |
| log.dirs | No | The directory that stores KRaft and BROKER metadata. |
| s3.wal.path | No | In a production environment, it is recommended to store AutoMQ WAL data on a newly mounted raw device for better performance. AutoMQ supports writing data to raw devices, which reduces latency. Make sure the configured path is the one intended for WAL data. |
| autobalancer.controller.enable | No | Defaults to false, which disables traffic self-balancing. When enabled, AutoMQ's auto balancer component automatically reassigns partitions to keep overall traffic balanced. |

Tips: To enable continuous traffic self-balancing, or to run "Example: Self-Balancing When Cluster Nodes Change", it is recommended to explicitly specify --override autobalancer.controller.enable=true when starting the Controller.
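
The controller.quorum.voters value follows a regular pattern: node IDs count up from 0 in the order of the controller list, each paired with the controller's IP and port. The sketch below reproduces that pattern under the assumption that all controllers use port 9093, as in this guide; `quorum_voters` is a hypothetical helper.

```shell
# Build "0@IP1:9093,1@IP2:9093,..." from an ordered list of controller IPs.
quorum_voters() {
  id=0
  voters=""
  for host in "$@"; do
    # Prefix a comma only when the list already has an entry.
    voters="${voters:+$voters,}${id}@${host}:9093"
    id=$((id + 1))
  done
  echo "$voters"
}

# Usage:
#   --override controller.quorum.voters="$(quorum_voters 192.168.0.1 192.168.0.2 192.168.0.3)"
```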

Background Running

To run AutoMQ in the background, append the following to the end of the start command:


command > /dev/null 2>&1 &
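
Redirecting to /dev/null discards console output. If you would rather keep it for troubleshooting, one alternative is to write it to a file and record the process ID. This is a sketch using the standard `nohup` utility; `start_in_background` and the paths in the usage comment are hypothetical.

```shell
# Run a command detached from the terminal, send stdout and stderr to a
# log file, and print the PID of the background process.
start_in_background() {
  log_file="$1"
  shift
  nohup "$@" >"$log_file" 2>&1 &
  echo $!
}

# Usage (the log path and pid file are examples only):
#   start_in_background /var/log/automq-console.log bin/kafka-server-start.sh ... > automq.pid
```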

Data Volume Path

You can view the local data volume using the lsblk command in Linux. The unpartitioned block device is the data volume. In the following example, vdb is the unpartitioned raw block device.


vda    253:0    0   20G  0 disk
├─vda1 253:1    0    2M  0 part
├─vda2 253:2    0  200M  0 part /boot/efi
└─vda3 253:3    0 19.8G  0 part /
vdb    253:16   0   20G  0 disk
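
Picking out the unpartitioned disk can also be scripted. The sketch below reads default-format `lsblk` output on stdin and prints every disk that has no partitions and no mount point; it assumes the column layout shown above (TYPE in the sixth column, mount point in the seventh), and `find_raw_disks` is a hypothetical helper.

```shell
# Print disks that have no child partitions and no mount point.
find_raw_disks() {
  awk '
    $6 == "disk" {
      # A new disk starts: the previous candidate had no children, so report it.
      if (pending != "") print pending
      # Only an unmounted disk (no 7th column) is a candidate.
      pending = (NF < 7) ? $1 : ""
      next
    }
    # Any other row (partition, header, ...) disqualifies the current candidate.
    { pending = "" }
    END { if (pending != "") print pending }
  '
}

# Usage: lsblk | find_raw_disks
```

For the example output above, this prints vdb.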

By default, AutoMQ stores metadata and WAL data in the /tmp directory. If /tmp is mounted on tmpfs, its contents live in memory and are lost on reboot, so this default is not suitable for a production environment.

For production or formal testing environments, it is recommended to move both locations: set the metadata directory log.dirs to a persistent directory, and set the WAL data directory s3.wal.path to the raw block device of the data volume.


bin/kafka-server-start.sh ...\
--override  s3.telemetry.metrics.exporter.type=prometheus \
--override  s3.metrics.exporter.prom.host=0.0.0.0 \
--override  s3.metrics.exporter.prom.port=9090 \
--override  log.dirs=/root/kraft-logs \
--override  s3.wal.path=/dev/vdb \
> /dev/null 2>&1 &

Tips:

  • Change s3.wal.path to the actual name of your local raw device. If you instead store AutoMQ's Write-Ahead Log (WAL) as a file on a local SSD, make sure the file path is on an SSD with more than 10GB of available space. For example, --override s3.wal.path=/home/admin/automq-wal.

  • When deploying AutoMQ in a private data center for production, ensure the reliability of the local SSD, such as using RAID technology.
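
When the WAL is stored as a file rather than on a raw device, the free-space requirement from the tips above can be checked before startup. The sketch below relies on the POSIX `df -Pk` output format; both function names are hypothetical, and the check approximates 10GB as 10 GiB.

```shell
# Print the free space, in whole GiB, of the filesystem that holds a path.
free_gib() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeed when the path has more than the given amount of free GiB (default 10).
wal_path_has_space() {
  [ "$(free_gib "$1")" -gt "${2:-10}" ]
}

# Usage: wal_path_has_space /home/admin || echo "not enough space for the WAL" >&2
```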

At this point, you have completed the AutoMQ cluster deployment on MinIO and have a low-cost, low-latency Kafka cluster that scales within seconds. To try AutoMQ's partition reassignment in seconds and its continuous self-balancing, refer to the official example.
