AutoMQ with Timeplus: Build Cost Efficient Stream Processing Platform
Timeplus is a data analytics platform focused on stream processing. Built on the open-source streaming database Proton, it offers powerful end-to-end capabilities that enable teams to manage both streaming and historical data efficiently and intuitively. It is designed for organizations of all sizes and industries, equipping data engineers and platform engineers with the tools to fully exploit the value of streaming data using SQL.
This article outlines how to use the Timeplus console to import AutoMQ data into Timeplus. Since AutoMQ is fully compatible with Apache Kafka, you can also create a Kafka external stream to analyze AutoMQ data without moving it.
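For the external-stream approach, the stream is defined with SQL in Timeplus (or open-source Proton). Below is a minimal sketch, assuming a broker at 10.0.96.4:9092, a topic named example_topic, and illustrative stream and column names:
-- Read the AutoMQ topic in place via a Kafka external stream;
-- the broker address, topic, and names here are assumptions for illustration.
CREATE EXTERNAL STREAM automq_example
(
    raw string  -- keep each Kafka message as one raw string
)
SETTINGS type = 'kafka',
         brokers = '10.0.96.4:9092',
         topic = 'example_topic';

-- Query the topic directly, without copying the data into Timeplus.
SELECT raw FROM automq_example;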
Refer to Stand-alone Deployment to deploy AutoMQ, ensuring network connectivity between AutoMQ and Timeplus.
If you maintain an IP whitelist, add the static IP of the Timeplus service to it:
52.83.159.13 for cloud.timeplus.com.cn
Quickly create a topic named example_topic in AutoMQ and write test data in JSON format into it by following these steps.
Use the Apache Kafka® command-line tool to create the topic, ensuring you have access to the Kafka environment and that the Kafka service is running. Here is an example command to create the topic:
./kafka-topics.sh --create --topic example_topic --bootstrap-server 10.0.96.4:9092 --partitions 1 --replication-factor 1
When executing the command, replace the topic name and the bootstrap-server address with your actual topic name and Kafka server address.
After creating the topic, you can use the following command to verify if the topic has been successfully created.
./kafka-topics.sh --describe --topic example_topic --bootstrap-server 10.0.96.4:9092
Generate a piece of test data in JSON format, such as the following example.
{
"id": 1,
"name": "测试用户",
"timestamp": "2023-11-10T12:00:00",
"status": "active"
}
Use Kafka's command-line tools or a program to write the test data into the topic named "example_topic". Here is an example using the command-line tool:
echo '{"id": 1, "name": "测试用户", "timestamp": "2023-11-10T12:00:00", "status": "active"}' | sh kafka-console-producer.sh --broker-list 10.0.96.4:9092 --topic example_topic
To view the data just written to the topic, use the following command:
sh kafka-console-consumer.sh --bootstrap-server 10.0.96.4:9092 --topic example_topic --from-beginning
When executing the command, replace the topic name and the bootstrap-server address with your actual topic name and Kafka server address.
- In the left navigation menu, click "Data Ingestion", then click the "Add Data" button in the upper right corner.
- In the pop-up window, review the connectable data sources and other methods of adding data. Since AutoMQ is fully compatible with Apache Kafka®, directly select Apache Kafka®.
- Enter the broker URL, and disable TLS and authentication.
- Enter the AutoMQ topic name and choose the "Read as" data format; JSON, AVRO, and text formats are supported.
- The text format is advisable for saving the entire JSON document as a string, which simplifies handling of later structural changes.
- When opting for AVRO, turn on the "auto extract" feature to save top-level properties as individual columns. Remember to provide the schema registry address, API key, and secret.
- During the "preview" phase, ensure at least one event is displayed; by default, a new data source creates a new stream in Timeplus.
- Assign a name to the stream, confirm the column details, and optionally set the event time column. If left unset, the system defaults to the ingestion time. An existing stream can also be selected.
- After the preview, name the source, add a description, and confirm the settings. Click "Finish", and the stream data will be immediately accessible in the specified stream.
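Once the source is created, the stream can be queried with SQL in the Timeplus console. A minimal sketch, assuming the stream was named example_stream and the top-level JSON attributes were extracted into columns:
-- Streaming query over new events arriving from AutoMQ (names are assumptions).
SELECT id, name, status FROM example_stream WHERE status = 'active';

-- Bounded query over data already stored in the stream.
SELECT count(*) FROM table(example_stream);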
When utilizing an AutoMQ data source, adhere to these restrictions:
- Currently, only messages in AutoMQ Kafka topics formatted as JSON or AVRO are supported.
- Top-level JSON attributes will be converted into stream columns. Nested attributes will be stored as String columns, which can then be queried with the JSON functions.
- Numeric and Boolean types in JSON messages will be converted to the corresponding types in the stream.
- Date-time and timestamp values will be stored as string columns. They can be converted back to DateTime with the to_time function, as shown in the sketch after this list.
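For example, if the entire JSON document was saved in a single string column (the text format recommended above), fields can be extracted and converted at query time. A minimal sketch, assuming a stream named example_stream with one string column raw holding the JSON shown earlier:
-- Extract fields from the raw JSON string and convert the timestamp to DateTime;
-- the stream and column names are assumptions for illustration.
SELECT
    json_extract_int(raw, 'id') AS id,
    json_extract_string(raw, 'name') AS name,
    to_time(json_extract_string(raw, 'timestamp')) AS event_time,
    json_extract_string(raw, 'status') AS status
FROM example_stream;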