Add Clickhouse integration docs (#1775)
danthelion authored Nov 14, 2024
1 parent 9e9899f commit f8c1d8d
Showing 9 changed files with 73 additions and 17 deletions.
10 changes: 5 additions & 5 deletions site/docs/guides/dekaf_reading_collections_from_kafka.md
@@ -30,8 +30,8 @@ walk you through the steps to connect to Estuary Flow using Dekaf and its schema

To connect to Estuary Flow via Dekaf, you need the following connection details:

- **Broker Address**: `dekaf.estuary.dev`
- **Schema Registry Address**: `https://dekaf.estuary.dev`
- **Broker Address**: `dekaf.estuary-data.com`
- **Schema Registry Address**: `https://dekaf.estuary-data.com`
- **Security Protocol**: `SASL_SSL`
- **SASL Mechanism**: `PLAIN`
- **SASL Username**: `{}`
@@ -57,7 +57,7 @@ from kafka import KafkaConsumer

# Configuration details
conf = {
'bootstrap_servers': 'dekaf.estuary.dev:9092',
'bootstrap_servers': 'dekaf.estuary-data.com:9092',
'security_protocol': 'SASL_SSL',
'sasl_mechanism': 'PLAIN',
'sasl_plain_username': '{}',
@@ -100,10 +100,10 @@ kcat -C \
-X sasl.mechanism=PLAIN \
-X sasl.username="{}" \
-X sasl.password="Your_Estuary_Refresh_Token" \
-b dekaf.estuary.dev:9092 \
-b dekaf.estuary-data.com:9092 \
-t "full/nameof/estuarycollection" \
-p 0 \
-o beginning \
-s avro \
-r https://{}:{Your_Estuary_Refresh_Token}@dekaf.estuary.dev
-r https://{}:{Your_Estuary_Refresh_Token}@dekaf.estuary-data.com
```
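The `-r` flag above embeds the Schema Registry credentials directly in the URL: the literal string `{}` is the username and your refresh token is the password. A small Python sketch (a hypothetical helper, not part of any Estuary SDK) shows how to build that URL safely, percent-encoding both credentials:

```python
from urllib.parse import quote

def schema_registry_url(refresh_token: str, username: str = "{}") -> str:
    """Build Dekaf's credential-embedded Schema Registry URL.

    Dekaf uses the literal string "{}" as the username and an Estuary
    refresh token as the password; both are percent-encoded so that
    special characters survive inside the URL.
    """
    user = quote(username, safe="")
    password = quote(refresh_token, safe="")
    return f"https://{user}:{password}@dekaf.estuary-data.com"
```

For example, `schema_registry_url("my-token")` yields `https://%7B%7D:my-token@dekaf.estuary-data.com`, which can be passed straight to kcat's `-r` flag.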
3 changes: 2 additions & 1 deletion site/docs/reference/Connectors/dekaf/README.md
@@ -11,4 +11,5 @@ functionality enables integrations with the Kafka ecosystem.
- [StarTree](/reference/Connectors/dekaf/dekaf-startree)
- [SingleStore](/reference/Connectors/dekaf/dekaf-singlestore)
- [Imply](/reference/Connectors/dekaf/dekaf-imply)
- [Bytewax](/reference/Connectors/dekaf/dekaf-bytewax)
- [Bytewax](/reference/Connectors/dekaf/dekaf-bytewax)
- [ClickHouse](/reference/Connectors/dekaf/dekaf-clickhouse)
2 changes: 1 addition & 1 deletion site/docs/reference/Connectors/dekaf/dekaf-bytewax.md
@@ -27,7 +27,7 @@ high-throughput, low-latency data processing tasks.
from bytewax.window import TumblingWindowConfig, SystemClockConfig

# Estuary Flow Dekaf configuration
KAFKA_BOOTSTRAP_SERVERS = "dekaf.estuary.dev:9092"
KAFKA_BOOTSTRAP_SERVERS = "dekaf.estuary-data.com:9092"
KAFKA_TOPIC = "/full/nameof/your/collection"

# Parse incoming messages
55 changes: 55 additions & 0 deletions site/docs/reference/Connectors/dekaf/dekaf-clickhouse.md
@@ -0,0 +1,55 @@
# Integrating ClickHouse Cloud with Estuary Flow via Dekaf

## Overview

This guide covers how to integrate ClickHouse Cloud with Estuary Flow using Dekaf, Estuary’s Kafka API compatibility
layer, and ClickPipes for real-time analytics. This integration allows ClickHouse Cloud users to stream data from a vast
array of sources supported by Estuary Flow directly into ClickHouse, using Dekaf for Kafka compatibility.

## Prerequisites

- **[ClickHouse Cloud](https://clickhouse.com/) account** with permissions to configure ClickPipes for data ingestion.
- **[Estuary Flow account](https://dashboard.estuary.dev/register)** with access to Dekaf and necessary connectors (
e.g., Salesforce, databases).
- **Estuary Flow Refresh Token** to authenticate with Dekaf.

---

## Step 1: Configure Data Source in Estuary Flow

1. **Generate an Estuary Refresh Token** ([Generate a refresh token](/guides/how_to_generate_refresh_token)):
- To access the Kafka-compatible topics, create a refresh token in the Estuary Flow dashboard. This token will act
as the password for both the broker and schema registry.

2. **Connect to Dekaf**:
- Estuary Flow will automatically expose your collections as Kafka-compatible topics through Dekaf. No additional
configuration is required.
- Dekaf provides the following connection details:

```
Broker Address: dekaf.estuary-data.com:9092
Schema Registry Address: https://dekaf.estuary-data.com
Security Protocol: SASL_SSL
SASL Mechanism: PLAIN
SASL Username: {}
SASL Password: <Estuary Refresh Token>
Schema Registry Username: {}
Schema Registry Password: <Estuary Refresh Token>
```
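These settings map one-to-one onto a Kafka client configuration. As a sketch (assuming the `kafka-python` client; the helper name is hypothetical), the consumer settings would look like:

```python
def dekaf_consumer_config(refresh_token: str) -> dict:
    """Kafka client settings for Dekaf, kafka-python style (sketch)."""
    return {
        "bootstrap_servers": "dekaf.estuary-data.com:9092",
        "security_protocol": "SASL_SSL",
        "sasl_mechanism": "PLAIN",
        # Dekaf expects the literal string "{}" as the username; the
        # refresh token generated in Step 1 is the password.
        "sasl_plain_username": "{}",
        "sasl_plain_password": refresh_token,
    }
```

The same `{}` / refresh-token pair also authenticates against the schema registry at `https://dekaf.estuary-data.com`, which is what allows ClickPipes to decode the Avro-encoded topics.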

---

## Step 2: Configure ClickPipes in ClickHouse Cloud

1. **Set Up ClickPipes**:
- In ClickHouse Cloud, go to **Integrations** and select **Apache Kafka** as the data source.

2. **Enter Connection Details**:
- Use the connection parameters from the previous step to configure access to Estuary Flow.

3. **Map Data Fields**:
- Ensure that ClickHouse can parse the incoming data properly. Use ClickHouse’s mapping interface to align fields
between Estuary Flow collections and ClickHouse tables.

4. **Provision the ClickPipe**:
- Kick off the integration and allow ClickPipes to set up the pipeline (should complete within a few seconds).
4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-imply.md
@@ -18,7 +18,7 @@ Druid, designed for real-time analytics on streaming and batch data.

5. In the Kafka configuration section, enter the following details:

- **Bootstrap Servers**: `dekaf.estuary.dev:9092`
- **Bootstrap Servers**: `dekaf.estuary-data.com:9092`
- **Topic**: Your Estuary Flow collection name (e.g., `/my-organization/my-collection`)
- **Security Protocol**: `SASL_SSL`
- **SASL Mechanism**: `PLAIN`
@@ -28,7 +28,7 @@ Druid, designed for real-time analytics on streaming and batch data.
6. For the "Input Format", select "avro".

7. Configure the Schema Registry settings:
- **Schema Registry URL**: `https://dekaf.estuary.dev`
- **Schema Registry URL**: `https://dekaf.estuary-data.com`
- **Schema Registry Username**: `{}` (same as SASL Username)
- **Schema Registry Password**: `The same Estuary Refresh Token as above`

4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-materialize.md
@@ -20,7 +20,7 @@ for defining transformations and queries.

CREATE
CONNECTION estuary_connection TO KAFKA (
BROKER 'dekaf.estuary.dev',
BROKER 'dekaf.estuary-data.com',
SECURITY PROTOCOL = 'SASL_SSL',
SASL MECHANISMS = 'PLAIN',
SASL USERNAME = '{}',
@@ -29,7 +29,7 @@ for defining transformations and queries.

CREATE
CONNECTION csr_estuary_connection TO CONFLUENT SCHEMA REGISTRY (
URL 'https://dekaf.estuary.dev',
URL 'https://dekaf.estuary-data.com',
USERNAME = '{}',
PASSWORD = SECRET estuary_refresh_token
);
4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-singlestore.md
@@ -20,7 +20,7 @@ offering high performance for both transactional and analytical workloads.
CREATE TABLE test_table (id NUMERIC, server_name VARCHAR(255), title VARCHAR(255));

CREATE PIPELINE test AS
LOAD DATA KAFKA "dekaf.estuary.dev:9092/demo/wikipedia/recentchange-sampled"
LOAD DATA KAFKA "dekaf.estuary-data.com:9092/demo/wikipedia/recentchange-sampled"
CONFIG '{
"security.protocol":"SASL_SSL",
"sasl.mechanism":"PLAIN",
@@ -34,7 +34,7 @@ offering high performance for both transactional and analytical workloads.
"schema.registry.password": "ESTUARY_ACCESS_TOKEN"
}'
INTO TABLE test_table
FORMAT AVRO SCHEMA REGISTRY 'https://dekaf.estuary.dev'
FORMAT AVRO SCHEMA REGISTRY 'https://dekaf.estuary-data.com'
( id <- id, server_name <- server_name, title <- title );
```
4. Your pipeline should now start ingesting data from Estuary Flow into SingleStore.
4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-startree.md
@@ -18,15 +18,15 @@ low-latency analytics on large-scale data.

![Create StarTree Connection](https://storage.googleapis.com/estuary-marketing-strapi-uploads/uploads//startree_create_connection_548379d134/startree_create_connection_548379d134.png)

- **Bootstrap Servers**: `dekaf.estuary.dev`
- **Bootstrap Servers**: `dekaf.estuary-data.com`
- **Security Protocol**: `SASL_SSL`
- **SASL Mechanism**: `PLAIN`
- **SASL Username**: `{}`
- **SASL Password**: `Your generated Estuary Refresh Token`

5. **Configure Schema Registry**: To decode Avro messages, enable schema registry settings:

- **Schema Registry URL**: `https://dekaf.estuary.dev`
- **Schema Registry URL**: `https://dekaf.estuary-data.com`
- **Schema Registry Username**: `{}` (same as SASL Username)
- **Schema Registry Password**: `The same Estuary Refresh Token as above`

4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-tinybird.md
@@ -14,14 +14,14 @@ In this guide, you'll learn how to use Estuary Flow to push data streams to Tiny

To configure the connection details, use the following settings.

Bootstrap servers: `dekaf.estuary.dev`
Bootstrap servers: `dekaf.estuary-data.com`
SASL Mechanism: `PLAIN`
SASL Username: `{}`
SASL Password: `Estuary Refresh Token` (Generate your token in the Estuary Admin Dashboard)

Tick the Decode Avro messages with Schema Registry box, and use the following settings:

- URL: `https://dekaf.estuary.dev`
- URL: `https://dekaf.estuary-data.com`
- Username: `{}`
- Password: `The same Estuary Refresh Token as above`

