diff --git a/docs/integrations/engines/clickhouse.md b/docs/integrations/engines/clickhouse.md
index 941003ab8..b5d5eb55c 100644
--- a/docs/integrations/engines/clickhouse.md
+++ b/docs/integrations/engines/clickhouse.md
@@ -1,67 +1,67 @@
-# Clickhouse
+# ClickHouse
-This page describes SQLMesh support for the Clickhouse engine, including configuration options specific to Clickhouse.
+This page describes SQLMesh support for the ClickHouse engine, including configuration options specific to ClickHouse.
!!! note
- Clickhouse may not be used for the SQLMesh [state connection](../../reference/configuration.md#connections).
+ ClickHouse may not be used for the SQLMesh [state connection](../../reference/configuration.md#connections).
## Background
-[Clickhouse](https://clickhouse.com/) is a distributed, column-oriented SQL engine designed to rapidly execute analytical workloads.
+[ClickHouse](https://clickhouse.com/) is a distributed, column-oriented SQL engine designed to rapidly execute analytical workloads.
It provides users fine-grained control of its behavior, but that control comes at the cost of complex configuration.
-This section provides background information about Clickhouse, providing context for how to use SQLMesh with the Clickhouse engine.
+This section provides background information about ClickHouse and context for using SQLMesh with the ClickHouse engine.
### Object naming
Most SQL engines use a three-level hierarchical naming scheme: tables/views are nested within _schemas_, and schemas are nested within _catalogs_. For example, the full name of a table might be `my_catalog.my_schema.my_table`.
-Clickhouse instead uses a two-level hierarchical naming scheme that has no counterpart to _catalog_. In addition, it calls the second level in the hierarchy "databases." SQLMesh and its documentation refer to this second level as "schemas."
+ClickHouse instead uses a two-level hierarchical naming scheme that has no counterpart to _catalog_. In addition, it calls the second level in the hierarchy "databases." SQLMesh and its documentation refer to this second level as "schemas."
-SQLMesh fully supports Clickhouse's two-level naming scheme without user action.
+SQLMesh fully supports ClickHouse's two-level naming scheme without user action.
### Table engines
-Every Clickhouse table is created with a ["table engine" that controls how the table's data is stored and queried](https://clickhouse.com/docs/en/engines/table-engines). Clickhouse's (and SQLMesh's) default table engine is `MergeTree`.
+Every ClickHouse table is created with a ["table engine" that controls how the table's data is stored and queried](https://clickhouse.com/docs/en/engines/table-engines). ClickHouse's (and SQLMesh's) default table engine is `MergeTree`.
The [`MergeTree` engine family](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree) requires that every table be created with an `ORDER BY` clause.
SQLMesh will automatically inject an empty `ORDER BY` clause into every `MergeTree` family table's `CREATE` statement, or you can specify the columns/expressions by which the table should be ordered.
-### Clickhouse modes of operation
+### ClickHouse modes of operation
-Conceptually, it may be helpful to view Clickhouse as having three modes of operation: single server, cluster, and Clickhouse Cloud. SQLMesh supports all three modes.
+Conceptually, it may be helpful to view ClickHouse as having three modes of operation: single server, cluster, and ClickHouse Cloud. SQLMesh supports all three modes.
#### Single server mode
-Single server mode is similar to other SQL engines: aside from choosing each table's engine, you do not need to worry about how computations are executed. You issue standard SQL commands/queries, and Clickhouse executes them.
+Single server mode is similar to other SQL engines: aside from choosing each table's engine, you do not need to worry about how computations are executed. You issue standard SQL commands/queries, and ClickHouse executes them.
#### Cluster mode
-Cluster mode allows you to scale your Clickhouse engine to any number of networked servers. This enables massive workloads, but requires that you specify how computations are executed by the networked servers.
+Cluster mode allows you to scale your ClickHouse engine to any number of networked servers. This enables massive workloads, but requires that you specify how computations are executed by the networked servers.
-Clickhouse coordinates the computations on the networked servers with [Clickhouse Keeper](https://clickhouse.com/docs/en/architecture/horizontal-scaling) (it also supports [Apache ZooKeeper](https://zookeeper.apache.org/)).
+ClickHouse coordinates the computations on the networked servers with [ClickHouse Keeper](https://clickhouse.com/docs/en/architecture/horizontal-scaling) (it also supports [Apache ZooKeeper](https://zookeeper.apache.org/)).
You specify named virtual clusters of servers in the Keeper configuration, and those clusters provide namespaces for data objects and computations. For example, you might include all networked servers in the cluster you name `MyCluster`.
-In general, you must be connected to a Clickhouse server to execute commands. By default, each command you execute runs in single-server mode on the server you are connected to.
+In general, you must be connected to a ClickHouse server to execute commands. By default, each command you execute runs in single-server mode on the server you are connected to.
To associate an object with a cluster, DDL commands that create or modify it must include the text `ON CLUSTER [your cluster name]`.
If you provide a cluster name in your SQLMesh connection configuration, SQLMesh will automatically inject the `ON CLUSTER` statement into the DDL commands for all objects created while executing the project. We provide more information about clusters in SQLMesh [below](#cluster-specification).
-#### Clickhouse Cloud mode
+#### ClickHouse Cloud mode
-[Clickhouse Cloud](https://clickhouse.com/cloud) is a managed Clickhouse platform. It allows you to scale Clickhouse without administering a cluster yourself or modifying your SQL commands to run on the cluster.
+[ClickHouse Cloud](https://clickhouse.com/cloud) is a managed ClickHouse platform. It allows you to scale ClickHouse without administering a cluster yourself or modifying your SQL commands to run on the cluster.
-Clickhouse Cloud automates Clickhouse's cluster controls, which sometimes constrains Clickhouse's flexibility or how you execute SQL commands. For example, creating a table with a `SELECT` command must [occur in two steps on Clickhouse Cloud](https://clickhouse.com/docs/en/sql-reference/statements/create/table#from-select-query). SQLMesh handles this limitation for you.
+ClickHouse Cloud automates ClickHouse's cluster controls, which sometimes constrains ClickHouse's flexibility or how you execute SQL commands. For example, creating a table with a `SELECT` command must [occur in two steps on ClickHouse Cloud](https://clickhouse.com/docs/en/sql-reference/statements/create/table#from-select-query). SQLMesh handles this limitation for you.
-Aside from those constraints, Clickhouse Cloud mode is similar to single server mode - you run standard SQL commands/queries, and Clickhouse Cloud executes them.
+Aside from those constraints, ClickHouse Cloud mode is similar to single server mode - you run standard SQL commands/queries, and ClickHouse Cloud executes them.
## Cluster specification
-A Clickhouse cluster allows multiple networked Clickhouse servers to operate on the same data object. Every cluster must be named in the Clickhouse configuration files, and that name is passed to a table's DDL statements in the `ON CLUSTER` clause.
+A ClickHouse cluster allows multiple networked ClickHouse servers to operate on the same data object. Every cluster must be named in the ClickHouse configuration files, and that name is passed to a table's DDL statements in the `ON CLUSTER` clause.
For example, we could create a table `my_schema.my_table` on cluster `TheCluster` like this: `CREATE TABLE my_schema.my_table ON CLUSTER TheCluster (col1 Int8)`.
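As a rough illustration of what "injecting" the clause means, the sketch below builds the DDL string from the example above with and without a cluster name. The `create_table_ddl` helper is hypothetical and exists only for this illustration; it is not SQLMesh's implementation, which may handle many more cases.

```python
from typing import Optional


def create_table_ddl(table: str, columns: str, cluster: Optional[str] = None) -> str:
    """Build a CREATE TABLE statement (hypothetical helper, not a SQLMesh API).

    The ON CLUSTER clause is added only when a cluster name is provided.
    """
    on_cluster = f" ON CLUSTER {cluster}" if cluster else ""
    return f"CREATE TABLE {table}{on_cluster} ({columns})"


# With a cluster name, the clause is placed right after the table name.
clustered = create_table_ddl("my_schema.my_table", "col1 Int8", cluster="TheCluster")

# Without a cluster name, the statement runs only on the connected server.
standalone = create_table_ddl("my_schema.my_table", "col1 Int8")

print(clustered)
print(standalone)
```

The first statement matches the `TheCluster` example above; the second is what a standalone server would receive.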
@@ -71,7 +71,7 @@ SQLMesh will automatically inject the `ON CLUSTER` clause and cluster name you p
## Model definition
-This section describes how you control a table's engine and other Clickhouse-specific functionality in SQLMesh models.
+This section describes how you control a table's engine and other ClickHouse-specific functionality in SQLMesh models.
### Table engine
@@ -156,7 +156,7 @@ Note that there is an `=` between the `primary_key` key name and value `col1`.
### TTL
-Clickhouse tables accept a [TTL expression that triggers actions](https://clickhouse.com/docs/en/guides/developer/ttl) like deleting rows after a certain amount of time has passed.
+ClickHouse tables accept a [TTL expression that triggers actions](https://clickhouse.com/docs/en/guides/developer/ttl) like deleting rows after a certain amount of time has passed.
Similar to `ORDER_BY` and `PRIMARY_KEY`, specify a TTL key in the model DDL's `physical_properties` dictionary. For example:
@@ -180,7 +180,7 @@ Note that there is an `=` between the `ttl` key name and value `timestamp + INTE
### Partitioning
-Some Clickhouse table engines support partitioning. Specify the partitioning columns/expressions in the model DDL's `partitioned_by` key.
+Some ClickHouse table engines support partitioning. Specify the partitioning columns/expressions in the model DDL's `partitioned_by` key.
For example, you could partition by columns `col1` and `col2` like this:
@@ -200,13 +200,13 @@ Learn more below about how SQLMesh uses [partitioned tables to improve performan
## Settings
-Clickhouse supports an [immense number of settings](https://clickhouse.com/docs/en/operations/settings), many of which can be altered in multiple places: Clickhouse configuration files, Python client connection arguments, DDL statements, SQL queries, and others.
+ClickHouse supports an [immense number of settings](https://clickhouse.com/docs/en/operations/settings), many of which can be altered in multiple places: ClickHouse configuration files, Python client connection arguments, DDL statements, SQL queries, and others.
-This section discusses how to control Clickhouse settings in SQLMesh.
+This section discusses how to control ClickHouse settings in SQLMesh.
### Connection settings
-SQLMesh connects to Python with the [`clickhouse-connect` library](https://clickhouse.com/docs/en/integrations/python). Its connection method accepts a dictionary of arbitrary settings that are passed to Clickhouse.
+SQLMesh connects to ClickHouse with the Python [`clickhouse-connect` library](https://clickhouse.com/docs/en/integrations/python). Its connection method accepts a dictionary of arbitrary settings that are passed to ClickHouse.
Specify these settings in the `connection_settings` key. This example demonstrates how to set the `distributed_ddl_task_timeout` setting to `300`:
@@ -226,7 +226,7 @@ clickhouse_gateway:
### DDL settings
-Clickhouse settings may also be specified in DDL commands like `CREATE`.
+ClickHouse settings may also be specified in DDL commands like `CREATE`.
Specify these settings in a model DDL's [`physical_properties` key](https://sqlmesh.readthedocs.io/en/stable/concepts/models/overview/?h=physical#physical_properties) (where the [`order_by`](#order-by) and [`primary_key`](#primary-key) values are specified, if present).
@@ -250,7 +250,7 @@ Note that there is an `=` between the `index_granularity` key name and value `12
### Query settings
-Clickhouse settings may be specified directly in a model's query with the `SETTINGS` keyword.
+ClickHouse settings may be specified directly in a model's query with the `SETTINGS` keyword.
This example demonstrates setting the `join_use_nulls` setting to `1`:
@@ -270,7 +270,7 @@ Multiple settings may be specified in a query with repeated use of the `SETTINGS
#### Usage by SQLMesh
-The Clickhouse setting `join_use_nulls` affects the behavior of SQLMesh SCD models and table diffs. This section describes how SQLMesh uses query settings to control that behavior.
+The ClickHouse setting `join_use_nulls` affects the behavior of SQLMesh SCD models and table diffs. This section describes how SQLMesh uses query settings to control that behavior.
^^Background^^
@@ -280,7 +280,7 @@ For example, consider `LEFT JOIN`ing two tables `left` and `right`, where the co
In other SQL engines, those empty cells are filled with `NULL`s.
-In contrast, Clickhouse fills the empty cells with data type-specific default values (e.g., 0 for integer column types). It will instead fill the cells with `NULL`s if you set the `join_use_nulls` setting to `1`.
+In contrast, ClickHouse fills the empty cells with data type-specific default values (e.g., 0 for integer column types). It will instead fill the cells with `NULL`s if you set the `join_use_nulls` setting to `1`.
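The difference can be sketched in plain Python, without a ClickHouse server. The `left_join` function and the `TYPE_DEFAULTS` mapping below are assumptions made for this illustration only: they mimic the fill behavior described above for two column types and are not part of ClickHouse or SQLMesh.

```python
# Illustrative defaults for two ClickHouse column types (assumed subset).
TYPE_DEFAULTS = {"Int32": 0, "String": ""}


def left_join(left_rows, right_rows, key, right_types, join_use_nulls=False):
    """Simulate a LEFT JOIN on `key`, assuming keys are unique in `right_rows`.

    `right_types` maps each non-key column of the right table to its type.
    Unmatched rows get type defaults, or None when `join_use_nulls` is True.
    """
    right_by_key = {row[key]: row for row in right_rows}
    joined = []
    for left_row in left_rows:
        match = right_by_key.get(left_row[key])
        out = dict(left_row)
        for col, col_type in right_types.items():
            if match is not None:
                out[col] = match[col]
            elif join_use_nulls:
                out[col] = None  # behaves like join_use_nulls = 1
            else:
                out[col] = TYPE_DEFAULTS[col_type]  # default fill
        joined.append(out)
    return joined


left = [{"id": 1}, {"id": 2}]
right = [{"id": 1, "amount": 10}]
right_types = {"amount": "Int32"}

default_fill = left_join(left, right, "id", right_types)
null_fill = left_join(left, right, "id", right_types, join_use_nulls=True)

print(default_fill)
print(null_fill)
```

In the default case, the unmatched row with `id` 2 is indistinguishable from a row whose real `amount` is 0, which is why queries that test for `NULL` need `join_use_nulls = 1`.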
^^SQLMesh^^
@@ -288,7 +288,7 @@ SQLMesh automatically generates SQL queries for both SCD Type 2 models and table
Because those queries expect `NULL` values in empty cells, SQLMesh automatically adds `SETTINGS join_use_nulls = 1` to the generated SCD and table diff SQL code.
-The SCD model definition query is embedded as a CTE in the full SQLMesh-generated query. If run alone, the model definition query would use the Clickhouse server's current `join_use_nulls` value.
+The SCD model definition query is embedded as a CTE in the full SQLMesh-generated query. If run alone, the model definition query would use the ClickHouse server's current `join_use_nulls` value.
If that value is not `1`, the SQLMesh setting on the outer query would override the server value and produce incorrect results.
@@ -301,9 +301,9 @@ Therefore, SQLMesh uses the following procedure to ensure the model definition q
## Performance considerations
-Clickhouse is optimized for writing/reading records, so deleting/replacing records can be extremely slow.
+ClickHouse is optimized for writing/reading records, so deleting/replacing records can be extremely slow.
-This section describes why SQLMesh needs to delete/replace records and how the Clickhouse engine adapter works around the limitations.
+This section describes why SQLMesh needs to delete/replace records and how the ClickHouse engine adapter works around the limitations.
### Why delete or replace?
@@ -320,13 +320,13 @@ Some engines natively support updating or inserting ("upserting") records. For e
Other engines do not natively support upserts, so SQLMesh replaces records in two steps: delete the records to update/replace from the existing table, then insert the new records.
-Clickhouse does not support upserts, and it performs the two step delete/insert operation so slowly as to be unusable. Therefore, SQLMesh uses a different method for replacing records.
+ClickHouse does not support upserts, and it performs the two-step delete/insert operation so slowly as to be unusable. Therefore, SQLMesh uses a different method for replacing records.
### Temp table swap
-SQLMesh uses what we call the "temp table swap" method of replacing records in Clickhouse.
+SQLMesh uses what we call the "temp table swap" method of replacing records in ClickHouse.
-Because Clickhouse is optimized for writing and reading records, it is often faster to copy most of a table than to delete a small portion of its records. That is the approach used by the temp table swap method (with optional performance improvements [for partitioned tables](#partition-swap)).
+Because ClickHouse is optimized for writing and reading records, it is often faster to copy most of a table than to delete a small portion of its records. That is the approach used by the temp table swap method (with optional performance improvements [for partitioned tables](#partition-swap)).
The temp table swap has four steps:
@@ -338,7 +338,7 @@ The temp table swap has four steps:
Figure 1 illustrates these four steps:
-![Clickhouse table swap steps](./clickhouse/clickhouse_table-swap-steps.png){ loading=lazy }
+![ClickHouse table swap steps](./clickhouse/clickhouse_table-swap-steps.png){ loading=lazy }
_Figure 1: steps to execute a temp table swap_
@@ -348,7 +348,7 @@ To address this weakness, SQLMesh instead uses *partition* swapping if a table i
### Partition swap
-Clickhouse supports *partitioned* tables, which store groups of records in separate files, or "partitions."
+ClickHouse supports *partitioned* tables, which store groups of records in separate files, or "partitions."
A table is partitioned based on a table column or SQL expression - the "partitioning key." All records with the same value for the partitioning key are stored together in a partition.
@@ -368,7 +368,7 @@ Too many partitions can drastically decrease performance because the overhead of
!!! question "How many partitions is too many?"
- Clickhouse's documentation [specifically warns against tables having too many partitions](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/custom-partitioning-key), suggesting a maximum of 1000.
+ ClickHouse's documentation [specifically warns against tables having too many partitions](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/custom-partitioning-key), suggesting a maximum of 1000.
The total number of partitions in a table is determined by the actual data in the table, not by the partition column/expression alone.
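To make that point concrete, here is a minimal sketch that counts the partitions a partitioning expression would produce for a small set of rows. The `count_partitions` helper and the sample rows are hypothetical and used only for illustration; in practice you would count distinct key values in your actual table.

```python
from datetime import date


def count_partitions(rows, partition_key):
    """Count distinct partitioning key values, i.e. the resulting partitions."""
    return len({partition_key(row) for row in rows})


# Hypothetical sample data: four records spanning two months.
rows = [
    {"event_date": date(2024, 1, 1)},
    {"event_date": date(2024, 1, 1)},
    {"event_date": date(2024, 1, 15)},
    {"event_date": date(2024, 2, 3)},
]

# Partitioning by day: one partition per distinct date in the data.
by_day = count_partitions(rows, lambda r: r["event_date"])

# Partitioning by month: one partition per distinct (year, month) pair.
by_month = count_partitions(rows, lambda r: (r["event_date"].year, r["event_date"].month))

print(by_day, by_month)
```

The same expression applied to data covering a longer date range would produce more partitions, so the count depends on the data as much as on the expression.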
@@ -401,12 +401,12 @@ If a model has many records in each partition, you may see additional performanc
| `type` | Engine type name - must be `clickhouse` | string | Y |
| `user` | The username to log in to your server. | string | Y |
| `host` | The hostname of your server. Do not include the `http://` or `https://` prefix. | string | Y |
-| `port` | The port to connect to your server. Default: 8123 for non-encrypted connections, 8443 for encrypted connections and Clickhouse Cloud. | int | Y |
-| `cluster` | Name of the Clickhouse cluster on which SQLMesh should create objects. Should not be specified for standalone Clickhouse servers or Clickhouse Cloud. | string | N |
+| `port` | The port to connect to your server. Default: 8123 for non-encrypted connections, 8443 for encrypted connections and ClickHouse Cloud. | int | Y |
+| `cluster` | Name of the ClickHouse cluster on which SQLMesh should create objects. Should not be specified for standalone ClickHouse servers or ClickHouse Cloud. | string | N |
| `use_compression` | Enable compression for ClickHouse HTTP inserts and query results. Default: True | bool | N |
| `compression_method` | Use a specific compression method for inserts and query results - allowed values `lz4`, `zstd`, `br`, or `gzip`. | str | N |
| `query_limit` | Maximum number of rows to return for any query response. Default: 0 (unlimited rows) | int | N |
| `connect_timeout` | HTTP connection timeout in seconds. Default: 10 | int | N |
| `send_receive_timeout` | Send/receive timeout for the HTTP connection in seconds. Default: 300 | int | N |
| `verify` | Validate the ClickHouse server TLS/SSL certificate (hostname, expiration, etc.) if using HTTPS/TLS. Default: True | bool | N |
-| `connection_settings` | Arbitrary Clickhouse settings passed to the [`clickhouse-connect` client](https://clickhouse.com/docs/en/integrations/python#client-initialization). | dict[str, any] | N |
\ No newline at end of file
+| `connection_settings` | Arbitrary ClickHouse settings passed to the [`clickhouse-connect` client](https://clickhouse.com/docs/en/integrations/python#client-initialization). | dict[str, any] | N |
\ No newline at end of file