Configuring ClickHouse Keeper
ClickHouse Keeper is a built-in distributed coordination service for ClickHouse, responsible for replication and distributed DDL query execution. Keeper is fully protocol-compatible with ZooKeeper but is implemented in C++ and uses the RAFT consensus algorithm, ensuring linearizable writes and more predictable behavior during failures.
Separate configuration of ClickHouse Keeper is only required in cluster scenarios: when you have multiple ClickHouse nodes and are using replicated tables (ReplicatedMergeTree) or ON CLUSTER distributed queries. In a single-server installation, Keeper is not mandatory and can be omitted.
Keeper Deployment Options
ClickHouse Keeper can operate in two modes:
- As a standalone service. It is installed via the clickhouse-keeper package and runs as a separate process with its own configuration file:
  - main configuration file: /etc/clickhouse-keeper/keeper_config.xml
  - additional files: /etc/clickhouse-keeper/keeper_config.d/*.xml or *.yaml
- As part of the clickhouse-server process. In this case, the <keeper_server> configuration block is added to the main server configuration:
  - main file: /etc/clickhouse-server/config.xml
  - or a separate file in /etc/clickhouse-server/config.d/keeper.xml
The recommended approach for production is to use separate ClickHouse Keeper nodes and separate configuration files in /etc/clickhouse-keeper/ to independently scale and maintain the coordination cluster.
Basic ClickHouse Keeper Configuration Structure
The main configuration block for Keeper is the <keeper_server> element. Typically, it includes:
Keeper Configuration Block
<clickhouse>
<logger>
<level>trace</level>
<log>/var/log/clickhouse-keeper/clickhouse-keeper.log</log>
<errorlog>/var/log/clickhouse-keeper/clickhouse-keeper.err.log</errorlog>
<size>1000M</size>
<count>10</count>
</logger>
<max_connections>4096</max_connections>
<listen_host>0.0.0.0</listen_host>
<keeper_server>
<!-- Port for client (ClickHouse servers or applications) connections to Keeper -->
<tcp_port>9181</tcp_port>
<!-- Unique identifier of the Keeper node in the cluster -->
<server_id>1</server_id>
<log_storage_path>/var/lib/clickhouse/coordination/logs</log_storage_path>
<snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
<!-- Internal coordination settings -->
<coordination_settings>
<operation_timeout_ms>10000</operation_timeout_ms>
<min_session_timeout_ms>10000</min_session_timeout_ms>
<session_timeout_ms>100000</session_timeout_ms>
<raft_logs_level>information</raft_logs_level>
<compress_logs>false</compress_logs>
</coordination_settings>
<!-- enable sanity hostname checks for cluster configuration (e.g. if localhost is used with remote endpoints) -->
<hostname_checks_enabled>true</hostname_checks_enabled>
<!-- Description of all Keeper nodes participating in the quorum -->
<raft_configuration>
<server>
<id>1</id>
<!-- Internal port and hostname -->
<hostname>ch-keeper-01</hostname>
<port>9234</port>
</server>
<server>
<id>2</id>
<!-- Internal port and hostname -->
<hostname>ch-keeper-02</hostname>
<port>9234</port>
</server>
<server>
<id>3</id>
<!-- Internal port and hostname -->
<hostname>ch-keeper-03</hostname>
<port>9234</port>
</server>
<!-- Add more servers here -->
</raft_configuration>
</keeper_server>
<openSSL>
<server>
<!-- Used for secure tcp port -->
<!-- openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout /etc/clickhouse-server/server.key -out /etc/clickhouse-server/server.crt -->
<!-- <certificateFile>/etc/clickhouse-keeper/server.crt</certificateFile> -->
<!-- <privateKeyFile>/etc/clickhouse-keeper/server.key</privateKeyFile> -->
<!-- dhparams are optional. You can delete the <dhParamsFile> element.
To generate dhparams, use the following command:
openssl dhparam -out /etc/clickhouse-keeper/dhparam.pem 4096
Only file format with BEGIN DH PARAMETERS is supported.
-->
<!-- <dhParamsFile>/etc/clickhouse-keeper/dhparam.pem</dhParamsFile> -->
<verificationMode>none</verificationMode>
<loadDefaultCAFile>true</loadDefaultCAFile>
<cacheSessions>true</cacheSessions>
<disableProtocols>sslv2,sslv3</disableProtocols>
<preferServerCiphers>true</preferServerCiphers>
</server>
</openSSL>
</clickhouse>
Key parameters:
- tcp_port — port for client connections (ClickHouse servers, clickhouse-keeper-client, utilities). Recommended value: 9181 (to avoid conflict with the standard ZooKeeper port 2181)
- server_id — unique numeric identifier for the Keeper node. Values must be unique across the cluster. A simple sequence like 1, 2, 3, ... is recommended
- log_storage_path — path for RAFT coordination logs
- snapshot_storage_path — directory for state snapshots (compressed znode tree state)
- coordination_settings — detailed configuration for timeouts, heartbeat frequency, snapshot, and log parameters. In most cases, the basic values from the example are sufficient
- raft_configuration — description of all participants in the RAFT quorum
The Keeper cluster should consist of an odd number of nodes (typically 3 or 5). This is necessary to achieve quorum and improve fault tolerance.
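The fault-tolerance trade-off behind the odd-node recommendation follows from simple quorum arithmetic: a majority quorum of N nodes is floor(N/2) + 1, so an ensemble survives floor((N-1)/2) node failures. A quick sketch:

```shell
# Quorum size and fault tolerance for an N-node Keeper ensemble:
# a write needs a majority (N/2 + 1), so up to (N-1)/2 nodes may fail.
for n in 1 2 3 4 5; do
  echo "$n nodes: quorum=$(( n / 2 + 1 )), tolerates $(( (n - 1) / 2 )) failure(s)"
done
```

Note that a 4-node ensemble tolerates exactly as many failures as a 3-node one, which is why even sizes add cost and coordination overhead without adding fault tolerance.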
Configuring the RAFT Quorum (<raft_configuration>)
The <raft_configuration> block describes all Keeper nodes participating in the RAFT quorum:
<raft_configuration>
<secure>false</secure>
<server>
<id>1</id>
<hostname>ch-keeper-01</hostname>
<port>9234</port>
</server>
<server>
<id>2</id>
<hostname>ch-keeper-02</hostname>
<port>9234</port>
</server>
<server>
<id>3</id>
<hostname>ch-keeper-03</hostname>
<port>9234</port>
</server>
</raft_configuration>
For each <server>, the following is defined:
- id — the server's identifier within the RAFT quorum. This must match the server_id in the <keeper_server> block on the corresponding node
- hostname — the hostname by which other nodes can contact this Keeper. Using DNS names rather than IP addresses is recommended to maintain a stable server_id ↔ hostname mapping
- port — the port for internal communication between Keeper nodes (inter-server RAFT port). This is different from the tcp_port used by clients
When replacing or migrating a Keeper node, it is crucial not to reuse an old server_id for a different physical server and not to "mix up" the server_id ↔ hostname mapping. This is critical for the correctness of the RAFT quorum.
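As an illustration, if ch-keeper-02 were decommissioned and a replacement host joined the quorum (the name ch-keeper-04 here is hypothetical), a safer pattern is to assign the newcomer a fresh id rather than recycle id 2:

```xml
<raft_configuration>
    <server>
        <id>1</id>
        <hostname>ch-keeper-01</hostname>
        <port>9234</port>
    </server>
    <!-- id 2 is retired together with ch-keeper-02 and is not reused -->
    <server>
        <id>3</id>
        <hostname>ch-keeper-03</hostname>
        <port>9234</port>
    </server>
    <server>
        <!-- replacement node gets a new, previously unused id -->
        <id>4</id>
        <hostname>ch-keeper-04</hostname>
        <port>9234</port>
    </server>
</raft_configuration>
```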
Internal Coordination Settings
The <coordination_settings> block manages timeouts and RAFT operational parameters:
<coordination_settings>
<operation_timeout_ms>10000</operation_timeout_ms>
<min_session_timeout_ms>10000</min_session_timeout_ms>
<session_timeout_ms>100000</session_timeout_ms>
<raft_logs_level>information</raft_logs_level>
<compress_logs>false</compress_logs>
</coordination_settings>
In most cases, it is sufficient to use the recommended default values. Modifying them is only advisable in the event of specific issues (frequent leader re-elections, unstable network, very large metadata volume).
Configuration Placement and Starting Keeper
Standalone clickhouse-keeper service
In this scenario:
- main configuration file: /etc/clickhouse-keeper/keeper_config.xml
- additional files: /etc/clickhouse-keeper/keeper_config.d/*.xml or *.yaml
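As a sketch of how the keeper_config.d layering can be used (the filename and the overridden value here are illustrative), a drop-in file can override individual settings without editing the main configuration file:

```xml
<!-- /etc/clickhouse-keeper/keeper_config.d/override.xml (hypothetical example) -->
<clickhouse>
    <keeper_server>
        <tcp_port>9181</tcp_port>
    </keeper_server>
</clickhouse>
```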
After configuring, execute the following commands:
sudo systemctl enable clickhouse-keeper
sudo systemctl start clickhouse-keeper
sudo systemctl status clickhouse-keeper
Alternatively, you can start Keeper directly:
clickhouse-keeper --config /etc/clickhouse-keeper/keeper_config.xml
# or
clickhouse keeper --config /etc/clickhouse-keeper/keeper_config.xml
Embedded Keeper within the clickhouse-server process
If Keeper runs as part of clickhouse-server, the <keeper_server> block is added to the server's configuration:
<clickhouse>
<!-- ... other ClickHouse settings ... -->
<keeper_server> ... </keeper_server>
</clickhouse>
After updating the configuration, simply restart the server:
sudo systemctl restart clickhouse-server
In this case, ClickHouse and Keeper share a single process. This approach is typically reserved for small or test setups rather than production environments.
Integrating ClickHouse Keeper with ClickHouse Server
From the perspective of ClickHouse servers, ClickHouse Keeper appears as a ZooKeeper-compatible service. On each ClickHouse node, the coordinators must be specified in the <zookeeper> section:
<clickhouse>
<!-- ... -->
<zookeeper>
<node>
<host>ch-keeper-01</host>
<port>9181</port>
</node>
<node>
<host>ch-keeper-02</host>
<port>9181</port>
</node>
<node>
<host>ch-keeper-03</host>
<port>9181</port>
</node>
</zookeeper>
<!-- Macros for configuring replicated tables -->
<macros>
<cluster>cluster_name</cluster>
<shard>01</shard>
<replica>01</replica>
</macros>
</clickhouse>
It is important that the list of nodes in <zookeeper> matches the actual Keeper cluster configuration (the same hostname and tcp_port specified in <keeper_server>).
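To see the macros in action, a replicated table definition (the database, table, and column names below are illustrative) might look like:

```sql
-- {shard} and {replica} are expanded from the <macros> section on each node,
-- so the same CREATE TABLE statement works on every replica
CREATE TABLE default.events
(
    event_date Date,
    event_id   UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY event_id;
```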
The <listen_host> parameter and IPv6 specifics
By default, ClickHouse creates several listen_host entries to listen on the local interface via IPv4 and IPv6 (127.0.0.1 and ::1). If connections from other hosts need to be accepted, the following line is typically added:
<listen_host>0.0.0.0</listen_host>
This is a wildcard address for IPv4 only: the server will listen on all IPv4 interfaces of the node but will not open ports for IPv6. For systems using only IPv4, this option is preferable and additionally avoids a number of IPv6-related issues.
On some distributions, IPv6 is disabled or not configured in the kernel. In such cases, attempting to listen on the IPv6 wildcard address may lead to errors.
<listen_host>::</listen_host>
If the server has no actual IPv6 support, it is recommended to:
- avoid using <listen_host>::</listen_host>
- explicitly specify the IPv4 option: <listen_host>0.0.0.0</listen_host>
- additionally, disable IPv6 in the ClickHouse/Keeper configuration if necessary: <enable_ipv6>false</enable_ipv6>
This will force ClickHouse to listen only on IPv4 interfaces and eliminate errors related to the absence of IPv6.
If IPv6 is used in the infrastructure and the server must accept connections via both protocols, you can use the IPv6 wildcard address instead of 0.0.0.0:
<listen_host>::</listen_host>
In this case, ClickHouse will open ports on all IPv6 and IPv4 interfaces (provided IPv6 is enabled in the system). This mode should only be enabled in conjunction with proper firewall configuration, network ACLs, and user access policies.
Verifying ClickHouse Keeper Functionality
Via ClickHouse
From the perspective of ClickHouse servers, the Keeper state can be checked by querying the system.zookeeper system table:
SELECT *
FROM system.zookeeper
WHERE path IN ('/', '/clickhouse');
The presence of nodes /clickhouse and service branches (e.g., /clickhouse/task_queue/ddl) indicates that Keeper is accessible and being used for coordination.
Via the clickhouse-keeper-client utility
ClickHouse includes a console utility, clickhouse-keeper-client, which can work with Keeper via its native protocol:
clickhouse-keeper-client -h ch-keeper-01 -p 9181
In interactive mode, commands similar to those in the ZooKeeper client are available:
- ls / — view child znodes
- get '/clickhouse' — read node value
- set '/clickhouse/test' 'value' — write a value
- exists '/clickhouse' — check node existence
This is a convenient way to verify Keeper availability and diagnose the znode tree contents.
Four-letter Commands
Like ZooKeeper, ClickHouse Keeper supports a set of "four-letter" commands, which are sent to the client port via TCP, e.g., using nc:
echo mntr | nc ch-keeper-01 9181
- the mntr command outputs metric values: node state (leader/follower), number of connections, latencies, etc.
- the stat command shows a brief summary of the server and client state, and ruok checks service availability (returns imok if operational)
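Building on mntr, a small script can poll each node and report its role. The hostnames are the example ones from this article, and the parsing assumes the standard tab-separated mntr output:

```shell
# Report each Keeper node's role by extracting zk_server_state from mntr output.
# Requires nc (netcat) and network access to the Keeper client port (9181).
for host in ch-keeper-01 ch-keeper-02 ch-keeper-03; do
  state=$(echo mntr | nc "$host" 9181 | awk '$1 == "zk_server_state" {print $2}')
  echo "$host: ${state:-unreachable}"
done
```

Exactly one node should report leader and the rest follower; anything else (or an unreachable node) is a signal to inspect the quorum.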
Operational Recommendations
- for production clusters, a minimum of 3 ClickHouse Keeper nodes on separate hosts or containers is recommended
- when changing the cluster topology (adding/removing Keeper nodes), carefully ensure that server_id values remain unique and the server_id ↔ hostname mapping is preserved across all configurations