
Configuring ClickHouse Keeper

ClickHouse Keeper is a built-in distributed coordination service for ClickHouse, responsible for replication and distributed DDL query execution. Keeper is fully protocol-compatible with ZooKeeper but is implemented in C++ and uses the RAFT consensus algorithm, ensuring linearizable writes and more predictable behavior during failures.

When ClickHouse Keeper is Needed

Separate configuration of ClickHouse Keeper is only required in cluster scenarios: when you have multiple ClickHouse nodes and are using replicated tables (ReplicatedMergeTree) or ON CLUSTER distributed queries. In a single-server installation, Keeper is not mandatory and can be omitted.
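For example (the database and cluster names below are hypothetical placeholders), the first statement depends on a running Keeper/ZooKeeper service, while the second, a plain MergeTree table on a single server, does not:

# Distributed DDL is executed on every node of the cluster through the Keeper task queue
clickhouse-client --query "CREATE DATABASE analytics ON CLUSTER cluster_name"

# A plain (non-replicated) MergeTree table requires no coordination service
clickhouse-client --query "
    CREATE TABLE analytics.local_events (ts DateTime, value UInt64)
    ENGINE = MergeTree ORDER BY ts"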


Keeper Deployment Options

ClickHouse Keeper can operate in two modes:

  1. As a standalone service. It is installed via the clickhouse-keeper package and runs as a separate process with its own configuration files:

    • main configuration file: /etc/clickhouse-keeper/keeper_config.xml
    • additional files: /etc/clickhouse-keeper/keeper_config.d/*.xml or *.yaml
  2. As part of the clickhouse-server process. In this case, the <keeper_server> configuration block is added to the main server configuration:

    • main file: /etc/clickhouse-server/config.xml
    • or a separate file in /etc/clickhouse-server/config.d/keeper.xml

The recommended approach for production is to use separate ClickHouse Keeper nodes and separate configuration files in /etc/clickhouse-keeper/ to independently scale and maintain the coordination cluster.
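For example, on Debian/Ubuntu systems (assuming the official ClickHouse APT repository is already configured), the standalone service can be installed as a separate package:

sudo apt-get update
sudo apt-get install -y clickhouse-keeper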


Basic ClickHouse Keeper Configuration Structure

The main configuration block for Keeper is the <keeper_server> element. Typically, it includes:

Keeper Configuration Block
<clickhouse>
    <logger>
        <level>trace</level>
        <log>/var/log/clickhouse-keeper/clickhouse-keeper.log</log>
        <errorlog>/var/log/clickhouse-keeper/clickhouse-keeper.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>

    <max_connections>4096</max_connections>

    <listen_host>0.0.0.0</listen_host>

    <keeper_server>
        <!-- Port for client (ClickHouse servers or applications) connections to Keeper -->
        <tcp_port>9181</tcp_port>

        <!-- Unique identifier of the Keeper node in the cluster -->
        <server_id>1</server_id>

        <log_storage_path>/var/lib/clickhouse/coordination/logs</log_storage_path>
        <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>

        <!-- Internal coordination settings -->
        <coordination_settings>
            <operation_timeout_ms>10000</operation_timeout_ms>
            <min_session_timeout_ms>10000</min_session_timeout_ms>
            <session_timeout_ms>100000</session_timeout_ms>
            <raft_logs_level>information</raft_logs_level>
            <compress_logs>false</compress_logs>
        </coordination_settings>

        <!-- Enable sanity hostname checks for cluster configuration (e.g. if localhost is used with remote endpoints) -->
        <hostname_checks_enabled>true</hostname_checks_enabled>

        <!-- Description of all Keeper nodes participating in the quorum -->
        <raft_configuration>
            <server>
                <id>1</id>
                <!-- Internal port and hostname -->
                <hostname>ch-keeper-01</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>2</id>
                <!-- Internal port and hostname -->
                <hostname>ch-keeper-02</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>3</id>
                <!-- Internal port and hostname -->
                <hostname>ch-keeper-03</hostname>
                <port>9234</port>
            </server>

            <!-- Add more servers here -->
        </raft_configuration>
    </keeper_server>

    <openSSL>
        <server>
            <!-- Used for the secure TCP port -->
            <!-- openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout /etc/clickhouse-keeper/server.key -out /etc/clickhouse-keeper/server.crt -->
            <!-- <certificateFile>/etc/clickhouse-keeper/server.crt</certificateFile> -->
            <!-- <privateKeyFile>/etc/clickhouse-keeper/server.key</privateKeyFile> -->
            <!-- dhparams are optional. You can delete the <dhParamsFile> element.
                 To generate dhparams, use the following command:
                 openssl dhparam -out /etc/clickhouse-keeper/dhparam.pem 4096
                 Only the file format with BEGIN DH PARAMETERS is supported.
            -->
            <!-- <dhParamsFile>/etc/clickhouse-keeper/dhparam.pem</dhParamsFile> -->
            <verificationMode>none</verificationMode>
            <loadDefaultCAFile>true</loadDefaultCAFile>
            <cacheSessions>true</cacheSessions>
            <disableProtocols>sslv2,sslv3</disableProtocols>
            <preferServerCiphers>true</preferServerCiphers>
        </server>
    </openSSL>
</clickhouse>

Key parameters:

  • tcp_port — port for client connections (ClickHouse servers, clickhouse-keeper-client, utilities). Recommended value: 9181 (to avoid conflict with the standard ZooKeeper port 2181)
  • server_id — numeric identifier of the Keeper node; it must be unique across the cluster. A simple sequence like 1, 2, 3, ... is recommended
  • log_storage_path — path for RAFT coordination logs
  • snapshot_storage_path — directory for state snapshots (compressed znode tree state)
  • coordination_settings — detailed configuration for timeouts, heartbeat frequency, snapshot, and log parameters. In most cases, the basic values from the example are sufficient
  • raft_configuration — description of all participants in the RAFT quorum

An Odd Number of Nodes is Recommended

The Keeper cluster should consist of an odd number of nodes (typically 3 or 5). RAFT requires a majority quorum of ⌊N/2⌋ + 1 nodes to operate, so a 3-node cluster survives the loss of one node and a 5-node cluster the loss of two; an even number of nodes raises the quorum size without improving fault tolerance.


Configuring the RAFT Quorum (<raft_configuration>)

The <raft_configuration> block describes all Keeper nodes participating in the RAFT quorum:

<raft_configuration>
    <secure>false</secure>

    <server>
        <id>1</id>
        <hostname>ch-keeper-01</hostname>
        <port>9234</port>
    </server>
    <server>
        <id>2</id>
        <hostname>ch-keeper-02</hostname>
        <port>9234</port>
    </server>
    <server>
        <id>3</id>
        <hostname>ch-keeper-03</hostname>
        <port>9234</port>
    </server>
</raft_configuration>

For each <server>, the following is defined:

  • id — the server's identifier within the RAFT quorum. This must match the server_id in the <keeper_server> block on the corresponding node
  • hostname — the hostname by which other nodes can contact this Keeper. Using DNS names rather than IP addresses is recommended to maintain a stable server_id ↔ hostname mapping
  • port — the port for internal communication between Keeper nodes (inter-server RAFT port). This is different from the tcp_port used by clients

Identifier Stability

When replacing or migrating a Keeper node, it is crucial not to reuse an old server_id for a different physical server and not to "mix up" the server_id ↔ hostname mapping. This is critical for the correctness of the RAFT quorum.
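Note that two different ports are involved: the client port (tcp_port, 9181 in this guide) and the inter-server RAFT port (9234). Both must be reachable between the hosts. A minimal firewalld sketch, assuming the port numbers used throughout this section:

# Client connections from ClickHouse servers and utilities
sudo firewall-cmd --permanent --add-port=9181/tcp

# Internal RAFT traffic between Keeper nodes
sudo firewall-cmd --permanent --add-port=9234/tcp

sudo firewall-cmd --reload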


Internal Coordination Settings

The <coordination_settings> block manages timeouts and RAFT operational parameters:

<coordination_settings>
    <operation_timeout_ms>10000</operation_timeout_ms>
    <min_session_timeout_ms>10000</min_session_timeout_ms>
    <session_timeout_ms>100000</session_timeout_ms>
    <raft_logs_level>information</raft_logs_level>
    <compress_logs>false</compress_logs>
</coordination_settings>

In most cases, it is sufficient to use the recommended default values. Modifying them is only advisable in the event of specific issues (frequent leader re-elections, unstable network, very large metadata volume).
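If a value does need to be changed, a clean way is to place a small override file in the keeper_config.d directory instead of editing the main configuration. A sketch for the standalone layout described above (the file name coordination.xml is arbitrary):

sudo tee /etc/clickhouse-keeper/keeper_config.d/coordination.xml >/dev/null <<'EOF'
<clickhouse>
    <keeper_server>
        <coordination_settings>
            <!-- example override: raise the operation timeout -->
            <operation_timeout_ms>20000</operation_timeout_ms>
        </coordination_settings>
    </keeper_server>
</clickhouse>
EOF

sudo systemctl restart clickhouse-keeper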


Configuration Placement and Starting Keeper

Standalone clickhouse-keeper service

In this scenario:

  • main configuration file: /etc/clickhouse-keeper/keeper_config.xml
  • additional files: /etc/clickhouse-keeper/keeper_config.d/*.xml or *.yaml

After configuring, execute the following commands:

sudo systemctl enable clickhouse-keeper
sudo systemctl start clickhouse-keeper
sudo systemctl status clickhouse-keeper

Alternatively, you can start Keeper directly:

clickhouse-keeper --config /etc/clickhouse-keeper/keeper_config.xml
# or
clickhouse keeper --config /etc/clickhouse-keeper/keeper_config.xml
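To confirm that the service started correctly, follow its logs, either through systemd or directly in the files defined in the <logger> block (paths as configured in this guide):

# via systemd (standalone service)
sudo journalctl -u clickhouse-keeper -f

# or via the log file from the <logger> section
sudo tail -f /var/log/clickhouse-keeper/clickhouse-keeper.log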

Embedded Keeper within the clickhouse-server process

If Keeper runs as part of clickhouse-server, the <keeper_server> block is added to the server's configuration:

<clickhouse>
    <!-- ... other ClickHouse settings ... -->

    <keeper_server> ... </keeper_server>
</clickhouse>

After updating the configuration, simply restart the server:

sudo systemctl restart clickhouse-server

In this case, ClickHouse and Keeper run in a single shared process. This approach is typically reserved for small or test setups; for production, separate Keeper nodes are preferred.
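After the restart, you can confirm that the embedded Keeper accepts connections on its client port, for example with the ruok four-letter command described later in this guide:

# expects the answer "imok" (port as set in <tcp_port>)
echo ruok | nc localhost 9181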


Integrating ClickHouse Keeper with ClickHouse Server

From the perspective of ClickHouse servers, ClickHouse Keeper appears as a ZooKeeper-compatible service. On each ClickHouse node, the Keeper endpoints must be specified in the <zookeeper> section:

<clickhouse>
    <!-- ... -->

    <zookeeper>
        <node>
            <host>ch-keeper-01</host>
            <port>9181</port>
        </node>
        <node>
            <host>ch-keeper-02</host>
            <port>9181</port>
        </node>
        <node>
            <host>ch-keeper-03</host>
            <port>9181</port>
        </node>
    </zookeeper>

    <!-- Macros for configuring replicated tables -->
    <macros>
        <cluster>cluster_name</cluster>
        <shard>01</shard>
        <replica>01</replica>
    </macros>
</clickhouse>

It is important that the list of nodes in <zookeeper> matches the actual Keeper cluster configuration (the same hostname and tcp_port specified in <keeper_server>).
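Once the macros are in place, the same replicated table DDL can be executed on every replica without modification. A hypothetical example (the table name and ZooKeeper path are placeholders):

clickhouse-client --query "
    CREATE TABLE default.events_replicated
    (
        ts DateTime,
        value UInt64
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/events_replicated', '{replica}')
    ORDER BY ts"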


The <listen_host> parameter and IPv6 specifics

By default, ClickHouse creates several listen_host entries that listen only on the local interface via IPv4 and IPv6 (127.0.0.1 and ::1). If connections from other hosts need to be accepted, the following line is typically added:

<listen_host>0.0.0.0</listen_host>

This is a wildcard address for IPv4 only: the server will listen on all IPv4 interfaces of the node but will not open ports for IPv6. For systems using only IPv4, this option is preferable and additionally avoids a number of IPv6-related issues.
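You can verify which addresses and ports are actually being listened on, for example:

# show listening sockets for the Keeper client and RAFT ports
sudo ss -tlnp | grep -E '9181|9234'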

IPv6 and ClickHouse Startup Issues

On some distributions, IPv6 is disabled or not configured in the kernel. In such cases, attempting to listen on the IPv6 wildcard address may lead to startup errors:

<listen_host>::</listen_host>

If the server has no actual IPv6 support, it is recommended to:

  • avoid using <listen_host>::</listen_host>

  • explicitly specify the IPv4 option:

    <listen_host>0.0.0.0</listen_host>
  • additionally, disable IPv6 in the ClickHouse/Keeper configuration if necessary:

    <enable_ipv6>false</enable_ipv6>

This will force ClickHouse to listen only on IPv4 interfaces and eliminate errors related to the absence of IPv6.

Simultaneous IPv4 and IPv6 Support

If IPv6 is used in the infrastructure and the server must accept connections via both protocols, you can use the IPv6 wildcard address instead of 0.0.0.0:

<listen_host>::</listen_host>

In this case, ClickHouse will open ports on all IPv6 and IPv4 interfaces (provided IPv6 is enabled in the system). This mode should only be enabled in conjunction with proper firewall configuration, network ACLs, and user access policies.


Verifying ClickHouse Keeper Functionality

Via ClickHouse

From the perspective of ClickHouse servers, the Keeper state can be checked by querying the system.zookeeper system table:

SELECT *
FROM system.zookeeper
WHERE path IN ('/', '/clickhouse');

The presence of the /clickhouse node and its service branches (e.g., /clickhouse/task_queue/ddl) indicates that Keeper is accessible and is being used for coordination.

Via the clickhouse-keeper-client utility

ClickHouse includes a console utility, clickhouse-keeper-client, which can work with Keeper via its native protocol:

clickhouse-keeper-client -h ch-keeper-01 -p 9181

In interactive mode, commands similar to those in the ZooKeeper client are available:

  • ls / — view child znodes
  • get '/clickhouse' — read node value
  • set '/clickhouse/test' 'value' — write value
  • exists '/clickhouse' — check node existence

This is a convenient way to verify Keeper availability and diagnose the znode tree contents.
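The same checks can also be scripted. Recent versions of clickhouse-keeper-client accept a query directly on the command line via the -q option (verify its availability in your version):

# list the root znodes without entering interactive mode
clickhouse-keeper-client -h ch-keeper-01 -p 9181 -q 'ls /'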

Four-letter Commands

Like ZooKeeper, ClickHouse Keeper supports a set of "four-letter" commands, which are sent to the client port via TCP, e.g., using nc:

echo mntr | nc ch-keeper-01 9181

  • the mntr command outputs metric values: node state (leader/follower), number of connections, latencies, etc.
  • the stat command shows a brief summary of the server and client state
  • the ruok command checks service availability (returns imok if operational)
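The other commands are used in the same way:

echo stat | nc ch-keeper-01 9181
echo ruok | nc ch-keeper-01 9181   # prints "imok" if the node is operational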

Operational Recommendations

  • for production clusters, a minimum of 3 ClickHouse Keeper nodes on separate hosts or containers is recommended
  • when changing the cluster topology (adding/removing Keeper nodes), carefully ensure that the uniqueness of server_id and the server_id ↔ hostname mapping are maintained across all configurations
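As a final smoke test after any topology change, the role of every node can be checked in one pass (hostnames as used in this guide; mntr reports the role in the zk_server_state field):

for host in ch-keeper-01 ch-keeper-02 ch-keeper-03; do
    printf '%s: ' "$host"
    echo mntr | nc "$host" 9181 | grep zk_server_state
done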