Configuring Smart Monitor Cluster Snapshots
Snapshots are used for backing up indexes and restoring data in case of failures.
Information
Conventions:
- OS_HOME - Smart Monitor Data Storage home directory, usually /app/opensearch/
- OS_IP - IP address of one of the OpenSearch cluster servers
- BACKUP_DIR - directory for storing snapshots
- REPO_NAME - snapshot repository name
- SNAPSHOT_POLICY_NAME - snapshot policy name
- SNAPSHOT_NAME - snapshot name
- HADOOP_HOME - Apache Hadoop home directory
- HADOOP_DATA - Hadoop data storage directory
- HADOOP_HOST - HDFS node address
- HADOOP_USER - Hadoop user
Preparing Cluster Nodes
Preparation must be performed on all nodes with the data role.
For clusters consisting of multiple nodes, it is recommended to disable allocation before preparing nodes through the developer console (Main menu - System settings - Developer console) by executing the command:
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "none"
  }
}
The same can be done from the terminal with the following command:
curl -XPUT -k -u admin "https://$OS_IP:9200/_cluster/settings?pretty" -H "Content-Type: application/json" -d '{"persistent":{"cluster.routing.allocation.enable": "none"}}'
After completing preparation of all cluster nodes, enable allocation:
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "all"
  }
}
The same can be done from the terminal with the following command:
curl -XPUT -k -u admin "https://$OS_IP:9200/_cluster/settings?pretty" -H "Content-Type: application/json" -d '{"persistent":{"cluster.routing.allocation.enable": "all"}}'
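The disable/enable pair above can also be wrapped in a small script so that allocation is always re-enabled after node preparation. A minimal sketch using only the Python standard library; the cluster address, credentials, and TLS handling are assumptions to adapt to your deployment:

```python
import json
import urllib.request

VALID_MODES = ("all", "primaries", "new_primaries", "none")

def allocation_payload(mode: str) -> bytes:
    """Build the _cluster/settings body for a given allocation mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported allocation mode: {mode}")
    return json.dumps(
        {"persistent": {"cluster.routing.allocation.enable": mode}}
    ).encode()

def set_allocation(base_url: str, mode: str) -> None:
    """PUT the settings to the cluster; base_url is e.g. https://OS_IP:9200."""
    req = urllib.request.Request(
        f"{base_url}/_cluster/settings",
        data=allocation_payload(mode),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)  # add basic auth / a TLS context for a real cluster
```

Call `set_allocation(base_url, "none")` before preparing nodes and `set_allocation(base_url, "all")` afterwards.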
FS Storage Type
Before configuring snapshots, create a directory where Smart Monitor will save backups. Open a terminal as the root user, then create the directory and configure its permissions:
- Create a directory for storing snapshots and grant read and write permissions to the opensearch user:
mkdir -p $BACKUP_DIR
chown -R opensearch:opensearch $BACKUP_DIR
- Edit the node configuration file $OS_HOME/opensearch.yml with any convenient editor, adding the path.repo parameter with the path to the created directory:
path.repo: ["{BACKUP_DIR}"]
It is not recommended to place path.repo on the same disk where node data is stored.
The specified path must be the same on all nodes participating in creating snapshots. If at least one node does not have access, the save operation may fail.
- Restart the node for the changes to take effect:
systemctl restart opensearch
HDFS Storage Type
To use an HDFS cluster as a snapshot repository, install the repository-hdfs plugin in OpenSearch and restart the service:
$OS_HOME/bin/opensearch-plugin install repository-hdfs
systemctl restart opensearch
Deploy Apache Hadoop on the node selected as the storage:
- Download the Apache Hadoop archive using the wget command:
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.4.2/hadoop-3.4.2.tar.gz
- Create the folders $HADOOP_HOME and $HADOOP_DATA, extract the archive, and grant permissions to the user $HADOOP_USER:
mkdir -p $HADOOP_HOME
mkdir -p $HADOOP_DATA
tar -xzf hadoop-3.4.2.tar.gz -C $HADOOP_HOME --strip-components 1
chown -R $HADOOP_USER:$HADOOP_USER $HADOOP_HOME
chown -R $HADOOP_USER:$HADOOP_USER $HADOOP_DATA
- To run Apache Hadoop, Java 8 is recommended. Install openjdk-8-jdk:
apt update
apt install openjdk-8-jdk
HDFS Configuration
All subsequent operations must be performed under the $HADOOP_USER account.
- Configure the environment file $HADOOP_HOME/etc/hadoop/hadoop-env.sh with any convenient editor: uncomment the line # export JAVA_HOME= and add the path to Java 8:
export JAVA_HOME={JAVA_HOME}
- Edit the file $HADOOP_HOME/etc/hadoop/core-site.xml, adding the following setting:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://<HADOOP_HOST>:<PORT></value>
  </property>
</configuration>
The port must be either 9000 or 8020. Hadoop operation is not guaranteed with other ports.
- Next, open the file $HADOOP_HOME/etc/hadoop/hdfs-site.xml and add the replication setting and paths for data storage:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>{HADOOP_DATA}/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>{HADOOP_DATA}/datanode</value>
  </property>
</configuration>
- For Hadoop to work, it must be able to connect over SSH to localhost without a password prompt. Check the connection with the following command:
ssh localhost
If a password is requested when connecting, use the following commands:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
- Format the storage before starting work:
$HADOOP_HOME/bin/hdfs namenode -format
- Configure environment variables for $HADOOP_USER in the file /etc/profile.d/hadoop.sh:
export HADOOP_HOME={HADOOP_HOME}
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HDFS_NAMENODE_USER=$HADOOP_USER
export HDFS_DATANODE_USER=$HADOOP_USER
export HDFS_SECONDARYNAMENODE_USER=$HADOOP_USER
- Start DFS with the start-dfs.sh script:
$HADOOP_HOME/sbin/start-dfs.sh
Check the startup with the jps command: if NameNode, DataNode, and SecondaryNameNode appear in the output, the startup was successful. The web manager is available at http://<HADOOP_HOST>:9870; the file system can be browsed under Utilities -> Browse the file system.
- Create a folder for storing snapshots in HDFS (the path /{REPO_NAME}/{CLUSTER_NAME} is chosen as an example) and grant access rights to the opensearch user:
hdfs dfs -mkdir -p /{REPO_NAME}/{CLUSTER_NAME}
hdfs dfs -chown -R opensearch:supergroup /{REPO_NAME}
Preparing Snapshot Repository
You need to create a snapshot repository with the necessary parameters. FS, S3, HDFS and other types of storage are supported.
FS Repository
PUT /_snapshot/{REPO_NAME}
{
  "type": "fs",
  "settings": {
    "location": "{BACKUP_DIR}"
  }
}
Table of all FS snapshot repository parameters:
| Parameter | Purpose |
|---|---|
| location | Directory for storing snapshots. Required parameter. |
| chunk_size | Splits large files into fragments when creating snapshots (64 MB, 1 GB, etc.). Default - 1gb. Optional parameter. |
| compress | Boolean. Whether to compress metadata files. Default - false. Optional parameter. |
| max_restore_bytes_per_sec | Maximum snapshot restore speed. Default - 40 MB/s. Optional parameter. |
| max_snapshot_bytes_per_sec | Maximum snapshot creation speed. Default - 40 MB/s. Optional parameter. |
| remote_store_index_shallow_copy | Boolean. Whether index snapshots are written as shallow copies. Default - false. Optional parameter. |
| shallow_snapshot_v2 | Boolean. Whether index snapshots are written as version 2 shallow copies. Default - false. Optional parameter. |
| readonly | Boolean. Whether the repository is read-only. Default - false. Optional parameter. |
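When repositories are registered from scripts rather than the console, the request body above can be assembled programmatically. A sketch with placeholder values; optional settings from the table are passed through as keyword arguments:

```python
import json

def fs_repo_body(location: str, **settings) -> str:
    """Build the PUT _snapshot/{REPO_NAME} body for an fs-type repository.

    Optional settings (chunk_size, compress, readonly, ...) are merged in.
    """
    return json.dumps({"type": "fs", "settings": {"location": location, **settings}})
```

For example, `fs_repo_body("/backup/snapshots", compress=True, chunk_size="1gb")` produces a body with both optional parameters set.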
HDFS Repository
PUT _snapshot/{REPO_NAME}
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://<HADOOP_HOST>:<PORT>/",
    "path": "/{REPO_NAME}/{CLUSTER_NAME}"
  }
}
Table of HDFS snapshot repository parameters:
| Parameter | Purpose |
|---|---|
| uri | URI of the HDFS cluster, e.g. hdfs://<HADOOP_HOST>:<PORT>/. Required parameter. |
| path | Path in the HDFS file system where snapshots will be stored. Required parameter. |
| load_defaults | Whether to load default Hadoop configurations from the classpath. Default - true. Optional parameter. |
| conf.<key> | Allows passing any specific Hadoop settings. The full list is in the core and hdfs parameter lists. Optional parameter. |
| compress | Whether to compress metadata files. Default - true. Optional parameter. |
| readonly | Make the repository read-only. Optional parameter. |
Using HDFS as snapshot storage allows any cluster to use the stored snapshots as a recovery point. To do this, add a repository with any name different from {REPO_NAME}, with the readonly parameter set to prevent accidental writes:
PUT _snapshot/{SEARCHABLE_REPO_NAME}
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://<HADOOP_HOST>:<PORT>/",
    "path": "/{REPO_NAME}/{SEARCHABLE_CLUSTER_NAME}",
    "readonly": true
  }
}
Setting Up Automatic Snapshots
To automate cluster snapshot saving, you need to create a snapshot policy through the developer console (Navigation menu - System settings - Developer console) by executing a command with the necessary parameters:
POST _plugins/_sm/policies/{SNAPSHOT_POLICY_NAME}
{
  "name": "snapshot-daily-{{date}}",
  "description": "Daily snapshot policy",
  "creation": {
    "schedule": {
      "cron": {
        "expression": "0 8 * * *",
        "timezone": "UTC"
      }
    },
    "time_limit": "1h"
  },
  "deletion": {
    "schedule": {
      "cron": {
        "expression": "0 8 * * *",
        "timezone": "UTC"
      }
    },
    "condition": {
      "max_age": "7d",
      "max_count": 50,
      "min_count": 30
    },
    "time_limit": "1h"
  },
  "snapshot_config": {
    "date_format": "yyyy-MM-dd-HH:mm",
    "timezone": "UTC",
    "indices": [".*"],
    "repository": "{REPO_NAME}",
    "ignore_unavailable": true,
    "include_global_state": false,
    "partial": true,
    "metadata": {
      "any_key": "any_value"
    }
  }
}
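Snapshots created by the policy are named {SNAPSHOT_POLICY_NAME}-<date>-<random number>, with the date rendered using date_format. A sketch of how such a name is composed; the Java-pattern-to-strftime mapping below covers only the tokens used in this example and is an assumption for illustration:

```python
from datetime import datetime, timezone

# Minimal Java-date-pattern to strftime mapping for the tokens used here only.
_JAVA_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H", "mm": "%M"}

def snapshot_name(policy: str, date_format: str, when: datetime, suffix: str) -> str:
    """Compose {policy}-<formatted date>-<suffix>, e.g. for log inspection."""
    fmt = date_format
    for java, py in _JAVA_TO_STRFTIME.items():
        fmt = fmt.replace(java, py)
    return f"{policy}-{when.strftime(fmt)}-{suffix}"
```

With date_format "yyyy-MM-dd-HH:mm" this reproduces names like the daily-snapshot-sm-policy-2025-04-21-14:30-bu2lfnek example shown later in this document.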
Table of snapshot policy parameters:
| Parameter | Type | Description |
|---|---|---|
| description | String | Description of the snapshot policy. Optional parameter. |
| enabled | Boolean | Whether the policy is enabled on creation. Optional parameter. |
| snapshot_config | Object | Snapshot configuration (what to snapshot and how). Required parameter. |
| snapshot_config.date_format | String | Snapshot names have the format {SNAPSHOT_POLICY_NAME}-<date>-<random number>; date_format defines the date format in snapshot names. Optional parameter. Default - yyyy-MM-dd'T'HH:mm:ss. |
| snapshot_config.date_format_timezone | String | Timezone for the date in snapshot names. Optional parameter. Default - UTC. |
| snapshot_config.indices | String | Pattern of indexes saved in snapshots. Default - * (all indexes). |
| snapshot_config.repository | String | Repository where snapshots will be saved. Required parameter. |
| snapshot_config.ignore_unavailable | Boolean | Whether to ignore unavailable indexes. Optional parameter. Default - false. |
| snapshot_config.include_global_state | Boolean | Whether to include cluster state in the snapshot. Optional parameter. Default - true. |
| snapshot_config.partial | Boolean | Whether an incomplete (partial) snapshot may be created. Optional parameter. Default - false. |
| snapshot_config.metadata | Object | Metadata in key/value format. Optional parameter. |
| creation | Object | Snapshot creation settings. Required parameter. |
| creation.schedule | String | Snapshot creation schedule in cron format. Required parameter. |
| creation.time_limit | String | Maximum waiting time for snapshot creation to complete. If time_limit is longer than the creation schedule interval, the next snapshot is not created until time_limit expires. Optional parameter. |
| deletion | Object | Snapshot deletion settings. Optional parameter. By default all snapshots are kept. |
| deletion.schedule | String | Snapshot deletion schedule in cron format. Optional parameter. By default uses the creation.schedule settings. |
| deletion.time_limit | String | Maximum waiting time for snapshot deletion to complete. Optional parameter. |
| deletion.condition | Object | Snapshot deletion conditions. Optional parameter. |
| deletion.condition.max_count | Integer | Maximum number of stored snapshots. Optional parameter. |
| deletion.condition.max_age | String | Maximum time snapshots are stored. Optional parameter. |
| deletion.condition.min_count | Integer | Minimum number of stored snapshots. Optional parameter. Default - 1. |
| notification | Object | Notification settings for snapshot policy events (requires a configured OpenSearch notification channel). Optional parameter. |
| notification.channel | Object | Defines the notification channel. Required parameter. |
| notification.channel.id | String | Notification channel ID. Required parameter. |
| notification.conditions | Object | Snapshot policy events that trigger notifications; set the corresponding value to true. |
| notification.conditions.creation | Boolean | Notify about snapshot creation. Optional parameter. Default - true. |
| notification.conditions.deletion | Boolean | Notify about snapshot deletion. Optional parameter. Default - false. |
| notification.conditions.failure | Boolean | Notify about snapshot creation or deletion errors. Optional parameter. Default - false. |
| notification.conditions.time_limit_exceeded | Boolean | Notify when time_limit is exceeded for snapshot operations. Optional parameter. Default - false. |
Snapshots are incremental: segments already stored in the repository are not saved again.
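The deletion condition fields interact: max_age marks snapshots as expired, max_count caps the total, and min_count is never violated. A sketch of that interaction under assumed semantics (the plugin's exact ordering may differ):

```python
def snapshots_to_delete(ages_days, max_age_days=7, max_count=50, min_count=30):
    """Given snapshot ages in days (newest first), return indices to delete.

    Oldest snapshots go first; anything over max_count or older than
    max_age_days is removed, but never below min_count snapshots.
    """
    n = len(ages_days)
    delete = set()
    # Walk from the oldest snapshot toward the newest.
    for i in range(n - 1, -1, -1):
        remaining = n - len(delete)
        if remaining <= min_count:
            break
        if remaining > max_count or ages_days[i] > max_age_days:
            delete.add(i)
    return sorted(delete)
```

With 40 snapshots and the policy values from the example (max_age 7d, min_count 30), only snapshots older than 7 days are candidates, yet the 30 newest always survive.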
Manual Snapshot Creation
To create a one-time snapshot, execute the command through the developer console:
PUT _snapshot/{REPO_NAME}/{SNAPSHOT_NAME}
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": false
}
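The same one-time snapshot body can be built in a script; a small sketch, with defaults matching the example above:

```python
import json

def snapshot_body(indices="*", ignore_unavailable=True, include_global_state=False):
    """Body for PUT _snapshot/{REPO_NAME}/{SNAPSHOT_NAME}."""
    return json.dumps({
        "indices": indices,
        "ignore_unavailable": ignore_unavailable,
        "include_global_state": include_global_state,
    })
```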
Restoring from Snapshot
If you plan to restore data on a target cluster from another cluster's snapshot, make sure the OpenSearch versions on the two clusters match.
- View available snapshots through the developer console with the command:
GET _snapshot/{REPO_NAME}/_all
As a result of executing the command, a list of all created snapshots will be displayed in the following format, where the snapshot field is the snapshot name (SNAPSHOT_NAME):
{
  "snapshots": [
    {
      "snapshot": "daily-snapshot-sm-policy-2025-04-21-14:30-bu2lfnek",
      "uuid": "iqHGMwR5T6yV-tInX3a5KQ",
      "version_id": 136397827,
      "version": "2.18.0",
      "remote_store_index_shallow_copy": false,
      "indices": [
        "index1",
        "index2"
      ],
      "data_streams": [],
      "include_global_state": true,
      "metadata": {
        "sm_policy": "daily-snapshot-sm-policy"
      },
      "state": "SUCCESS",
      "start_time": "2025-04-21T14:30:02.327Z",
      "start_time_in_millis": 1745245802327,
      "end_time": "2025-04-21T14:31:09.966Z",
      "end_time_in_millis": 1745245869966,
      "duration_in_millis": 67639,
      "failures": [],
      "shards": {
        "total": 2,
        "failed": 0,
        "successful": 2
      }
    }
  ]
}
- Restore the desired snapshot through the developer console with the command:
POST _snapshot/{REPO_NAME}/{SNAPSHOT_NAME}/_restore
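When restores are scripted, the listing from the first step can be filtered for the most recent successful snapshot before issuing the _restore call. A sketch assuming the response shape shown above:

```python
from typing import Optional

def latest_successful(response: dict) -> Optional[str]:
    """Name of the most recent snapshot whose state is SUCCESS, else None."""
    ok = [s for s in response.get("snapshots", []) if s.get("state") == "SUCCESS"]
    if not ok:
        return None
    return max(ok, key=lambda s: s["end_time_in_millis"])["snapshot"]
```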
Deleting Snapshot
- View available snapshots through the developer console with the command:
GET _snapshot/{REPO_NAME}/_all
As a result of executing the command, a list of all created snapshots will be displayed in the format specified in the section Restoring from Snapshot.
- Delete the desired snapshot through the developer console with the command:
DELETE _snapshot/{REPO_NAME}/{SNAPSHOT_NAME}