One of our customers configured OpenShift's log store to send a copy of its monitoring data to an external Elasticsearch cluster. After an incident in this customer's environment caused part of the data on the external Elasticsearch cluster to be lost, we needed a way to copy the missing data back through a backup and restore process.
There are several ways to make backups, as well as to transfer data directly from one Elasticsearch cluster to another, which gives good flexibility to meet different scenarios. We will look at some of them in detail in this article.
Missing Data on External Elasticsearch Cluster
Requirements
A route to the internal OCP Elasticsearch (see the OpenShift documentation on exposing the log store service as a route).
The following package must be installed on one of the external Elasticsearch hosts:
- Node.js Elasticsearch Dump (elasticdump).
NOTE: In this article, the host selected was the Kibana host, as it already has the Node.js packages installed.
Assumptions
The acronym "ES" means "Elasticsearch".
For ease of understanding, assume the following URLs as a base for the example environment:
- OpenShift Logging: https://elasticsearch-openshift-logging.apps.homelab.rhbrlabs.com
- External ES: http://es.rhbrlabs.com:9202/
Access Token
We will need a user token with access to Elasticsearch from OpenShift.
OCP Bastion
The following procedures must be run from within the OpenShift Bastion.
Get Internal ES IP
Internally, you can access the log store service using the log store cluster IP, which you can get by using either of the following commands:
[root@bastion ~]# echo $(oc get service elasticsearch -o jsonpath={.spec.clusterIP} -n openshift-logging)
172.31.58.140
Or:
[root@bastion ~]# oc get service elasticsearch -n openshift-logging
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch ClusterIP 172.31.58.140 <none> 9200/TCP 15d
Expose Log Store
To expose the log store externally, first switch to the "openshift-logging" project:
[root@bastion ~]# oc project openshift-logging
Extract the CA certificate from the log store and write it to the admin-ca file:
[root@bastion ~]# oc extract secret/elasticsearch --to=. --keys=admin-ca
admin-ca
Route for the Log Store Service
Create a YAML internal-es-route.yaml file with the following content:
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: elasticsearch
  namespace: openshift-logging
spec:
  host:
  to:
    kind: Service
    name: elasticsearch
  tls:
    termination: reencrypt
    destinationCACertificate: |
Attention: the file should end with the "|" character, and preserve the indentation.
Now, run the following command to add the log store CA certificate to the route YAML you created in the previous step:
[root@bastion ~]# cat ./admin-ca | sed -e "s/^/      /" >> internal-es-route.yaml
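To see what this step does, here is a toy, self-contained illustration of the indentation logic (the file names below are made up for the demonstration; they are not the real admin-ca or route files):

```shell
# Toy demo: each line appended under "destinationCACertificate: |" must be
# indented so YAML treats it as part of the block scalar.
printf 'LINE1\nLINE2\n' > /tmp/demo-ca
printf 'destinationCACertificate: |\n' > /tmp/demo-route.yaml
cat /tmp/demo-ca | sed -e "s/^/      /" >> /tmp/demo-route.yaml
cat /tmp/demo-route.yaml
```

The sed expression prefixes every certificate line with spaces, which is what keeps the appended PEM block nested under the YAML key.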
Check if the file is similar to this:
[root@bastion ~]# cat internal-es-route.yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: elasticsearch
  namespace: openshift-logging
spec:
  host:
  to:
    kind: Service
    name: elasticsearch
  tls:
    termination: reencrypt
    destinationCACertificate: |
      -----BEGIN CERTIFICATE-----
      MIIFNzCCAx+gAwIBAgIUTXUBAGG84VFHUe3/o7tY5z3T9ZowDQYJKoZIhvcNAQEL
      BQAwKzEpMCcGA1UEAwwgb3BlbnNoaWZ0LWNsdXN0ZXItbG9nZ2luZy1zaWduZXIw
      HhcNMjIwMzA3MjAxMzM4WhcNMjcwMzA2MjAxMzM4WjArMSkwJwYDVQQDDCBvcGVu
      c2hpZnQtY2x1c3Rlci1sb2dnaW5nLXNpZ25lcjCCAiIwDQYJKoZIhvcNAQEBBQAD
      (...)
      RJm3HFBqgu4zNf+dReKiJBZqdTaVFRJqDgRwWX7vA31S7DTadPM6VcPxm0YxqK++
      7dAEfqVkrD3bj46324AwUXCExIKvR/vRd20y1PD2gaONkDssaebfCTHi8MP17GcE
      cDGmWbKqHuSQLwCbk0ogVSwNFOdqsMOS5rYvdalIHE2l+DOFeuo6OM6/zsE/1hTD
      DZt6md8mkvXmUpK34Wtl46utmguv6fBZ6hb3O+NMOe8zOPa8GV/HU5E5Ew==
      -----END CERTIFICATE-----
Create the route:
[root@bastion ~]# oc create -f internal-es-route.yaml
route.route.openshift.io/elasticsearch created
Get the user token:
[root@bastion ~]# token=$(oc whoami -t)
[root@bastion ~]# echo $token
sha256~0JtosvhtA7YwbTx-UPlhRknHLydgP1Iov2YIVN_duCw
Tip: Alternatively, to use the elasticsearch service account (SA) token:
$ oc sa get-token elasticsearch
Set the elasticsearch route you created as an environment variable.
[root@bastion ~]# routeES=$(oc get route elasticsearch -o jsonpath={.spec.host})
[root@bastion ~]# echo $routeES
elasticsearch-openshift-logging.apps.homelab.rhbrlabs.com
To verify the route was successfully created, run the following command that accesses Elasticsearch through the exposed route:
[root@bastion ~]# curl --tlsv1.2 --insecure -H "Authorization: Bearer ${token}" "https://${routeES}"
{
"name" : "elasticsearch-cdm-1tsq0edh-2",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "muRod1gOSlKkVPkj6QSqFA",
"version" : {
"number" : "6.8.1",
"build_flavor" : "oss",
"build_type" : "zip",
"build_hash" : "db90ff8",
"build_date" : "2022-02-02T20:21:15.875200Z",
"build_snapshot" : false,
"lucene_version" : "7.7.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
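Since copies between clusters work best when both sides run compatible versions, it can be handy to pull the version number out of this response. A small sketch, using a saved sample of the response in place of the live curl call:

```shell
# Sample response standing in for: curl ... "https://${routeES}"
RESPONSE='{"name":"elasticsearch-cdm-1tsq0edh-2","version":{"number":"6.8.1"}}'
# Extract the "number" field (a jq-free approach using grep and cut):
VERSION=$(echo "$RESPONSE" | grep -o '"number":"[^"]*"' | cut -d '"' -f4)
echo "$VERSION"
```

With the real response, the same pipeline prints the cluster's Elasticsearch version, here 6.8.1.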
External ES Kibana
The following procedures must be run from within the External Elasticsearch Kibana.
Check Access to OCP's ES
From the external ES Kibana host, declare a variable containing the token:
[root@kibana ~]# token=sha256~0JtosvhtA7YwbTx-UPlhRknHLydgP1Iov2YIVN_duCw
Declare a variable containing the OCP ES Route Name:
[root@kibana ~]# routeES='elasticsearch-openshift-logging.apps.homelab.rhbrlabs.com'
Check the communication between External ES and Internal OCP's ES:
[root@kibana ~]# curl --tlsv1.2 --insecure -H "Authorization: Bearer ${token}" "https://${routeES}"
{
"name" : "elasticsearch-cdm-1tsq0edh-1",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "muRod1gOSlKkVPkj6QSqFA",
"version" : {
"number" : "6.8.1",
"build_flavor" : "oss",
"build_type" : "zip",
"build_hash" : "db90ff8",
"build_date" : "2022-02-02T20:21:15.875200Z",
"build_snapshot" : false,
"lucene_version" : "7.7.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
OK. Now you have access to OCP's internal ES. Proceed to the next part.
Install elasticdump
On the external Kibana host, use npm to install elasticdump:
[root@kibana ~]# npm install elasticdump
npm WARN deprecated request@2.88.2: request has been deprecated, see https://github.com/request/request/issues/3142
npm WARN deprecated querystring@0.2.0: The querystring API is considered Legacy. new code should use the URLSearchParams API instead.
(...)
npm WARN root No description
npm WARN root No repository field.
npm WARN root No README data
npm WARN root No license field.
+ elasticdump@6.82.1
added 111 packages from 194 contributors and audited 111 packages in 13.671s
(...)
Data Copy
The following procedures must be run from within the External Elasticsearch Kibana.
Index List
Get a list of available indexes:
[root@kibana ~]# curl -XGET --tlsv1.2 --insecure -H "Authorization: Bearer ${token}" "https://${routeES}/_aliases?pretty"
{
".kibana_1" : {
"aliases" : {
".kibana" : { }
}
},
".kibana_92668751_admin_1" : {
"aliases" : {
".kibana_92668751_admin" : { }
}
},
".security" : {
"aliases" : { }
},
"app-000001" : {
"aliases" : {
".all" : { },
"app" : { },
"app-write" : {
"is_write_index" : false
},
"logs.app" : { }
}
},
"app-000002" : {
"aliases" : {
".all" : { },
"app" : { },
"app-write" : {
"is_write_index" : false
},
"logs.app" : { }
(...)
Let's copy a small portion of the application data from OCP's ES to a file, using the app-000001 index as an example.
- We will need at least the analyzer, mapping, and data information.
Perform a Backup
In this example, we will show you how to back up the analyzer, mapping, and data types.
Note: Available index data types:
- settings
- analyzer
- data
- mapping
- policy
- alias
- template
- component_template
- index_template
Analyzer Backup
[root@kibana ~]# NODE_TLS_REJECT_UNAUTHORIZED=0 /root/node_modules/elasticdump/bin/elasticdump --headers='{"authorization": "Bearer sha256~0JtosvhtA7YwbTx-UPlhRknHLydgP1Iov2YIVN_duCw"}' --input=https://${routeES}/app-000001 --output=analyzer.json --type=analyzer
If everything has been set up correctly, you will see messages similar to these:
Wed, 23 Mar 2022 19:56:35 GMT | starting dump
Wed, 23 Mar 2022 19:56:35 GMT | got 1 objects from source elasticsearch (offset: 0)
Wed, 23 Mar 2022 19:56:35 GMT | sent 1 objects to destination file, wrote 1
Wed, 23 Mar 2022 19:56:35 GMT | got 0 objects from source elasticsearch (offset: 1)
Wed, 23 Mar 2022 19:56:35 GMT | Total Writes: 1
Wed, 23 Mar 2022 19:56:35 GMT | dump complete
Mapping Backup
[root@kibana ~]# NODE_TLS_REJECT_UNAUTHORIZED=0 /root/node_modules/elasticdump/bin/elasticdump --headers='{"authorization": "Bearer sha256~0JtosvhtA7YwbTx-UPlhRknHLydgP1Iov2YIVN_duCw"}' --input=https://${routeES}/app-000001 --output=mapping.json --type=mapping
If everything has been set up correctly, you will see messages similar to these:
Wed, 23 Mar 2022 20:00:12 GMT | starting dump
Wed, 23 Mar 2022 20:00:12 GMT | got 1 objects from source elasticsearch (offset: 0)
Wed, 23 Mar 2022 20:00:12 GMT | sent 1 objects to destination file, wrote 1
Wed, 23 Mar 2022 20:00:12 GMT | got 0 objects from source elasticsearch (offset: 1)
Wed, 23 Mar 2022 20:00:12 GMT | Total Writes: 1
Wed, 23 Mar 2022 20:00:12 GMT | dump complete
Data Backup
[root@kibana ~]# NODE_TLS_REJECT_UNAUTHORIZED=0 /root/node_modules/elasticdump/bin/elasticdump --headers='{"authorization": "Bearer sha256~0JtosvhtA7YwbTx-UPlhRknHLydgP1Iov2YIVN_duCw"}' --input=https://${routeES}/app-000001 --output=data.json --type=data
If everything has been set up correctly, you will see messages similar to these:
Wed, 23 Mar 2022 20:00:43 GMT | starting dump
Wed, 23 Mar 2022 20:00:43 GMT | got 0 objects from source elasticsearch (offset: 0)
Wed, 23 Mar 2022 20:00:43 GMT | Total Writes: 0
Wed, 23 Mar 2022 20:00:43 GMT | dump complete
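Each line of an elasticdump data file is one JSON document, so a quick sanity check on a finished dump is to compare the file's line count against the "Total Writes" figure in the log. A minimal local illustration (a toy file stands in for a real dump):

```shell
# Toy dump file: two documents, one JSON object per line.
printf '{"_source":{"msg":"a"}}\n{"_source":{"msg":"b"}}\n' > /tmp/demo-dump.json
# The line count should equal the "Total Writes" reported by elasticdump.
wc -l < /tmp/demo-dump.json
```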
Direct Copy
In this step we will do an inline data copy directly from internal OCP's ES to External ES. Let's assume we want to copy the entire app-write index to a newly deployed cluster, or we need to update missing data on the external ES Cluster.
TIP: If you need to copy specific indexes, generate a list and copy them individually. To generate a list of index names, run the following command:
# INDEXLIST=$(curl -s -XGET --tlsv1.2 --insecure -H "Authorization: Bearer ${token}" "https://${routeES}/_aliases?pretty" | grep app- | cut -d ":" -f1 | grep -v app-write | tr -d "\"")
# echo $INDEXLIST
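The extraction above can be exercised locally. In the sketch below, a small sample of the _aliases output stands in for the live curl call, and the loop shows how each extracted index name would then be handled individually (the echo is only a placeholder for the elasticdump invocation used elsewhere in this article):

```shell
# Sample lines from the _aliases output (stand-in for the live curl call):
cat > /tmp/aliases.json <<'EOF'
  "app-000001" : {
  "app-000002" : {
  "app-write" : {
EOF
# Same pipeline as above, applied to the sample:
INDEXLIST=$(grep app- /tmp/aliases.json | cut -d ":" -f1 | grep -v app-write | tr -d "\"")
for idx in $INDEXLIST; do
  # Placeholder: run one elasticdump per index here.
  echo "would copy index: $idx"
done
```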
The app-write index has the following attributes:
- settings
- mappings
- data
Attention: You will need to check your index attributes to be able to do a proper copy.
- In our example, all types have been included to make the process easier.
To perform a direct data copy from OCP's ES to External ES:
[root@kibana ~]# NODE_TLS_REJECT_UNAUTHORIZED=0 /root/node_modules/elasticdump/bin/elasticdump --headers='{"authorization": "Bearer sha256~0JtosvhtA7YwbTx-UPlhRknHLydgP1Iov2YIVN_duCw"}' --input=https://${routeES}/app-write --output=http://es.rhbrlabs.com:9202/ --includeType settings,analyzer,data,mapping,policy,alias,template,component_template,index_template
You should see messages like these, indicating that the copy is working:
(...)
Thu, 24 Mar 2022 00:29:07 GMT | sent 100 objects to destination elasticsearch, wrote 100
Thu, 24 Mar 2022 00:29:07 GMT | got 100 objects from source elasticsearch (offset: 568700)
Thu, 24 Mar 2022 00:29:07 GMT | sent 100 objects to destination elasticsearch, wrote 100
Thu, 24 Mar 2022 00:29:07 GMT | got 100 objects from source elasticsearch (offset: 568800)
Thu, 24 Mar 2022 00:29:08 GMT | sent 100 objects to destination elasticsearch, wrote 100
Thu, 24 Mar 2022 00:29:08 GMT | got 100 objects from source elasticsearch (offset: 568900)
(...)
At the end of the copy, we can see the updated data on the External ES cluster.
Updated Data on External ES Cluster
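A hypothetical way to confirm the copy is complete is to compare document counts on both sides with Elasticsearch's _count API. The sample responses below stand in for the two live curl calls (one against https://${routeES}/app-write/_count, one against the external cluster):

```shell
# Sample _count responses standing in for the two live calls:
SRC='{"count":568900,"_shards":{"total":3}}'
DST='{"count":568900,"_shards":{"total":3}}'
src_count=$(echo "$SRC" | grep -o '"count":[0-9]*' | cut -d ':' -f2)
dst_count=$(echo "$DST" | grep -o '"count":[0-9]*' | cut -d ':' -f2)
[ "$src_count" = "$dst_count" ] && echo "counts match: $src_count"
```

If the counts differ, the copy is still running or some documents were not transferred.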
- IMPORTANT: When backing up very large indexes that take hours to complete, we recommend using tmux to prevent an SSH session disconnect from interrupting the copy.
TIP: To select a specific time range, filters can be used. Look at the examples below.
Example 1:
--searchBody '{"query":{"bool":{"must":[{"range":{"@timestamp":{"from":"2022-03-16","to":"2022-03-21"}}}],"filter":[{"match_all":{}}],"should":[],"must_not":[]}}}'
Example 2:
--searchBody '{"query":{"bool":{"must":[{"range":{"@timestamp":{"gte":"now-1d/d","lte":"now/d"}}}],"filter":[{"match_all":{}}],"should":[],"must_not":[]}}}'
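Because a malformed --searchBody makes the dump fail, it is worth validating the JSON before launching a long-running copy. A small sketch, assuming python3 is available on the host:

```shell
# The same range filter as Example 1, validated before use:
QUERY='{"query":{"bool":{"must":[{"range":{"@timestamp":{"from":"2022-03-16","to":"2022-03-21"}}}],"filter":[{"match_all":{}}],"should":[],"must_not":[]}}}'
echo "$QUERY" | python3 -m json.tool > /dev/null && echo "searchBody is valid JSON"
```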
Parallel Backups
This mode executes parallel processes in order to speed up the backup. If not specified, the number of parallel processes is automatically set to the number of available CPU cores.
- This backup mode cannot be used for an inline copy between two ES clusters.
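To see how many parallel processes the default will give you on a given host, check its CPU core count:

```shell
# The default degree of parallelism follows the number of CPU cores:
nproc
```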
In the following example, we will do a parallel backup of the app-infra index:
[root@kibana ~]# NODE_TLS_REJECT_UNAUTHORIZED=0 /root/node_modules/elasticdump/bin/multielasticdump --headers='{"authorization": "Bearer sha256~0JtosvhtA7YwbTx-UPlhRknHLydgP1Iov2YIVN_duCw"}' --input=https://${routeES}/app-infra --output=/destination/backup --includeType settings,analyzer,data,mapping,policy,alias,template,component_template,index_template
Finishing
In this article, we saw how easy it is to copy data stored in OpenShift's Elasticsearch to another cluster, and how to perform backup dumps from the Elasticsearch database.
As a last recommendation, we suggest adopting security measures to control access to the route created for the OCP's internal ES.