Taking Elasticsearch Snapshots Using Curator
This tutorial on taking Elasticsearch snapshots using Curator will be divided into sections. One obvious section is how to take snapshots. Another, less obvious part will be on configuring a shared directory using the Network File System (NFS) on Linux. I will be using a RHEL 7 based cluster of three machines for this tutorial. Once you are done with the basics I outline here, you should start using Curator to manage your aliases, as my next post details.
As usual I will start with WHY followed by HOW.
WHY
You want to take backups. If you are running an ELK stack then sooner or later you will have old logs which you want to archive to free up space on your cluster. When you upgrade your cluster you have to take snapshots before doing anything. And there is always that hardware failure scenario.
HOW
You can take Elasticsearch snapshots in many ways. The simplest is via curl commands. But it is better to use the tool provided by Elastic. It is called ... drum roll please ... Curator.
Steps to install curator on a RHEL/CentOS machine
Some housekeeping work first. Since Elasticsearch is evolving rapidly, you should check the latest instructions here.
sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
Create a curator.repo file
sudo vi /etc/yum.repos.d/curator.repo
and put this content in it
[curator-5]
name=CentOS/RHEL 7 repository for Elasticsearch Curator 5.x packages
baseurl=http://packages.elastic.co/curator/5/centos/7
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
Actual installation
sudo yum install elasticsearch-curator
Unlike many other tools, it does not come with a config file already prepared and ready for you to change.
So you have to create a config file curator.yml yourself. A great starting template is located here. Just change the hosts and the port and you will be good to go. If Curator is running on the same machine as Elasticsearch, you really can use this file as it is.
To make life easier, store this file in the ~/.curator location. Otherwise you have to pass the file location using the --config option every time you run the tool to take Elasticsearch snapshots. And who wants to do that? Not me.
So create a directory
Create a file curator.yml in it.
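A minimal way to do both from the shell (assuming you run Curator as the user you are logged in as):

mkdir -p ~/.curator
vi ~/.curator/curator.yml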
Put this into the file.
# Remember, leave a key empty if there is no value.  None will be a string,
## not a Python "NoneType"
client:
  hosts: <your machine ip address>
  port: <port number>
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile: /home/elastic/logs
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
For dumping the Curator logs you need to have a folder, hence the /home/elastic/logs folder which you see in the config above.
Create the folder and give it the necessary permissions. Though I am logged in as the elastic user (belonging to the group elk) I am doing this explicitly. The most common problem while setting up Elasticsearch snapshots is permissions on folders, hence making it very visible here.
mkdir -p /home/elastic/logs
cd /home/elastic
sudo chown -R elastic:elk logs
Now check if everything is working fine.
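One simple sanity check is a cluster health call against the host and port you put in curator.yml; the address below is just the placeholder from the config above:

curl -XGET 'http://<your machine ip address>:<port number>/_cluster/health?pretty'

If this returns a green or yellow status, Curator should be able to reach the cluster as well.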
Once things are working fine then it's time to do something useful. Curator can perform a variety of tasks on your indices. These are called Actions and the full list is here.
You pass Curator the actions via an action file. You need to pass the location of the file on the command line. One nice thing about the tool is the dry-run option which allows you to do a test run safely without actually changing anything in the cluster.
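For example, once the action file we build later in this post exists, a dry run looks like this; since the config lives in ~/.curator/curator.yml no --config flag is needed, otherwise add --config /path/to/curator.yml:

curator --dry-run action_snapshot.yml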
You can see a whole list of sample action files at this location.
I will pick up the snapshot action file and change it to suit my needs. Then I can take a snapshot of a given series of indices. Elasticsearch snapshots … here I come.
Tedious networking stuff
You have to create a shared directory for the nodes first. And this is the hard part where a lot of networking issues can trip you up. The short story is that you need a location which is visible to all the nodes in the cluster. And these nodes should have read/write permissions on that shared location. I can't stress this point enough. The idea is that when the snapshot command is issued, all nodes start dumping the data from the parts of the index they hold onto the shared location. If you already have this sorted out then skip this section by clicking here.
I will use the Network File System on RHEL 7 to create a shared directory. Then I will create a folder on each node. The folder name and path will be the same on every node. And I will mount the shared directory on each node at that particular folder.
Installing the needed software (I will keep this brief. I used this site. In case anything fails, refer to that one or Google).
On the machine where the NFS server will be running and the shared folder will be located.
sudo yum install nfs-utils libnfsidmap
sudo systemctl enable rpcbind
sudo systemctl enable nfs-server
sudo systemctl start rpcbind
sudo systemctl start nfs-server
sudo systemctl start rpc-statd
sudo systemctl start nfs-idmapd
Now you create the directory to share with the clients.
mkdir /home/elastic/backups
sudo chown -R elastic:elk /home/elastic/backups
Modify /etc/exports
and put this in it.
/home/elastic/backups *(rw,sync,no_root_squash)
Then export the shared directory.
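On RHEL 7 this is typically done with exportfs; -a exports everything listed in /etc/exports and -v shows what is currently exported:

sudo exportfs -a
sudo exportfs -v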
On the cluster nodes, some installation and configuration is needed too.
Let us create a folder on each of the nodes.
mkdir -p /home/elastic/mount/backups
sudo chown -R elastic:elk /home/elastic/mount/backups
We will mount the shared directory at this location.
NFS related software installs
sudo yum -y install nfs-utils libnfsidmap
sudo systemctl enable rpcbind
sudo systemctl start rpcbind
Check if the exported dir is visible on the client
showmount -e ServerHostingNFS
This should show
Export list for ServerHostingNFS:
/home/elastic/backups *
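If you want to test the mount by hand once before touching /etc/fstab, something along these lines should work (ServerHostingNFS being the placeholder server name used above):

sudo mount -t nfs ServerHostingNFS:/home/elastic/backups /home/elastic/mount/backups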
You want the mounting of this shared directory to happen automatically on the clients when a reboot happens, because reboots happen.
Open /etc/fstab
and add line
ServerHostingNFS:/home/elastic/backups /home/elastic/mount/backups nfs defaults 0 0
You want to check if the auto mount is happening. One option is to reboot the machine and then check. But if you have any syntax mistakes in /etc/fstab there is a chance the machine might not boot up. This can be an issue when the machine is remote. So it is better to unmount the shared folder on the client machine and then just do a mount; the shared folder should get mounted automatically.
sudo umount /home/elastic/mount/backups
sudo mount -a
mount
You should see that the shared directory is mounted.
ServerHostingNFS:/home/elastic/backups on /home/elastic/mount/backups type nfs4 ..... blah blah
Check if the mount is writeable after the automount
touch /home/elastic/mount/backups/test
You should be able to see the file across all the nodes and inside the shared directory on the ServerHostingNFS. Try out create and delete combinations to find out if there are any permission issues. Reboot the nodes and see if the automounting is happening.
Once you have the shared directory sorted out, the rest of the stuff is actually easy.
Configuring Elasticsearch
You have to add an entry in the elasticsearch.yml
Open the file
sudo vim /etc/elasticsearch/elasticsearch.yml
and add
path.repo: ["/home/elastic/mount/backups"]
Then restart Elasticsearch on each of the nodes.
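On a RHEL 7 package install that is usually just:

sudo systemctl restart elasticsearch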
If you have any issues then Elasticsearch will refuse to start up. Go through the logs to find the issue. Most of the time it is because of permissions.
Now you have to create a repository in Elasticsearch and map it to the location where the shared file system is mounted.
Use a curl command in the Linux terminal:
curl -XPUT 'yourelasticserverip:9200/_snapshot/logs_backup' -H 'Content-Type: application/json' -d '{"type": "fs", "settings": {"location": "/home/elastic/mount/backups", "compress": true}}'
Now I have registered a repository in Elasticsearch with the name logs_backup. All the nodes will dump the data to /home/elastic/mount/backups, which actually refers to the /home/elastic/backups shared file system.
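You can verify that the repository was registered with a quick GET against the same endpoint:

curl -XGET 'yourelasticserverip:9200/_snapshot/logs_backup?pretty'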
Now we need an action file for Curator to work with. Let us call it action_snapshot.yml and put this content in it.
actions:
  1:
    action: snapshot
    description: >-
      Snapshot log-production- prefixed indices older than 1 day (based on
      index creation_date) with the default snapshot name pattern of
      'curator-%Y%m%d%H%M%S'.  Wait for the snapshot to complete.  Do not skip
      the repository filesystem access check.  Use the other options to create
      the snapshot.
    options:
      repository: logs_backup
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name: ProductionLogs-%Y%m%d%H%M%S
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log-production-
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 1
The repository option: we are passing the name of the repository we registered earlier.
The name option: this is the name of the snapshot. See how it will be appended with time information.
The value under the pattern filter: the indices it will match. Here it will pick all the indices which start with log-production-.
Now you can take Elasticsearch snapshots!! If you are a sane person you will want to do a dry run first. Not me.
curator action_snapshot.yml
Output is something like this
2017-08-02 17:08:13,897 INFO      Preparing Action ID: 1, "snapshot"
2017-08-02 17:08:13,905 INFO      Trying Action ID: 1, "snapshot": Snapshot log-production- prefixed indices older than 1 day (based on index creation_date) with the default snapshot name pattern of 'curator-%Y%m%d%H%M%S'. Wait for the snapshot to complete. Do not skip the repository filesystem access check. Use the other options to create the snapshot.
2017-08-02 17:08:13,991 INFO      Creating snapshot "ProductionLogs-20170802070813" from indices: ['log-production-2017.06', 'log-production-2017.07']
2017-08-02 17:08:14,049 INFO      Snapshot ProductionLogs-20170802070813 still in progress.
2017-08-02 17:08:23,061 INFO      Snapshot ProductionLogs-20170802070813 still in progress.
2017-08-02 17:08:32,072 INFO      Snapshot ProductionLogs-20170802070813 still in progress.
2017-08-02 17:08:41,103 INFO      Snapshot ProductionLogs-20170802070813 successfully completed.
2017-08-02 17:08:41,104 INFO      Action ID: 1, "snapshot" completed.
2017-08-02 17:08:41,104 INFO      Job completed.
Go and take a peek in the shared location, /home/elastic/backups in our case. It should contain the backup files.
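For example, a quick listing from any node (the exact file names vary with the Elasticsearch version, but you should see an indices folder plus snapshot and metadata files):

ls -l /home/elastic/mount/backups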
You can also issue a command on the terminal to see the snapshots.
curl -XGET 'http://yourserver:9200/_snapshot/logs_backup/_all?pretty'
Output is something like this
{ "snapshots" : [ { "snapshot" : "ProductionLogs-20170802070813" , "uuid" : "bWjLfMTaSgWkbWTbxL1XTA" , "version_id" : 5020299 , "version" : "5.2.2" , "indices" : [ "log-production-2017.07" , "log-production-2017.06" . . . . blah . . . . . . . . . blah . . . . . } ] } |
Now with that done only one thing is left: do a restore using the Elasticsearch snapshots you have taken. How you verify it is something you have to decide. For me it is simple. Since I am working with test data I will count the number of documents in the indices whose snapshot was taken. Then I will delete the indices. Then restore. And if the count of documents matches the initial one, I know that the restore worked.
Count of initial docs
curl -XGET 'http://yourserver:9200/log-production-*/_stats?pretty'
Output
{
  "_shards" : {
    "total" : 20,
    "successful" : 20,
    "failed" : 0
  },
  "_all" : {
    "primaries" : {
      "docs" : {
        "count" : 5368390,
        "deleted" : 0
      },
      "store" : {
        "size_in_bytes" : 1496195150,
        "throttle_time_in_millis" : 0
..............
Document count is 5368390
Then a delete
curl -XDELETE 'http://yourserver:9200/log-production-*?pretty'
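To confirm the indices are really gone before restoring, a quick listing helps; it should come back empty:

curl -XGET 'http://yourserver:9200/_cat/indices/log-production-*?v'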
To restore you need an action file.
I will create an action file "action_snapshot_restore.yml"
actions:
  1:
    action: restore
    description: >-
      Restore all indices in the most recent curator-* snapshot with state
      SUCCESS.  Wait for the restore to complete before continuing.  Do not
      skip the repository filesystem access check.  Use the other options to
      define the index/shard settings for the restore.
    options:
      repository: logs_backup
      # If name is blank, the most recent snapshot by age will be selected
      name: ProductionLogs-20170803003417
      # If indices is blank, all indices in the snapshot will be restored
      indices:
      include_aliases: False
      ignore_unavailable: False
      include_global_state: False
      partial: False
      rename_pattern:
      rename_replacement:
      extra_settings:
      wait_for_completion: True
      skip_repo_fs_check: True
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: ProductionLogs-
    - filtertype: state
      state: SUCCESS
The value under the pattern filter: I want to work with the Elasticsearch snapshots whose names begin with ProductionLogs.
The repository option: I specify the repository to be used.
The name option: I choose the snapshot. This is useful if you want to restore the indices only up to a point in the past. To restore the indices up to the present, leave it blank and Curator will use the latest snapshot.
Time to push the button.
curator action_snapshot_restore.yml
Output
2017-08-03 12:54:33,491 INFO      Preparing Action ID: 1, "restore"
2017-08-03 12:54:33,499 INFO      Trying Action ID: 1, "restore": Restore all indices in the most recent curator-* snapshot with state SUCCESS. Wait for the restore to complete before continuing. Do not skip the repository filesystem access check. Use the other options to define the index/shard settings for the restore.
2017-08-03 12:54:33,514 INFO      Restoring indices "['log-production-2017.07', 'log-production-2017.06']" from snapshot: ProductionLogs-20170803003417
2017-08-03 12:54:33,586 INFO      _recovery returned an empty response. Trying again.
2017-08-03 12:54:42,611 INFO      Index "log-production-2017.07" is still in stage "INDEX"
2017-08-03 12:54:51,630 INFO      Index "log-production-2017.07" is still in stage "INDEX"
2017-08-03 12:55:00,646 INFO      Index "log-production-2017.07" is still in stage "INDEX"
2017-08-03 12:55:09,664 INFO      Index "log-production-2017.07" is still in stage "INDEX"
2017-08-03 12:55:18,674 INFO      Action ID: 1, "restore" completed.
2017-08-03 12:55:18,674 INFO      Job completed.
A quick curl command to check if the Elasticsearch snapshot restore worked. See the count of documents restored.
curl -XGET 'http://yourserver:9200/log-production-*/_stats?pretty'
Output
{
  "_shards" : {
    "total" : 20,
    "successful" : 20,
    "failed" : 0
  },
  "_all" : {
    "primaries" : {
      "docs" : {
        "count" : 5368390,
        "deleted" : 0
      },
      "store" : {
        "size_in_bytes" : 1496195150,
        "throttle_time_in_millis" : 0
      },
............
The count is spot on. You have tamed Elasticsearch snapshots. Now you are ready to take your Curator skills to the next level. Start managing your aliases with Curator.
Source: https://ikeptwalking.com/taking-elasticsearch-snapshots-using-curator/