Elasticsearch using Kubernetes

Programmer
Mar 27, 2020

Deploying Elasticsearch using Kubernetes

3-Node Client, Data and Master deployment of ES

In this article, I provide an example of using StatefulSets to deploy an Elasticsearch cluster.

The configuration for this setup requires:

  • A headless service (for inter-node communication within the cluster).
  • A LoadBalancer service (to expose the REST endpoint to the outside world), backed by the Client Nodes only.
  • A StatefulSet for Master node(s).
  • A StatefulSet for Data Nodes.
  • A StatefulSet for Client Nodes.

Also note that Elasticsearch 7.x made some major changes to the cluster configuration settings in elasticsearch.yml.
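To make the 7.x change concrete, here is a minimal sketch of the discovery-related settings in elasticsearch.yml. The cluster name, node name and host name are hypothetical placeholders, not values from the original config.

```yaml
# elasticsearch.yml for 7.x (illustrative values only)
cluster.name: es-cluster            # hypothetical cluster name
node.name: es-master-0              # hypothetical; defaults to the hostname

# 7.x name for the 6.x setting discovery.zen.ping.unicast.hosts:
# addresses that a node contacts to find the rest of the cluster.
discovery.seed_hosts:
  - es-discovery                    # hypothetical headless service name

# New in 7.x: names of the master-eligible nodes used to bootstrap a
# brand-new cluster. discovery.zen.minimum_master_nodes is no longer used.
cluster.initial_master_nodes:
  - es-master-0
```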

Let's break this configuration into three steps based on the above description.

Each of these steps can be executed using kubectl apply -f filename.yaml. In order to create a stable cluster, it's recommended to run Step 2 and Step 3 only after Step 1 is complete, so that a master is already available when the client and data nodes try to join.

Step 1: Set up the headless service and Master node cluster.

Some highlights from the config for this step:
(1) We add a headless service and reference it from the StatefulSet (serviceName), so that each Pod gets a stable name built from the StatefulSet name and its ordinal index.

(2) Also, we use initContainers to prepare the Pod with the prerequisites that Elasticsearch requires before the main container starts.

(3) We use the open-source (OSS) distribution of Elasticsearch 7.6.1.

As per the Elasticsearch docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/discovery-settings.html):

The initial master nodes should be identified by their node.name, which defaults to their hostname. Make sure that the value in cluster.initial_master_nodes matches the node.name exactly. If you use a fully-qualified domain name such as master-node-a.example.com for your node names then you must use the fully-qualified name in this list; conversely if node.name is a bare hostname without any trailing qualifiers then you must also omit the trailing qualifiers in cluster.initial_master_nodes.

(4) It's crucial that the names in node.name and cluster.initial_master_nodes match exactly.

(5) One can set replicas to 3 for a more resilient cluster. Also, providing the names of the other master nodes is optional.

Note: For a production cluster, one needs to set up three master nodes, so that a majority can still elect a master if one of them fails. For testing purposes, a single master node may be sufficient.
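The points above can be sketched as a single manifest. This is a minimal illustration under stated assumptions, not the original config: the names es-discovery, es-master and es-cluster, the labels, the busybox init container that raises vm.max_map_count, and the heap size are all hypothetical choices. Settings are passed as environment variables, which the official Docker image maps to Elasticsearch settings.

```yaml
# Headless service: gives every Pod a stable DNS entry for node-to-node traffic.
apiVersion: v1
kind: Service
metadata:
  name: es-discovery                # hypothetical name
spec:
  clusterIP: None                   # "None" is what makes the service headless
  publishNotReadyAddresses: true    # let nodes find each other while starting
  selector:
    app: elasticsearch
  ports:
    - name: transport
      port: 9300
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-master                   # Pods are named es-master-0, es-master-1, ...
spec:
  serviceName: es-discovery         # ties Pod identities to the headless service
  replicas: 1                       # use 3 for a production cluster
  selector:
    matchLabels:
      app: elasticsearch
      role: master
  template:
    metadata:
      labels:
        app: elasticsearch
        role: master
    spec:
      initContainers:
        # Assumed prerequisite: Elasticsearch needs a higher mmap count limit.
        - name: set-max-map-count
          image: busybox:1.31
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.6.1
          ports:
            - name: transport
              containerPort: 9300
          env:
            - name: cluster.name
              value: es-cluster
            - name: node.name       # Pod name, e.g. es-master-0
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: discovery.seed_hosts
              value: es-discovery
            - name: cluster.initial_master_nodes
              value: es-master-0    # must match node.name exactly
            - name: node.master
              value: "true"
            - name: node.data
              value: "false"
            - name: node.ingest
              value: "false"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
```

With replicas set to 3, cluster.initial_master_nodes would become es-master-0,es-master-1,es-master-2. For brevity this sketch gives the master no persistent volume, which is only reasonable for testing.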

Step 2: Set up the LoadBalancer and Client Node

Some highlights from the config for this step:

(1) The LoadBalancer service uses label selectors to pick up all the client nodes, so only those nodes receive external REST traffic.
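A minimal sketch of this step, assuming the same hypothetical names as before (es-discovery, es-cluster) plus a role: client label that is used only for illustration. The service selector matches that label, which is how the LoadBalancer ends up routing to client nodes only.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: es-client                   # hypothetical name
spec:
  type: LoadBalancer
  selector:                         # only Pods carrying both labels are targeted
    app: elasticsearch
    role: client
  ports:
    - name: http
      port: 9200
      targetPort: 9200
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-client
spec:
  serviceName: es-discovery         # assumed headless service from Step 1
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
      role: client
  template:
    metadata:
      labels:
        app: elasticsearch
        role: client
    spec:
      initContainers:
        - name: set-max-map-count
          image: busybox:1.31
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.6.1
          ports:
            - name: http
              containerPort: 9200
            - name: transport
              containerPort: 9300
          env:
            - name: cluster.name
              value: es-cluster
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: discovery.seed_hosts
              value: es-discovery
            # Neither master-eligible nor data-holding: a coordinating-only node.
            - name: node.master
              value: "false"
            - name: node.data
              value: "false"
            - name: node.ingest
              value: "false"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
```

Client nodes do not set cluster.initial_master_nodes, since they are not master-eligible and simply join the cluster they discover through the seed hosts.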

Step 3: Set up the Data Node

Some highlights from the config for this step:

(1) The data node requires storage volumes to be set up, so that all the indexes and documents created are persisted and can withstand pod / node failures.

(2) Note that this StatefulSet can be deleted and added back, yet the indexes and documents are preserved, because the corresponding PVC and PV are not deleted by default.
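The storage setup can be sketched with volumeClaimTemplates, which create one PVC per Pod. The names, the 10Gi size, the reliance on the default StorageClass, and the fsGroup setting (used so the Elasticsearch user can write to the volume) are assumptions for illustration.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-data                     # hypothetical name
spec:
  serviceName: es-discovery         # assumed headless service from Step 1
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
      role: data
  template:
    metadata:
      labels:
        app: elasticsearch
        role: data
    spec:
      securityContext:
        fsGroup: 1000               # assumption: makes the volume group-writable
      initContainers:
        - name: set-max-map-count
          image: busybox:1.31
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.6.1
          ports:
            - name: transport
              containerPort: 9300
          env:
            - name: cluster.name
              value: es-cluster
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: discovery.seed_hosts
              value: es-discovery
            - name: node.master
              value: "false"
            - name: node.data
              value: "true"
            - name: node.ingest
              value: "false"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
  # One PVC per Pod (data-es-data-0, ...). These claims outlive the
  # StatefulSet unless they are deleted explicitly.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi           # illustrative size
```

Because the PVC name is derived from the template name and the Pod name, a recreated es-data-0 Pod binds to the same claim again and finds its previous data.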

⛑ Suggestions / Feedback ! 😃
