Deploying Elasticsearch using Kubernetes
In this article, I provide an example of using a StatefulSet to deploy an Elasticsearch cluster.
The configuration for this setup requires:
- A headless service (for inter-node communication)
- A LoadBalancer service (exposing the REST endpoint to the outside world), backed by Client Nodes only
- A StatefulSet for Master node(s)
- A StatefulSet for Data Nodes
- A StatefulSet for Client Nodes
Also note that Elasticsearch 7.x made some major changes to the cluster configuration settings in elasticsearch.yml.
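As a rough illustration of the 7.x-style cluster settings, here is a minimal elasticsearch.yml sketch. The cluster name, node name, and service name are hypothetical placeholders chosen for this example, not values taken from the original setup.

```yaml
# elasticsearch.yml (7.x style) - illustrative sketch only
cluster.name: es-cluster              # hypothetical cluster name
node.name: es-master-0                # hypothetical; defaults to the hostname if unset

# 7.x: replaces the 6.x setting discovery.zen.ping.unicast.hosts
discovery.seed_hosts:
  - elasticsearch-discovery           # hypothetical headless service name

# 7.x: used once to bootstrap a brand-new cluster;
# entries must match node.name of the master-eligible nodes exactly
cluster.initial_master_nodes:
  - es-master-0

# 6.x discovery.zen.minimum_master_nodes is no longer needed in 7.x
```

When running the official container image, the same settings can typically be supplied as environment variables on the Pod instead of editing the file.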
Let's break this configuration into three steps based on the above description.
Each of these steps can be executed using kubectl apply -f filename.yaml. To get a stable cluster, it is recommended to run Step 2 and Step 3 only after Step 1 is complete.
Step 1: Set up the headless service and the Master node cluster.
Some highlights of this configuration:
(1) We add a headless service and reference it from the StatefulSet, so that the ordinal indexes are used to give each node a stable, predictable name.
(2) We use initContainers to prepare the Pod with the prerequisites that Elasticsearch requires.
(3) We use the open-source (OSS) distribution of Elasticsearch 7.6.1.
As per the Elasticsearch docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/discovery-settings.html):
"The initial master nodes should be identified by their node.name, which defaults to their hostname. Make sure that the value in cluster.initial_master_nodes matches the node.name exactly. If you use a fully-qualified domain name such as master-node-a.example.com for your node names then you must use the fully-qualified name in this list; conversely if node.name is a bare hostname without any trailing qualifiers then you must also omit the trailing qualifiers in cluster.initial_master_nodes."
(4) It is crucial that node.name and the entries in cluster.initial_master_nodes match exactly.
(5) You can set replicas to 3 for a more resilient cluster. Providing the names of the other master nodes is optional.
Note: A production cluster should run three master nodes. For testing purposes, a single master node may be sufficient.
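A minimal sketch of what the Step 1 manifest could look like is shown here. All resource names, labels, the busybox tag, and the heap size are assumptions made for illustration; the vm.max_map_count tweak is one common Elasticsearch prerequisite that an init container can handle, and may not be the only one the original setup applied.

```yaml
# Headless service: gives each StatefulSet Pod a stable DNS entry
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-discovery       # hypothetical name
spec:
  clusterIP: None                     # "None" makes the service headless
  selector:
    app: elasticsearch                # assumed label shared by all nodes
  ports:
    - name: transport
      port: 9300
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-master                     # Pods become es-master-0, es-master-1, ...
spec:
  serviceName: elasticsearch-discovery
  replicas: 1                         # use 3 for a more resilient cluster
  selector:
    matchLabels:
      app: elasticsearch
      role: master
  template:
    metadata:
      labels:
        app: elasticsearch
        role: master
    spec:
      initContainers:
        # Raises the mmap limit on the host kernel before Elasticsearch starts
        - name: set-max-map-count
          image: busybox:1.31
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.6.1
          env:
            - name: cluster.name
              value: es-cluster       # hypothetical cluster name
            - name: node.name         # node.name = Pod name, e.g. es-master-0
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: node.master
              value: "true"
            - name: node.data
              value: "false"
            - name: node.ingest
              value: "false"
            - name: discovery.seed_hosts
              value: elasticsearch-discovery
            - name: cluster.initial_master_nodes
              value: es-master-0      # must match node.name exactly
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"   # assumed heap size
          ports:
            - name: transport
              containerPort: 9300
          # Persistent storage for the master is omitted here for brevity
```

With three replicas, cluster.initial_master_nodes would list es-master-0, es-master-1, and es-master-2 under this naming scheme.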
Step 2: Set up the LoadBalancer and the Client Nodes
Some highlights of this configuration:
(1) The service uses label selectors to pick up all the client nodes and place them behind the LoadBalancer.
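As a sketch of the label-based selection, the manifest could look roughly like the following. It assumes a headless service named elasticsearch-discovery and a cluster named es-cluster already exist from Step 1; those names, the labels, and the replica count are hypothetical.

```yaml
# LoadBalancer service: exposes the REST API of the client nodes only
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-client          # hypothetical name
spec:
  type: LoadBalancer
  selector:                           # matches only Pods carrying both labels
    app: elasticsearch
    role: client
  ports:
    - name: http
      port: 9200
      targetPort: 9200
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-client
spec:
  serviceName: elasticsearch-discovery   # assumed headless service from Step 1
  replicas: 2                            # assumed count
  selector:
    matchLabels:
      app: elasticsearch
      role: client
  template:
    metadata:
      labels:
        app: elasticsearch
        role: client                  # the label the LoadBalancer selects on
    spec:
      initContainers:
        - name: set-max-map-count
          image: busybox:1.31
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.6.1
          env:
            - name: cluster.name
              value: es-cluster       # must match the masters' cluster name
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # Neither master-eligible nor data: the node only routes requests
            - name: node.master
              value: "false"
            - name: node.data
              value: "false"
            - name: discovery.seed_hosts
              value: elasticsearch-discovery
          ports:
            - name: http
              containerPort: 9200
            - name: transport
              containerPort: 9300
```

Because master and data Pods do not carry the role: client label, external REST traffic never reaches them directly.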
Step 3: Set up the Data Nodes
Some highlights of this configuration:
(1) The data nodes require storage volumes, so that all the indexes and documents created are persisted and can withstand pod / node failures.
(2) Note that this StatefulSet can be deleted and added back, yet the indexes and documents are preserved, because the corresponding PVCs and PVs are not deleted by default.
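The persistence described above comes from volumeClaimTemplates on the StatefulSet. A minimal sketch follows, assuming a default StorageClass is available and that a headless service named elasticsearch-discovery and a cluster named es-cluster exist from Step 1; the names, labels, replica count, and the 10Gi size are illustrative.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-data
spec:
  serviceName: elasticsearch-discovery   # assumed headless service from Step 1
  replicas: 2                            # assumed count
  selector:
    matchLabels:
      app: elasticsearch
      role: data
  template:
    metadata:
      labels:
        app: elasticsearch
        role: data
    spec:
      securityContext:
        fsGroup: 1000                 # lets the elasticsearch user write to the volume
      initContainers:
        - name: set-max-map-count
          image: busybox:1.31
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.6.1
          env:
            - name: cluster.name
              value: es-cluster
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: node.master
              value: "false"
            - name: node.data
              value: "true"
            - name: discovery.seed_hosts
              value: elasticsearch-discovery
          ports:
            - name: transport
              containerPort: 9300
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data   # default data path in the image
  # One PVC per Pod (data-es-data-0, data-es-data-1, ...);
  # these claims remain when the StatefulSet itself is deleted
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi             # assumed size
```

Recreating the StatefulSet under the same name reattaches each Pod to its existing claim, which is why the indexes survive.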
⛑ Suggestions / Feedback ! 😃