Configure and run Docker Compose in 5 minutes for Docker images
Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services.
Whenever one has to run more than one container and have them communicate with one another…
A quick yet gentle introduction to getting Kafka running
In this article I am using Kafka 2.8.0 for both client and server, so one may notice some discrepancies in how ZooKeeper is used. This is due to
KIP-500, which replaces ZooKeeper with a self-managed quorum.
The best way to introduce Kafka is by…
Often one needs to use a volume as a template and clone it for every new pod that is created. ‘Volume Snapshot’ is the name for this feature.
In this article I have put together the steps required to
Customize Apache Spark 3.1.1 to work with S3 / GCS
Apache Spark 3.0.1 can be built from source code along with
(1) AWS specific binaries to enable reading and writing to
(2) GCP specific binaries to enable reading and writing to
(3) Azure — at the time of…
A developer’s guide to setting up Vault in Kubernetes and using it with kv-store for secrets and userpass access.
In this brief write-up, I shall try to provide a quick way to get Vault up and running on a running GKE cluster.
A closure is the combination of a function and the lexical environment within which that function was declared.
The reason it is called a “closure” is that an expression containing free variables is called an “open” expression, and by associating to it the bindings of its free variables, you…
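To make the definition concrete, here is a minimal sketch in Python. The names `make_counter` and `increment` are hypothetical and chosen only for illustration. The inner function refers to `count`, a free variable that it does not define itself; returning it from `make_counter` binds that variable to the enclosing environment, which is what "closes" the expression.

```python
def make_counter(start=0):
    # `count` lives in make_counter's scope; it is a free variable of `increment`.
    count = start

    def increment(step=1):
        # `nonlocal` lets the inner function rebind the captured variable.
        nonlocal count
        count += step
        return count

    # The returned function carries its lexical environment with it.
    return increment


counter = make_counter(10)
first = counter()     # count goes from 10 to 11
second = counter(5)   # count goes from 11 to 16

# Python records the captured names on the function's code object.
free_names = counter.__code__.co_freevars
```

Each call to `make_counter` creates a fresh environment, so two counters built this way do not share state.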
Deploying Elasticsearch using Kubernetes
In this article, I would like to provide an example of using
StatefulSet to deploy an Elasticsearch cluster.
The configuration for this setup requires
Understanding how storage works
Kubernetes, a container orchestration engine, was built for stateless systems. These are the kinds of applications we commonly build.
A Deployment configuration serves such applications effectively. But there may be cases where one wants to preserve state in a pod.
Configure Apache Spark with Kubernetes
Many people like to use k8s for its clustering and scaling capabilities, and many others like to use Apache Spark for big data processing in a cluster.
HDFS as a distributed file system deployed locally for development and testing.
The Hadoop Distributed File System (HDFS) is the primary data storage system used by Hadoop applications. It employs a NameNode and DataNode architecture to implement a distributed file system that provides high-performance access to data across highly scalable Hadoop clusters.