Quick and easy Docker setup of Elasticsearch and Kibana for development and research
According to Wikipedia, “Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.” According to me, Elasticsearch is a super fast, easy-to-use, open-source search engine for any kind of data. Whatever data and however complex a query we throw at Elasticsearch, it returns results that are fast and, very importantly, correct. The data can be anything: server metrics, application data, even images of cats (it’s doable, and cats and the internet do have a special relationship). That makes it a worthwhile technology to learn for a varied set of people, including data scientists/engineers, DevOps engineers, software engineers, and more. Supplementing Elasticsearch is a piece of software called Kibana (both developed by the same company), which is used to visualize Elasticsearch data and provides an easy-to-use UI for building Elasticsearch queries, along with many other features.
There are a lot of good tutorials about setting up Elasticsearch in a production environment, but when it comes to setting up Elasticsearch for development or initial research, few are both up to date and simple enough for a beginner to follow. That’s why I decided to write this tutorial on easily setting up the Elasticsearch and Kibana combination. The most convenient way to set them up is with Docker. We will set up Elasticsearch version 7.2.0 and Kibana version 7.2.0.
Prerequisites
Docker version 19.03.6-ce or above and docker-compose version 1.24.1 or above should be installed. This setup was tested on CentOS and Ubuntu servers (it can also be used on Mac and Windows systems with some modifications).
TL;DR
Create a file called elasticsearch.yaml and add the following content to it:
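The file’s content, assembled from the configuration walked through step by step later in this post:

```yaml
version: '3.6'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    container_name: elasticsearch
    environment:
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
    ports: ['9200:9200']
    networks: ['stack']
    volumes:
      - './data/elasticsearch/:/usr/share/elasticsearch/data'
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
  kibana:
    image: docker.elastic.co/kibana/kibana:7.2.0
    container_name: kibana
    ports: ['5601:5601']
    networks: ['stack']
    depends_on: ['elasticsearch']
networks:
  stack:
    name: stack
```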
Run the following commands (sudo access is required; also, there should not be a folder called ./data/elasticsearch already present in the directory from which the commands are executed):
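The commands, collected from the walkthrough later in this post (run as root; assumes the elasticsearch.yaml file from the previous step is in the current directory):

```shell
sudo su
echo "fs.file-max=500000" >> /etc/sysctl.conf
sysctl -p
mkdir -p ./data/elasticsearch
chown 1000:1000 ./data/elasticsearch
docker-compose -f elasticsearch.yaml up -d
```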
The Elasticsearch and Kibana containers are now up. The Elasticsearch REST API can be accessed on port 9200, and the Kibana dashboard can be accessed on port 5601.
Setup at a glance
Setting up Elasticsearch and Kibana involves the following steps:
- Setup Elasticsearch and Kibana startup file
- Increasing the open file limit and changing the permission of the Elasticsearch volume to be mounted
- Starting Elasticsearch and Kibana containers
We will look into these steps one by one.
1. Setting up Elasticsearch and Kibana startup file
First we need to set up the container startup file for Elasticsearch and Kibana. For this, we will use docker-compose and create a yaml file with the configuration. The advantage of docker-compose is the ease of writing and maintaining the configuration for the services we want to use.
We will walk through the service parts for both Elasticsearch and Kibana along with the other configuration, and then check the final file which will be used to start the services. We will create a file called elasticsearch.yaml, using version 3.6 of the docker-compose file format. Since Elasticsearch and Kibana will be defined as services, we will mention that tag as well. Add the following content to the file:
version: '3.6'
services:
First we will look at the configuration for Elasticsearch. We will start a single Elasticsearch container (called a node). Elasticsearch can run in either a single-node or a cluster configuration; clusters are generally used in high-throughput environments. Since we are just looking at basic use, we will start it as a single node. Following is the Elasticsearch configuration (spacing is important):
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
container_name: elasticsearch
environment:
- "discovery.type=single-node"
- "ES_JAVA_OPTS=-Xms1G -Xmx1G"
ports: ['9200:9200']
networks: ['stack']
volumes:
- './data/elasticsearch/:/usr/share/elasticsearch/data'
ulimits:
nofile:
soft: 65535
hard: 65535
Let’s look at each component one by one:
- elasticsearch — This is the service name we define in the docker-compose file. For each container that we want docker-compose to spawn, we define the configuration and set it as a service (refer to the docker-compose documentation for more info).
- image — This is the Docker image used to start the Elasticsearch container. We use the official Elasticsearch image.
- container_name — This is the name given to the started container. Setting it is a good idea, since referring to a container by name is much easier than by id (a randomly generated id is assigned every time a new container starts, which is not easy to remember).
- environment — These are the environment variables we set to start the Elasticsearch service. discovery.type=single-node runs Elasticsearch as a single node instead of a cluster. ES_JAVA_OPTS=-Xms1G -Xmx1G sets the amount of heap memory reserved for the Elasticsearch container, in this case 1 GB. To set it in MB, replace G with m; for example, ES_JAVA_OPTS=-Xms512m -Xmx512m sets it to 512 MB. Heap memory should not be more than 50% of the available system RAM (read here for more info). Set this value according to the RAM normally available.
- ports — These are the ports we want to expose outside the docker container. This allows other services (a piece of code, a microservice, the browser, or anything else) to access Elasticsearch. The mapping 9200:9200 lets us communicate with the Elasticsearch REST API on port 9200.
- networks — Every docker container uses at least one docker network to communicate. We will create our own network called stack and run both Elasticsearch and Kibana on it, so the two containers can talk to each other without any additional configuration.
- volumes (optional) — Setting volumes is optional, since it simply mounts the Elasticsearch data on the host. This persists the data created in Elasticsearch even if the container gets deleted or recreated by mistake (which can happen easily if we are not careful). The Elasticsearch container writes its data to /usr/share/elasticsearch/data, which we mount to the host directory ./data/elasticsearch. Make sure to give ownership 1000:1000 to any folder you mount (read here). This can be done by running chown 1000:1000 ./data/elasticsearch as a sudo user (read here). If the following error appears in the Elasticsearch container, it is because of this folder permission issue: “stacktrace”: [“org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];”,
- ulimits — ulimits are the system resource limits of the current user. We will raise the nofile ulimit (read more about it here and here). This allows more files to be open for read/write at once, which Elasticsearch relies on heavily when writing data. Setting both the soft and hard values to 65535 covers most of the cases we would require for development/research.
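As a rough aid for the heap-size guidance above, this small snippet (an illustrative sketch, not part of the original setup; Linux only) computes half of the system RAM in MB and prints a matching ES_JAVA_OPTS value:

```shell
# Compute half of system RAM in MB (MemTotal in /proc/meminfo is in KB)
half_ram_mb=$(awk '/MemTotal/ {printf "%d", $2 / 2 / 1024}' /proc/meminfo)
# Print an ES_JAVA_OPTS line sized to roughly 50% of RAM, per the guidance above
echo "ES_JAVA_OPTS=-Xms${half_ram_mb}m -Xmx${half_ram_mb}m"
```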
This completes the Elasticsearch part. Next we will configure Kibana —
kibana:
image: docker.elastic.co/kibana/kibana:7.2.0
container_name: kibana
ports: ['5601:5601']
networks: ['stack']
depends_on: ['elasticsearch']
Let’s check each component one by one:
- kibana — Similar to the Elasticsearch configuration, this is the service name we define in the docker-compose file.
- image — This is the official Kibana image used to start the Kibana container (similar to the Elasticsearch configuration).
- container_name — Same as in the Elasticsearch configuration, used to easily refer to the container by name.
- ports — Similar to the Elasticsearch configuration, these are the ports we want to expose outside the docker container. The mapping 5601:5601 lets us access the Kibana dashboard on port 5601.
- networks — Same as the Elasticsearch configuration.
- depends_on — This tells docker which containers to deploy before deploying this one. Since Kibana uses Elasticsearch, we tell docker to start the Elasticsearch container before starting the Kibana container.
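Note that no Elasticsearch address is configured here: the official Kibana image defaults to http://elasticsearch:9200, which happens to match our service name. If you were to rename the Elasticsearch service, you could point Kibana at it explicitly through an environment variable, sketched below with a hypothetical service name es01:

```yaml
kibana:
  image: docker.elastic.co/kibana/kibana:7.2.0
  container_name: kibana
  environment:
    # es01 is a hypothetical service name; adjust to match your own
    - "ELASTICSEARCH_HOSTS=http://es01:9200"
  ports: ['5601:5601']
  networks: ['stack']
  depends_on: ['es01']
```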
Once the Kibana container configuration is completed, the final part is defining the configuration for the docker network that we have referenced in both containers. We will create a new docker network called stack which will be used by all the containers. The network is defined as follows:
networks:
stack:
name: stack
The final docker-compose file will look something like this (some parts might differ for you depending on any changes made to the configuration above). We will name this file elasticsearch.yaml:
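Putting all the snippets above together, the complete elasticsearch.yaml is:

```yaml
version: '3.6'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    container_name: elasticsearch
    environment:
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
    ports: ['9200:9200']
    networks: ['stack']
    volumes:
      - './data/elasticsearch/:/usr/share/elasticsearch/data'
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
  kibana:
    image: docker.elastic.co/kibana/kibana:7.2.0
    container_name: kibana
    ports: ['5601:5601']
    networks: ['stack']
    depends_on: ['elasticsearch']
networks:
  stack:
    name: stack
```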
2. Increasing the open file limit and changing the permission of the Elasticsearch volume to be mounted
During the Elasticsearch configuration, we saw two parameters, volumes and ulimits, that require some extra steps before starting the containers: we need to increase the open file limit and modify the permission of the Elasticsearch volume to be mounted.
Since we need to run all the commands as the root user, we will first switch to it by running the command:
sudo su
To increase the open file limit, we need to add the line fs.file-max=500000 to the file /etc/sysctl.conf and then reload the configuration on the server. We can do both by running the following commands:
echo "fs.file-max=500000" >> /etc/sysctl.conf
sysctl -p
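To confirm the limit was applied, you can read the value back from the kernel; it should match the number written above:

```shell
# Read the current kernel-wide maximum open file count
cat /proc/sys/fs/file-max
```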
Modifying the permission of the Elasticsearch volume is optional, depending on whether the volumes config has been set for the Elasticsearch service in the elasticsearch.yaml file. To modify the permission, we first create the folder using mkdir and then change its ownership using chown (make sure that the folder ./data/elasticsearch does not already exist):
mkdir -p ./data/elasticsearch
chown 1000:1000 ./data/elasticsearch
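As a sanity check, you can print the owner of the mounted folder; after the chown above it should read 1000:1000, the uid/gid of the elasticsearch user inside the container:

```shell
# Ensure the data directory exists, then print its "uid:gid";
# expect 1000:1000 once the chown step has been run as root
mkdir -p ./data/elasticsearch
stat -c '%u:%g' ./data/elasticsearch
```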
3. Starting Elasticsearch and Kibana containers
Now we can start our containers. To do so, we will use the up functionality of docker-compose by running the following command (ignore any warning printed when the command is executed; it appeared for me because I am running docker in swarm mode, which is not the default, so it will not appear for most users):
docker-compose -f elasticsearch.yaml up -d
Once the containers are spawned, we can run docker ps to see our containers:
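We can also verify Elasticsearch itself from the command line (assuming the containers are up and you are on the host machine); the root endpoint responds with a small JSON document describing the node:

```shell
# Query the Elasticsearch REST API root endpoint; prints cluster/node info as JSON
curl http://localhost:9200
# Check cluster health; a fresh single node should report "green" or "yellow"
curl http://localhost:9200/_cluster/health
```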
Once the setup is completed, we can simply access Kibana by typing the server IP and the Kibana port (defined in the ports config of Kibana) in the browser URL bar, and voilà, we get the Kibana dashboard:
Our Elasticsearch and Kibana setup is complete. Now we are free to add our data to Elasticsearch and execute powerful, complex search queries that return data as quick as a snap of the fingers (not literally, but you get the gist). We can also create many beautiful dashboards in Kibana to visualize our data. You can also access Elasticsearch data directly from the Chrome browser using the ElasticSearch Head Chrome extension.
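As a small taste of what comes next, here is a minimal, hypothetical example of adding a document and querying it back through the REST API (the index name cats and the field values are made up for illustration; assumes the stack above is running on localhost):

```shell
# Index a document into a hypothetical "cats" index
curl -X POST 'http://localhost:9200/cats/_doc' \
     -H 'Content-Type: application/json' \
     -d '{"name": "Tom", "color": "grey"}'
# Search for it by name (newly indexed documents become searchable
# after the index refresh, roughly one second by default)
curl 'http://localhost:9200/cats/_search?q=name:Tom'
```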
As always, if you have any doubts, let me know. You can also check out my website genericfornow.com to connect with me. Let me know what else you would love to learn. And as always, “Be generic, be awesome”.