Quick and easy Docker setup of Elasticsearch and Kibana for development and research
According to Wikipedia, “Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.” According to me, Elasticsearch is a super fast, easy-to-use, open-source search engine for any kind of data. Whatever data and however complex a query we throw at Elasticsearch, it returns results that are fast and, very importantly, correct. The data can be anything: server metrics, application data, even images of cats (it’s doable, and cats and the internet do have a special relationship). That makes it a worthwhile technology to learn for a varied set of people, including data scientists/engineers, DevOps engineers, software engineers, and more. Supplementing Elasticsearch is a piece of software called Kibana (both developed by the same company), which is used to visualize Elasticsearch data and provides an easy-to-use UI for building Elasticsearch queries, along with many other features.
There are a lot of good tutorials about setting up Elasticsearch in a production environment, but when it comes to setting up Elasticsearch for development or initial research, few are both up to date and simple enough for a beginner to follow. That’s why I decided to write this tutorial on easily setting up the Elasticsearch and Kibana combination. The most convenient way to set them up is with Docker. We will set up Elasticsearch version 7.2.0 and Kibana version 7.2.0.
Prerequisites
Docker version 19.03.6-ce or above and docker-compose version 1.24.1 or above should be installed. This setup was tested on CentOS and Ubuntu servers (it can also be used on Mac and Windows systems with some modifications).
TL;DR
Create a file called elasticsearch.yaml and add the following content to it:
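The file’s content, assembled from the configuration walked through step by step later in this post:

```yaml
version: '3.6'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    container_name: elasticsearch
    environment:
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
    ports: ['9200:9200']
    networks: ['stack']
    volumes:
      - './data/elasticsearch/:/usr/share/elasticsearch/data'
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
  kibana:
    image: docker.elastic.co/kibana/kibana:7.2.0
    container_name: kibana
    ports: ['5601:5601']
    networks: ['stack']
    depends_on: ['elasticsearch']
networks:
  stack:
    name: stack
```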
Run the following commands (sudo access is required; also, there should not be a folder called ./data/elasticsearch already present in the directory from which the commands are executed):
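The commands, collected from the walkthrough later in this post (run as root; assumes the elasticsearch.yaml file from the previous step is in the current directory):

```shell
sudo su
echo "fs.file-max=500000" >> /etc/sysctl.conf
sysctl -p
mkdir -p ./data/elasticsearch
chown 1000:1000 ./data/elasticsearch
docker-compose -f elasticsearch.yaml up -d
```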
The Elasticsearch and Kibana containers are now up. The Elasticsearch REST API can be accessed on port 9200, and the Kibana dashboard can be accessed on port 5601.
Setup at a glance
Setting up Elasticsearch and Kibana involves the following steps:
- Setup Elasticsearch and Kibana startup file
- Increasing the open file limit and changing the permission of the Elasticsearch volume to be mounted
- Starting Elasticsearch and Kibana containers
We will look into these steps one by one.
1. Setting up Elasticsearch and Kibana startup file
First we need to set up the container startup file for Elasticsearch and Kibana. For this, we will use docker-compose and create a yaml file with the configuration. The advantage of docker-compose is the ease of writing and maintaining the configuration for the services we want to use.
We will walk through the service parts for both Elasticsearch and Kibana along with the other configuration, and then check the final file which will be used to start the services. We will create a file called elasticsearch.yaml, using version 3.6 of the docker-compose file format. Since Elasticsearch and Kibana will be defined as services, we will mention that tag as well. Add the following content to the file:
version: '3.6'
services:
First we will look at the configuration for Elasticsearch. We will start a single Elasticsearch container (called a node). Elasticsearch can run in either a single-node or a cluster configuration; clusters are generally used in high-throughput environments. Since we are just looking at basic use, we will start it as a single node. Following is the Elasticsearch configuration (spacing is important):
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
container_name: elasticsearch
environment:
- "discovery.type=single-node"
- "ES_JAVA_OPTS=-Xms1G -Xmx1G"
ports: ['9200:9200']
networks: ['stack']
volumes:
- './data/elasticsearch/:/usr/share/elasticsearch/data'
ulimits:
nofile:
soft: 65535
hard: 65535
Let’s look at each component one by one:
- elasticsearch — This is the service name we define in the docker-compose file. For each container that we want docker-compose to spawn, we define the configuration and set it as a service (refer to the docker-compose documentation for more info).
- image — This is the Docker image used to start the Elasticsearch container. We use the official Elasticsearch image.
- container_name — This is the name given to the started container. Setting it is a good idea, since referring to a container by name is much easier than by id (a randomly generated id is assigned every time a new container starts, which is not easy to remember).
- environment — These are the environment variables we set to start the Elasticsearch service. discovery.type=single-node runs Elasticsearch as a single node instead of a cluster. ES_JAVA_OPTS=-Xms1G -Xmx1G sets the amount of heap memory reserved for the Elasticsearch container, in this case 1 GB. To set it in MB, replace G with m; for example, ES_JAVA_OPTS=-Xms512m -Xmx512m sets it to 512 MB. Heap memory should not be more than 50% of the available system RAM (read here for more info). Set this value according to the RAM normally available.
- ports — These are the ports we want to expose outside the docker container. This allows other services (a piece of code, a microservice, the browser, or anything else) to access Elasticsearch. The mapping 9200:9200 lets us communicate with the Elasticsearch REST API on port 9200.
- networks — Every docker container uses at least one docker network to communicate. We will create our own network called stack and run both Elasticsearch and Kibana on it, so the two containers can talk to each other without any additional configuration.
- volumes (optional) — Setting volumes is optional, since it simply mounts the Elasticsearch data on the host. This persists the data created in Elasticsearch even if the container gets deleted or recreated by mistake (which can happen easily if we are not careful). The Elasticsearch container writes its data to /usr/share/elasticsearch/data, which we mount to the host directory ./data/elasticsearch. Make sure to give ownership 1000:1000 to any folder you mount (read here). This can be done by running chown 1000:1000 ./data/elasticsearch as a sudo user (read here). If the following error appears in the Elasticsearch container, it is because of this folder permission issue: “stacktrace”: [“org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];”,
- ulimits — ulimits are the system resource limits of the current user. We will raise the nofile ulimit (read more about it here and here). This allows more files to be open for read/write at once, which Elasticsearch relies on heavily when writing data. Setting both the soft and hard values to 65535 covers most of the cases we would require for development/research.
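As a rough aid for the heap-size guidance above, this small snippet (an illustrative sketch, not part of the original setup; Linux only) computes half of the system RAM in MB and prints a matching ES_JAVA_OPTS value:

```shell
# Compute half of system RAM in MB (MemTotal in /proc/meminfo is in KB)
half_ram_mb=$(awk '/MemTotal/ {printf "%d", $2 / 2 / 1024}' /proc/meminfo)
# Print an ES_JAVA_OPTS line sized to roughly 50% of RAM, per the guidance above
echo "ES_JAVA_OPTS=-Xms${half_ram_mb}m -Xmx${half_ram_mb}m"
```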
This completes the Elasticsearch part. Next we will configure Kibana —
kibana:
image: docker.elastic.co/kibana/kibana:7.2.0
container_name: kibana
ports: ['5601:5601']
networks: ['stack']
depends_on: ['elasticsearch']
Let’s check each component one by one:
- kibana — Similar to the Elasticsearch configuration, this is the service name we define in the docker-compose file.
- image — This is the official Kibana image used to start the Kibana container (similar to the Elasticsearch configuration).
- container_name — Same as in the Elasticsearch configuration, used to easily refer to the container by name.
- ports — Similar to the Elasticsearch configuration, these are the ports we want to expose outside the docker container. The mapping 5601:5601 lets us access the Kibana dashboard on port 5601.
- networks — Same as the Elasticsearch configuration.
- depends_on — This tells docker which containers to deploy before deploying this one. Since Kibana uses Elasticsearch, we tell docker to start the Elasticsearch container before starting the Kibana container.
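Note that no Elasticsearch address is configured here: the official Kibana image defaults to http://elasticsearch:9200, which happens to match our service name. If you were to rename the Elasticsearch service, you could point Kibana at it explicitly through an environment variable, sketched below with a hypothetical service name es01:

```yaml
kibana:
  image: docker.elastic.co/kibana/kibana:7.2.0
  container_name: kibana
  environment:
    # es01 is a hypothetical service name; adjust to match your own
    - "ELASTICSEARCH_HOSTS=http://es01:9200"
  ports: ['5601:5601']
  networks: ['stack']
  depends_on: ['es01']
```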
Once the Kibana container configuration is completed, the final part is defining the configuration for the docker network that we have referenced in both containers. We will create a new docker network called stack which will be used by all the containers. The network is defined as follows:
networks:
stack:
name: stack
The final docker-compose file will look something like this (some parts might differ for you depending on any changes made to the configuration above). We will name this file elasticsearch.yaml:
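Putting all the snippets above together, the complete elasticsearch.yaml is:

```yaml
version: '3.6'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    container_name: elasticsearch
    environment:
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
    ports: ['9200:9200']
    networks: ['stack']
    volumes:
      - './data/elasticsearch/:/usr/share/elasticsearch/data'
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
  kibana:
    image: docker.elastic.co/kibana/kibana:7.2.0
    container_name: kibana
    ports: ['5601:5601']
    networks: ['stack']
    depends_on: ['elasticsearch']
networks:
  stack:
    name: stack
```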
2. Increasing the open file limit and changing the permission of the Elasticsearch volume to be mounted
During the Elasticsearch configuration, we saw two parameters, volumes and ulimits, that require some extra steps before starting the containers: we need to increase the open file limit and modify the permission of the Elasticsearch volume to be mounted.
Since we need to run all the commands as the root user, we will first switch to it by running the command:
sudo su
To increase the open file limit, we need to add the line fs.file-max=500000 to the file /etc/sysctl.conf and then reload the configuration on the server. We can do both by running the following commands:
echo "fs.file-max=500000" >> /etc/sysctl.conf
sysctl -p
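To confirm the limit was applied, you can read the value back from the kernel; it should match the number written above:

```shell
# Read the current kernel-wide maximum open file count
cat /proc/sys/fs/file-max
```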
Modifying the permission of the Elasticsearch volume is optional, depending on whether the volumes config has been set for the Elasticsearch service in the elasticsearch.yaml file. To modify the permission, we first create the folder using mkdir and then change its ownership using chown (make sure that the folder ./data/elasticsearch does not already exist):
mkdir -p ./data/elasticsearch
chown 1000:1000 ./data/elasticsearch
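As a sanity check, you can print the owner of the mounted folder; after the chown above it should read 1000:1000, the uid/gid of the elasticsearch user inside the container:

```shell
# Ensure the data directory exists, then print its "uid:gid";
# expect 1000:1000 once the chown step has been run as root
mkdir -p ./data/elasticsearch
stat -c '%u:%g' ./data/elasticsearch
```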
3. Starting Elasticsearch and Kibana containers
Now we can start our containers. To do so, we will use the up functionality of docker-compose by running the following command (ignore any warning printed when the command is executed; it appeared for me because I am running docker in swarm mode, which is not the default, so it will not appear for most users):
docker-compose -f elasticsearch.yaml up -d
Once the containers are spawned, we can run docker ps to see our containers:
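We can also verify Elasticsearch itself from the command line (assuming the containers are up and you are on the host machine); the root endpoint responds with a small JSON document describing the node:

```shell
# Query the Elasticsearch REST API root endpoint; prints cluster/node info as JSON
curl http://localhost:9200
# Check cluster health; a fresh single node should report "green" or "yellow"
curl http://localhost:9200/_cluster/health
```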
Once the setup is completed, we can simply access Kibana by typing the server IP and the Kibana port (defined in the ports config of Kibana) in the browser URL bar, and voilà, we get the Kibana dashboard:
Our Elasticsearch and Kibana setup is complete. Now we are free to add our data to Elasticsearch and execute powerful, complex search queries that return data as quick as a snap of the fingers (not literally, but you get the gist). We can also create many beautiful dashboards in Kibana to visualize our data. You can also access Elasticsearch data directly from the Chrome browser using the ElasticSearch Head Chrome extension.
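As a small taste of what comes next, here is a minimal, hypothetical example of adding a document and querying it back through the REST API (the index name cats and the field values are made up for illustration; assumes the stack above is running on localhost):

```shell
# Index a document into a hypothetical "cats" index
curl -X POST 'http://localhost:9200/cats/_doc' \
     -H 'Content-Type: application/json' \
     -d '{"name": "Tom", "color": "grey"}'
# Search for it by name (newly indexed documents become searchable
# after the index refresh, roughly one second by default)
curl 'http://localhost:9200/cats/_search?q=name:Tom'
```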
As always, if you have any doubts, let me know. You can also check out my website genericfornow.com to connect with me. Let me know what else you would love to learn. And as always, “Be generic, be awesome”.