According to Wikipedia, “Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.” According to me, Elasticsearch is a super fast, easy-to-use, open-source search engine for any kind of data. Whatever data and complex query structure we throw at Elasticsearch, it will return results fast and correct (very important). The data can be anything: server metrics, application data, images of cats (it’s doable, and there is some relation between cats and the internet), etc., which makes it a good technology to learn for a varied set of people, including data scientists/engineers, DevOps engineers, software engineers, etc. Supplementing Elasticsearch is a software called Kibana (both are developed by the same company), which is used to visualize Elasticsearch data and provides an easy-to-use UI to form Elasticsearch queries, along with many other functionalities.
There are a lot of good tutorials about setting up Elasticsearch in a production environment, but when it comes to setting it up for development or initial research, few are meaningful and up to date while being simple enough for a beginner to understand. That’s why I decided to create this tutorial for easily setting up the Elasticsearch and Kibana combination. The most convenient way to set up Elasticsearch is by using Docker. We will set up Elasticsearch version 7.2.0 and Kibana version 7.2.0.
Create a file called elasticsearch.yaml and add the following content to it —
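The following is a sketch of the complete compose file that the rest of this article builds up step by step; the image tags are the official Elastic 7.2.0 images, and the volume path, ulimit, and network values match the ones explained in the sections below.

```yaml
version: '3.6'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    container_name: elasticsearch
    environment:
      # Run as a single node instead of forming a cluster
      - discovery.type=single-node
      # Reserve 1 GB of heap memory for Elasticsearch
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
    ports:
      - "9200:9200"
    networks:
      - stack
    volumes:
      # Persist index data on the host (folder must be owned by 1000:1000)
      - ./data/elasticsearch:/usr/share/elasticsearch/data
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
  kibana:
    image: docker.elastic.co/kibana/kibana:7.2.0
    container_name: kibana
    ports:
      - "5601:5601"
    networks:
      - stack
    depends_on:
      - elasticsearch
networks:
  stack:
    driver: bridge
```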
Run the following commands (sudo access is required; also, there should not be a folder called ./data/elasticsearch already present in the directory from which the commands will be executed) —
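These are the same commands explained individually in the later sections, collected here for a quick start (run them from the directory containing elasticsearch.yaml):

```shell
# Switch to the root user first
sudo su
# Increase the open file limit and reload the sysctl config
echo "fs.file-max=500000" >> /etc/sysctl.conf
sysctl -p
# Create the Elasticsearch data folder and hand it to UID/GID 1000
mkdir -p ./data/elasticsearch
chown 1000:1000 ./data/elasticsearch
# Start the Elasticsearch and Kibana containers in detached mode
docker-compose -f elasticsearch.yaml up -d
```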
The Elasticsearch and Kibana containers are now up. The Elasticsearch REST API can be accessed on port 9200 and the Kibana Dashboard can be accessed on port 5601.
Setup at a glance
Setting up Elasticsearch and Kibana involves the following steps.
- Setup Elasticsearch and Kibana startup file
- Increasing the open file limit and changing the permission of the Elasticsearch volume to be mounted
- Starting Elasticsearch and Kibana containers
We will look into these steps one by one.
1. Setting up Elasticsearch and Kibana startup file
First we need to set up the container startup files for Elasticsearch and Kibana. For this, we will use docker-compose and create a yaml file with the configuration. The advantage of docker-compose is the ease of writing and maintaining the configuration for the services we want to use.
We will walk through the service parts for both Elasticsearch and Kibana, along with the other configuration, and then check the final file which will be used to start the services. We will create a file called elasticsearch.yaml, using version 3.6 of the docker-compose file format. Also, since Elasticsearch and Kibana will be defined as services, we will mention that tag as well. Add the following content to the file —
version: '3.6'
services:
First we will check our configuration for Elasticsearch. We will start a single Elasticsearch container (called a node). Elasticsearch can be used in either a single-node or a cluster configuration. The cluster configuration is generally used when we want to use Elasticsearch in a high-throughput environment. Since we are just looking for basic use, we will start it as a single node. Following is the Elasticsearch configuration (spacing is important) —
- "ES_JAVA_OPTS=-Xms1G -Xmx1G"
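The line above is the heap setting inside the environment section; pieced together with the other components discussed below, the complete elasticsearch service block might look like this sketch (the image tag is assumed from the 7.2.0 version stated earlier):

```yaml
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    container_name: elasticsearch
    environment:
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
    ports:
      - "9200:9200"
    networks:
      - stack
    volumes:
      - ./data/elasticsearch:/usr/share/elasticsearch/data
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
```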
Let’s see each component one by one —
elasticsearch — This is the service name we define in the docker-compose file. For each container that we need to spawn using docker-compose, we need to define the configuration and set it as a service (refer to the docker-compose documentation for more info).
image— This is the docker image which will be consumed to start the Elasticsearch container. We will use the official Elasticsearch image.
container_name — This is the name of the container which will be started. It is a good idea to set it, since it is much easier to refer to the container by name instead of by id (a randomly generated id assigned every time a new container is started, which is not easy to remember).
environment — These are the environment variables that we set to start the Elasticsearch service.
discovery.type=single-node allows us to run Elasticsearch as a single node instead of a cluster.
ES_JAVA_OPTS=-Xms1G -Xmx1G allows us to set the amount of heap memory that we want to reserve for the Elasticsearch container. In this case we are setting it to 1 GB. To set it in MB, replace the value, e.g. ES_JAVA_OPTS=-Xms512m -Xmx512m will set it to 512 MB. Heap memory should not be more than 50% of the available system RAM (read here for more info). Set this value according to the RAM normally available.
ports — These are the ports that we want to expose so they can be accessed from outside the docker container. This allows us to access Elasticsearch from other services (be it a piece of code, a microservice, the browser or any other place). The port mapping 9200:9200 allows us to communicate with the Elasticsearch REST API on port 9200.
networks — Every docker container uses at least one docker network to communicate on. We will create our own network called stack and run both Elasticsearch and Kibana on it, so the two containers can communicate easily without any additional configuration.
volumes (optional) — Setting volumes is optional, since it is basically just mounting the Elasticsearch data on the server. This allows us to persist the data created in Elasticsearch (in case the container gets deleted or recreated by mistake, which can happen easily if we are not careful). The Elasticsearch container creates data in /usr/share/elasticsearch/data, which we are mounting to the directory ./data/elasticsearch. Make sure to give the permission 1000:1000 to any folder that you are mounting (read here). This can be done by simply running the following command as the sudo user (read here): chown 1000:1000 ./data/elasticsearch. If the following error appears in the Elasticsearch container, it is because of the folder permission issue mentioned above —
“stacktrace”: [“org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];”,
ulimits — These are the system resource limits of the current user. We will set the nofile ulimit (read more about it here and here; this basically allows more files to be opened for read/write, which Elasticsearch uses extensively when writing data). Generally, setting it to 65535 covers most of the cases we would require for our development/research.
This completes the Elasticsearch part. Next we will configure Kibana —
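A kibana service block consistent with the components discussed next might look like the following sketch (the image tag follows the 7.2.0 version used for Elasticsearch; Kibana 7.x looks for Elasticsearch at http://elasticsearch:9200 by default, which matches our service name):

```yaml
  kibana:
    image: docker.elastic.co/kibana/kibana:7.2.0
    container_name: kibana
    ports:
      - "5601:5601"
    networks:
      - stack
    depends_on:
      - elasticsearch
```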
Let’s check each component one by one —
- kibana — Similar to the Elasticsearch configuration, this is the service name we define in the docker-compose file.
image— This is the official Kibana image which will be consumed to start the Kibana container(Similar to Elasticsearch configuration).
container_name — Similar to the Elasticsearch configuration, used to easily refer to the container by name.
ports — Similar to the Elasticsearch configuration, these are the ports that we want to expose so they can be accessed from outside the docker container. The port mapping 5601:5601 allows us to access the Kibana Dashboard on port 5601.
networks— Same as Elasticsearch configuration
depends_on— This tells docker which containers to deploy before deploying this container. Since Kibana uses Elasticsearch, we will tell docker to start the Elasticsearch container before starting the Kibana container
Once the Kibana container configuration is completed, the final part is defining the configuration for the docker network that we have set up for both the containers. We will create a new docker network called stack, which will be used by all the containers. The network will be defined as follows —
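A minimal definition, assuming the default bridge driver is sufficient for local use —

```yaml
networks:
  stack:
    driver: bridge
```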
The final docker-compose file will look something like this (some parts might be different for you depending on any changes made to the configuration given above). We will name this file elasticsearch.yaml —
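Here is a sketch of the assembled file; if you changed the heap size, ports, or volume path above, adjust accordingly —

```yaml
version: '3.6'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    container_name: elasticsearch
    environment:
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
    ports:
      - "9200:9200"
    networks:
      - stack
    volumes:
      - ./data/elasticsearch:/usr/share/elasticsearch/data
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
  kibana:
    image: docker.elastic.co/kibana/kibana:7.2.0
    container_name: kibana
    ports:
      - "5601:5601"
    networks:
      - stack
    depends_on:
      - elasticsearch
networks:
  stack:
    driver: bridge
```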
2. Increasing the open file limit and changing the permission of the Elasticsearch volume to be mounted
During the Elasticsearch configuration, we saw two parameters, volumes and ulimits, and noted that there are some extra steps to be performed before starting the containers, i.e., we need to increase the open file limit as well as modify the permission of the Elasticsearch volume to be mounted.
Since we need to run all the commands as the root user, we will first switch to the root user (e.g., by running sudo su) —
To increase the open file limit, we need to add the line fs.file-max=500000 to the file /etc/sysctl.conf. Once that is done, we need to reload the config on the server. We can do so by running the following commands —
echo "fs.file-max=500000" >> /etc/sysctl.conf
sysctl -p
Modifying the permission of the Elasticsearch volume is optional, depending on whether the volumes config has been set for the Elasticsearch service in the elasticsearch.yaml file. To modify the permission, we need to first create the folder using mkdir and then change its ownership using chown (make sure that the folder ./data/elasticsearch does not already exist) —
mkdir -p ./data/elasticsearch
chown 1000:1000 ./data/elasticsearch
3. Starting Elasticsearch and Kibana containers
Now we can start our containers. To do so, we will use the up functionality of docker-compose by running the following command (ignore the warning when the command is executed; it appeared because I am running docker in swarm mode, and it will not appear if docker is not running in swarm mode, which is the case by default) —
docker-compose -f elasticsearch.yaml up -d
Once the containers are spawned, we can run docker ps to see our containers —
Once the setup is completed, we can access Kibana by simply typing the IP and the Kibana port (defined in the ports config of Kibana) in the browser URL bar (e.g., http://localhost:5601), and voila, we get the Kibana dashboard —
Our Elasticsearch and Kibana setup is complete. Now we are free to add our data to Elasticsearch, execute powerful and complex search queries, and get the data back as quick as a snap of the fingers (not literally, but you get the gist). We can also create many beautiful dashboards in Kibana to visualize our data. You can also directly access Elasticsearch data from the Chrome browser using the ElasticSearch Head Chrome extension.
As always, if you have any doubts, let me know. You can also check out my website genericfornow.com to connect with me. Let me know what else you would love to learn. And as always, “Be generic, be awesome”.