40

I use the official Elasticsearch Docker image and wonder how I can also create a custom index during the build, so that the index already exists when I start the container.

My attempt was to add the following line to my Dockerfile:

RUN curl -XPUT 'http://127.0.0.1:9200/myindex' -d @index.json

I get the following error:

curl: (7) Failed to connect to 127.0.0.1 port 9200: Connection refused

Can I reach Elasticsearch during the build with such an API call, or is there a completely different way to implement this?

crisscross

3 Answers

34

I've had a similar problem.

I wanted to create a docker container with preloaded data (via some scripts and json files in the repo). The data inside elasticsearch was not going to change during the execution and I wanted as few build steps as possible (ideally only docker-compose up -d).

One option would be to do it manually once and store the Elasticsearch data folder (via a Docker volume) in the repository. But then I would have duplicate data, and I would have to manually check in a new version of the data folder every time the data changes.

The solution

  1. Make Elasticsearch write its data to a folder that is not declared as a volume in Elasticsearch's official Dockerfile (anything written to a declared volume during the build is discarded).

RUN mkdir /data && chown -R elasticsearch:elasticsearch /data && echo 'es.path.data: /data' >> config/elasticsearch.yml && echo 'path.data: /data' >> config/elasticsearch.yml

(the folder needs to be created with the right permissions)

  2. Download wait-for-it

ADD https://raw.githubusercontent.com/vishnubob/wait-for-it/e1f115e4ca285c3c24e847c4dd4be955e0ed51c2/wait-for-it.sh /utils/wait-for-it.sh

This script waits until Elasticsearch is up before running our insert commands.

  3. Insert data into Elasticsearch

RUN /docker-entrypoint.sh elasticsearch -p /tmp/epid & /bin/bash /utils/wait-for-it.sh -t 0 localhost:9200 -- path/to/insert/script.sh; kill $(cat /tmp/epid) && wait $(cat /tmp/epid); exit 0;

This command starts Elasticsearch during the build, inserts the data, and shuts it down again, all in a single RUN command. The container is left as it was, except for Elasticsearch's data folder, which is now properly initialized.
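The one-liner above packs four steps into a single RUN: start the server, wait for it, insert, stop. Here is a more readable sketch of the same sequence as POSIX shell functions. The names `wait_until` and `run_seeded` are made up for this illustration and are not part of wait-for-it or the Elasticsearch image. Unlike the original command, which ends with `exit 0`, this version returns the seeding command's exit status, so a failed insert would fail the build.

```shell
#!/bin/sh
# Sketch only: wait_until and run_seeded are hypothetical helper names.

# Retry a command about once per second until it succeeds.
# $1 = maximum number of attempts, remaining args = command to run.
wait_until() {
  attempts_left="$1"
  shift
  until "$@" >/dev/null 2>&1; do
    attempts_left=$((attempts_left - 1))
    if [ "$attempts_left" -le 0 ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Start a server in the background, wait until it is ready, run the
# seeding step, then stop the server and wait for it to exit.
# $1 = readiness check, $2 = seeding command, remaining args = server command.
run_seeded() {
  ready_cmd="$1"
  seed_cmd="$2"
  shift 2
  "$@" &
  server_pid=$!
  status=0
  if wait_until 60 "$ready_cmd"; then
    "$seed_cmd" || status=$?
  else
    status=1
  fi
  # Ask the server to stop, then wait so it can finish writing its data.
  kill "$server_pid" 2>/dev/null || true
  wait "$server_pid" 2>/dev/null || true
  return "$status"
}
```

Assuming the image still ships `/docker-entrypoint.sh` as in this answer, a call could look like `run_seeded es_ready path/to/insert/script.sh /docker-entrypoint.sh elasticsearch`, where `es_ready` is a function you define yourself, for example one that runs `curl -sf http://localhost:9200`.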

Summary

FROM elasticsearch

RUN mkdir /data && chown -R elasticsearch:elasticsearch /data && echo 'es.path.data: /data' >> config/elasticsearch.yml && echo 'path.data: /data' >> config/elasticsearch.yml

ADD https://raw.githubusercontent.com/vishnubob/wait-for-it/e1f115e4ca285c3c24e847c4dd4be955e0ed51c2/wait-for-it.sh /utils/wait-for-it.sh

# Copy the files you may need and your insert script

RUN /docker-entrypoint.sh elasticsearch -p /tmp/epid & /bin/bash /utils/wait-for-it.sh -t 0 localhost:9200 -- path/to/insert/script.sh; kill $(cat /tmp/epid) && wait $(cat /tmp/epid); exit 0;

And that's it! When you run this image, the database will have preloaded data, indexes, etc...

Erpheus
  • 1
    btw in this specific case, step 3 of my solution would be: ```RUN /docker-entrypoint.sh elasticsearch -p /tmp/epid & /utils/wait-for-it.sh -t 0 localhost:9200 -- curl -XPUT 'http://127.0.0.1:9200/myindex' -d @index.json; kill $(cat /tmp/epid) && wait $(cat /tmp/epid); exit 0;``` No need for an extra script, although it is kind of ugly to have the whole command in there. – Erpheus Oct 05 '16 at 11:48
  • 1
    Curious how this would be implemented in current day? Also, is the 'docker-entrypoint.sh' script you referred to in your answer from the es-docker source? – Bryce Sep 07 '17 at 16:02
  • 2
    @Bryce I don't think it has gotten any easier since the original answer. Still that doesn't mean there aren't any other options. You could keep a copy of the data folder after elasticsearch insertion locally, then mount it as a volume on docker run. At least that's the best you can do if you want to run arbitrary ES commands. If all you wanted was the index you can add it as an index template with a simple ADD instruction: `ADD myindex.json /etc/elasticsearch/templates/` (haven't tried it, just guessing the command). Keep in mind this approach doesn't create the index, only define its mappings – Erpheus Sep 08 '17 at 16:15
  • 2
    @Bryce, All I see is elastic has better docker support, but I have not seen any instructions like what postgresql does with their /docker-entrypoint-initdb.d/ folder to populate the db. I would LOVE if that could happen ;-) What fatih tekin answered is along the same thought process +1 – akahunahi Nov 16 '18 at 15:25
  • wait-for-it file from master branch: https://raw.githubusercontent.com/vishnubob/wait-for-it/master/wait-for-it.sh – Damien Golding Jul 25 '23 at 10:57
14

A simple way to do this is a multi-stage Dockerfile like the one below.

Build it with docker build -t elasticsearch-custom:latest .

FROM elasticsearch:5.5.1 AS esbuilder
ADD script.sh path/to/insert/script.sh
RUN apt-get update \
    && apt-get install procps -y \
    && apt-get install httping -y \
    && /docker-entrypoint.sh elasticsearch -d -E path.data=/tmp/data \
    && while ! httping -qc1 http://localhost:9200 ; do sleep 1 ; done \
    && path/to/insert/script.sh \
    && apt-get clean

FROM elasticsearch:5.5.1
COPY --from=esbuilder /tmp/data/ /usr/share/elasticsearch/data/

And then just run docker run -t -d elasticsearch-custom:latest
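The script.sh copied into the builder stage is not shown. A minimal sketch of what it might contain, assuming all it has to do is create the `myindex` index from the question using `index.json`. The `ES_URL` variable and the `create_index` function are names invented for this example. The `Content-Type` header is there because Elasticsearch 6.0 and later reject request bodies without one; it should be harmless on 5.x.

```shell
#!/bin/sh
# Sketch of an insert script; ES_URL and create_index are hypothetical names.
ES_URL="${ES_URL:-http://localhost:9200}"

# Create an index from a JSON settings/mappings file.
# $1 = index name, $2 = path to the JSON body.
create_index() {
  curl -sS -f -XPUT "$ES_URL/$1" \
    -H 'Content-Type: application/json' \
    -d @"$2"
}

# Example call, matching the question's index and file names:
# create_index myindex index.json
```

Because `-f` makes curl exit non-zero on an HTTP error, a rejected request would stop the `&&` chain in the RUN instruction instead of silently producing an image without the index.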

fatih tekin
  • 9
    Took this answer and built https://github.com/gaving/es-bundled-template with it replacing the dependencies with curl. – Gavin Gilmour May 28 '19 at 16:01
0

If I understand your question correctly, you are trying to connect to an Elasticsearch instance before its host, i.e. the Docker container, is running. You need to get your container running first.

You can create a shell script which you can execute using bash option. Something like docker run

Anuj Yadav
  • Thanks for your reply. Are you sure there is no way to add the index during the build? Can I not copy it into a specific folder? – crisscross Feb 20 '16 at 21:20
  • @crisscross As far as I understand, you are trying to build your Docker image and, once the build completes, you want the container to be ready with an ES instance, have it running, and then create a default index. If that is the case, then logically speaking a post-build script looks good to me, since the Docker container definitely needs to be running before we do something on it. Regarding the folder, you can use it in conjunction with a startup shell script using the bash profile. That said, post-run scripts look cleaner to me. – Anuj Yadav Feb 22 '16 at 03:40
  • Well, that still doesn't answer my question of how to add an index during the build or when it is already running. – crisscross Feb 22 '16 at 11:46