
I'm just getting to grips with Docker and docker-compose, trying to create a development environment for Elasticsearch which I will deploy later.

I've been using docker-elk as a reference, and I've managed to create a working Elasticsearch container, seed it, and use it in my project.

As I understand it, Docker containers don't persist data, unless you use the Volumes API and create a volume outside the container that the container then accesses (read that here).

However, docker-elk only uses volumes to share a config YAML file, yet somehow my Elasticsearch indices persist when I bring the container down and up again.

From the docker-elk readme:

The data stored in Elasticsearch will be persisted after container reboot but not after container removal.

Can someone please explain what part of the below configuration is allowing the docker container to persist the index?

docker-compose.yml

version: '2'

services:
  elasticsearch:
    build:
      context: build/elasticsearch/
    volumes:
      - ./build/elasticsearch/config.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro
    ports:
      - "9200:9200"
      - "9300:9300"
    environment:
      ES_JAVA_OPTS: "-Xmx256m -Xms256m"
    networks:
      - elk

networks:
  elk:
    driver: bridge

build/elasticsearch/Dockerfile

FROM docker.elastic.co/elasticsearch/elasticsearch-oss:6.0.0

build/elasticsearch/config.yml

cluster.name: "docker-cluster"
network.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 1
discovery.type: single-node
daviestar

3 Answers


As you may know, a container is a sandbox. It has its own filesystem, structured much like that of a typical Linux OS. The container only sees the files and folders that are in this filesystem.

The process running inside the container writes its data and config to files in this filesystem. This process is unaware that it is running in a container rather than on a VM or a physical host. Thus the data is persisted in files and folders in this filesystem.

Now when you remove a container using docker rm ... those files are deleted with the container, and you lose the data unless you use volumes, which store this data on the host outside the container's filesystem.

On the other hand, stopping and starting the container does not remove the container files and thus the data is still there when you restart the container.
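The difference between restarting and recreating a container can be sketched with plain docker commands. This is only an illustration under stated assumptions: the container name `es-dev` is hypothetical, the image is assumed to be tagged `elasticsearch`, and the `DOCKER` variable is an invented convenience so the functions can be dry-run with `DOCKER=echo`.

```shell
#!/bin/sh
# Assumptions: DOCKER and NAME are illustrative variables, not Docker settings.
DOCKER="${DOCKER:-docker}"
NAME="${NAME:-es-dev}"   # hypothetical container name

# Restart: the same container (and its writable filesystem) is reused,
# so files written by Elasticsearch are still there afterwards.
restart_container() {
    "$DOCKER" stop "$NAME" &&
    "$DOCKER" start "$NAME"
}

# Recreate: "docker rm" deletes the container's filesystem, and
# "docker run" then creates a brand-new container from the image.
recreate_container() {
    "$DOCKER" stop "$NAME" &&
    "$DOCKER" rm "$NAME" &&
    "$DOCKER" run -d --name "$NAME" -p 9200:9200 -p 9300:9300 elasticsearch
}
```

Calling `restart_container` should leave the indices in place, while `recreate_container` starts from an empty data directory unless a volume is mounted.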

yamenk
  • Thanks for the answer, what's still confusing me is when I use the Dockerfile directly, and using `docker build -t elasticsearch .` and `docker run -p 9200:9200 -p 9300:9300 elasticsearch`, the indices were not persisting when rebooting the container, so what has changed using docker-compose? – daviestar Dec 06 '17 at 13:57
  • @daviestar What commands are u using to reboot the container? – yamenk Dec 06 '17 at 13:59
  • just the 2nd command - `docker run -p 9200:9200 -p 9300:9300 elasticsearch` – daviestar Dec 06 '17 at 13:59
  • This is not a container reboot. You are creating a completely new container with its own filesystem every time you run this command. A reboot is `docker stop ` followed by `docker start ` – yamenk Dec 06 '17 at 14:01
  • @daviestar When you run `docker-compose up` it doesn't create a new container if it finds one already existing from a previous `docker-compose up` command on the same compose file. This is what is keeping the indices "persisted". – yamenk Dec 06 '17 at 14:09
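The same distinction applies to docker-compose, and could be sketched as follows. This is a hedged illustration, not taken from docker-elk: the `COMPOSE` variable is an invented convenience for dry runs, and the functions are assumed to be run from the directory containing the compose file.

```shell
#!/bin/sh
# Assumption: COMPOSE is an illustrative variable, overridable for dry runs.
COMPOSE="${COMPOSE:-docker-compose}"

# "stop" keeps the containers; "up" then reuses the existing ones,
# so data written inside the container is still present.
compose_restart() {
    "$COMPOSE" stop &&
    "$COMPOSE" up -d
}

# "down" removes the containers, so "up" has to create new ones
# and anything stored only in the old container filesystem is gone.
compose_recreate() {
    "$COMPOSE" down &&
    "$COMPOSE" up -d
}
```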

To supplement the accepted answer, for anyone who is looking for how to persist the data across container removal: mount a volume at the Elasticsearch data directory, using the volumes mechanism mentioned in the question.

version: '3'

services:
  
  elasticsearch: # Elasticsearch Instance
    container_name: es-search
    image: docker.elastic.co/elasticsearch/elasticsearch:6.1.1
    volumes: # Persist ES data in separate "esdata" volume
      - esdata:/usr/share/elasticsearch/data
    environment:
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - discovery.type=single-node
    ports: # Expose Elasticsearch ports
      - "9300:9300"
      - "9200:9200"

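volumes: # Define separate volume for Elasticsearch data
  # Named volume managed by Docker. To keep the data at a host path instead,
  # replace "esdata" in the service above with that path, e.g. ./my/esdata
  esdata: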
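To check where a named volume such as `esdata` ends up, something like the sketch below may help. It assumes Compose's usual behaviour of prefixing volume names with the project name (by default derived from the directory name, possibly normalised); `PROJECT`, `DOCKER`, and both function names are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Assumptions: DOCKER and PROJECT are illustrative variables for this sketch.
DOCKER="${DOCKER:-docker}"
PROJECT="${PROJECT:-myproject}"   # hypothetical Compose project name

# Build the full volume name Compose is expected to use: <project>_<volume>.
volume_name() {
    printf '%s_%s\n' "$PROJECT" "$1"
}

# Show the volume's details (including its mountpoint on the host).
show_es_volume() {
    "$DOCKER" volume inspect "$(volume_name esdata)"
}
```

A named volume survives `docker-compose down`; it is removed only when asked for explicitly, for example with `docker-compose down -v`.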

I found a guide for elastic docker here: https://blog.patricktriest.com/text-search-docker-elasticsearch/

cYee

You can list the indices, along with their UUIDs, using the command below. Running it before and after a restart shows whether the indices persisted.

curl 'localhost:9200/_cat/indices?v'
Gama11