27

I'm running a dockerized mongo container.

I'd like to create a mongo image with some initialized data.

Any ideas?

Jordi
  • I've absolutely no idea. As far as I've been able to figure out, I could try with a dockerized shell `sudo docker exec -it mongo mongo`. However, each time I need to load all the data again. – Jordi Sep 06 '16 at 12:03
  • You need to start a previously created container to get the same data as before. Otherwise docker creates a new container with a fresh database, see [this answer](http://stackoverflow.com/questions/39289394/mysql-databases-are-gone-when-the-docker-container-is-shutdown/39289787#39289787) – n2o Sep 06 '16 at 12:05
  • 1
    There is a good discussion of this, with a suggestion of running mongoimport in a disposable container to seed your database, at http://stackoverflow.com/a/33397913/174843 – Vince Bowdren Sep 06 '16 at 13:42
  • @VinceBowdren I wouldn't recommend that technique. It copies seed data into the image (`COPY init.json /init.json`). No confidential data should go into an image (including passwords, certs, database data). – Bernard Sep 06 '16 at 13:49

5 Answers

43

A more self-contained approach:

  • create javascript files that initialize your database
  • create a derived MongoDB docker image that contains these files

There are many answers that use disposable containers or create volumes and link them, but this seems overly complicated. If you take a look at the mongo docker image's docker-entrypoint.sh, you see that line 206 executes /docker-entrypoint-initdb.d/*.js files on initialization using a syntax: mongo <db> <js-file>. If you create a derived MongoDB docker image that contains your seed data, you can:

  • have a single docker run command that stands up a mongo with seed data
  • have data that persists through container stops and starts
  • reset that data with docker stop, rm, and run commands
  • easily deploy with runtime schedulers like k8s, mesos, swarm, rancher

This approach is especially well suited to:

  • POCs that just need some realistic data for display
  • CI/CD pipelines that need consistent data for black box testing
  • example deployments for product demos (sales engineers, product owners)

How to:

  1. Create and test your initialization scripts (grooming data as appropriate)
  2. Create a Dockerfile for your derived image that copies your init scripts

    FROM mongo:3.4
    COPY seed-data.js /docker-entrypoint-initdb.d/
    
  3. Build your docker image

    docker build -t mongo-sample-data:3.4 .
    
  4. Optionally, push your image to a docker registry for others to use

  5. Run your docker image

    docker run                               \
        --name mongo-sample-data             \
        -p 27017:27017                       \
        --restart=always                     \
        -e MONGO_INITDB_DATABASE=application \
        -d mongo-sample-data:3.4
    

By default, docker-entrypoint.sh will apply your scripts to the test db; the above run command env var MONGO_INITDB_DATABASE=application will apply these scripts to the application db instead. Alternatively, you could create and switch to different dbs in the js file.
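For steps 1 and 2, a minimal sketch of what the build context could contain. The directory name `mongo-seed-context`, the collection names `col1` and `col2`, and the documents are hypothetical; the script assumes the entrypoint runs the file with the mongo shell so that `db` is already bound to the target database.

```sh
#!/bin/sh
# Sketch: scaffold a build context for a seeded mongo image.
set -eu
ctx="mongo-seed-context"   # hypothetical directory name
mkdir -p "$ctx"

# Hypothetical seed script. It relies on `db` pointing at the database
# chosen via MONGO_INITDB_DATABASE (or `test` when that is unset).
cat > "$ctx/seed-data.js" <<'EOF'
db.col1.insertMany([
  { _id: 1, name: "first" },
  { _id: 2, name: "second" }
]);
db.col2.insertOne({ _id: 1, note: "seeded on first start" });
EOF

# Same Dockerfile as in step 2.
cat > "$ctx/Dockerfile" <<'EOF'
FROM mongo:3.4
COPY seed-data.js /docker-entrypoint-initdb.d/
EOF

echo "build with: docker build -t mongo-sample-data:3.4 $ctx"
```

The init scripts only run when the data directory is empty, so a container that already has data will not be re-seeded until it is removed and recreated.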

I have a github repo that does just this - here are the relevant files.

Steve Tarver
  • 3
    Not only should this be the answer, but it makes it easy to plug-and-play with docker-compose. – Zack Aug 19 '18 at 16:09
11

With the latest release of the mongo Docker image, something like this works for me.

FROM mongo
COPY dump /home/dump
COPY mongo_restore.sh /docker-entrypoint-initdb.d/

The mongo restore script looks like this.

#!/bin/bash
# Restore from dump
mongorestore --drop --gzip --db "<RESTORE_DB_NAME>" /home/dump

Then build the image as usual.

docker build -t <TAG> .
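One way the `dump` directory might be produced, as a sketch. The function name `make_dump` and the default host `localhost:27017` are assumptions. Since `mongodump --out` writes one subdirectory per database, the sketch moves that subdirectory to `./dump` so that `mongorestore --db ... /home/dump` finds the gzipped BSON files directly in it.

```sh
#!/bin/sh
# make_dump DB [HOST]: write a gzipped dump of one database into ./dump
make_dump() {
  db="$1"
  host="${2:-localhost:27017}"   # assumed address of the source server
  # Refuse to overwrite an existing dump directory.
  [ ! -e ./dump ] || { echo "./dump already exists" >&2; return 1; }
  tmp="$(mktemp -d)"
  mongodump --gzip --host "$host" --db "$db" --out "$tmp" || return 1
  # mongodump writes to <out>/<db>/; make that directory the build input.
  mv "$tmp/$db" ./dump
  rmdir "$tmp"
}
```

Usage would be something like `make_dump mydb` in the directory that holds the Dockerfile, followed by the `docker build` above.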
user10434757
9

First create a docker volume

docker volume create --name mongostore

then create your mongo container

docker run -d --name mongo -v mongostore:/data/db mongo:latest

The -v switch here is responsible for mounting the volume mongostore at the /data/db location, which is where mongo saves its data. The volume is persistent (on the host). Even with no containers running you will see your mongostore volume listed by

docker volume ls

You can kill the container and create a new one (same line as above) and the new mongo container will pick up the state of the previous container.

Initializing the volume

Mongo initializes a new database if none is present. This is responsible for creating the initial data in the mongostore. Let's say that you want to create a brand new environment using a pre-seeded database. The problem becomes how to transfer data from your local environment (for instance) to the volume before creating the mongo container. I'll list two cases.

  1. Local environment

    You're using either Docker for Mac/Windows or Docker Toolbox. In this case you can easily mount a local drive to a temporary container to initialize the volume. Eg:

    docker run --rm -v /Users/myname/work/mongodb:/incoming \
      -v mongostore:/data alpine:3.4 sh -c 'cp -rp /incoming/* /data'
    

    This doesn't work for cloud storage. In that case you need to copy the files.

  2. Remote environment (AWS, GCP, Azure, ...)

    It's a good idea to tar/compress things up to speed the upload.

    tar czf mongodata.tar.gz -C /Users/myname/work mongodb
    

    Then create a temporary container to untar and copy the files to the mongostore. The `tail -f /dev/null` just makes sure that the container doesn't exit.

    docker run -d --name temp -v mongostore:/data alpine:3.4 tail -f /dev/null
    

    Copy files to it

    docker cp mongodata.tar.gz temp:.
    

    Untar and move to the volume

    docker exec temp sh -c 'tar xzf /mongodata.tar.gz -C / && cp -rp /mongodb/* /data'
    

    Cleanup

    docker rm -f temp
    

You could also copy the files to the remote host and mount them from there, but I tend to avoid interacting with the remote host at all.

Disclaimer. I'm writing this from memory (no testing).
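The remote-environment steps can be collected into one function. This is a sketch only, equally untested against a real remote daemon. The function name `seed_volume` and the helper container name are hypothetical, and it assumes the archive has a top-level `mongodb` directory. The copy runs inside `sh -c` so that the glob and the `&&` are evaluated in the container rather than on the host.

```sh
#!/bin/sh
# seed_volume ARCHIVE [VOLUME]: unpack a .tar.gz of mongo data files
# into a named volume through a temporary helper container.
seed_volume() {
  archive="$1"
  volume="${2:-mongostore}"
  helper="seed-helper-$$"   # hypothetical temporary container name
  [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }

  docker run -d --name "$helper" -v "$volume":/data alpine:3.4 tail -f /dev/null &&
  docker cp "$archive" "$helper":/seed.tar.gz &&
  docker exec "$helper" sh -c 'tar xzf /seed.tar.gz -C /tmp && cp -rp /tmp/mongodb/* /data'
  status=$?

  # The helper is still running tail, so it has to be force-removed.
  docker rm -f "$helper" >/dev/null 2>&1
  return "$status"
}
```

Calling `seed_volume mongodata.tar.gz` before the first `docker run ... -v mongostore:/data/db mongo` should leave the volume populated for the mongo container to pick up.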

Bernard
  • I had already thought about that. It's a good approach; however, I'm creating a demo environment, and this demo environment has to be easy to set up on whichever computer we like. The solution would be to create a data volume as an image and pull it. Is that possible? – Jordi Sep 06 '16 at 12:52
  • I'll add more info on how to initialize the volume data, which is what you're asking. – Bernard Sep 06 '16 at 12:56
  • Perfect! What's the content of incoming or `mongodata.tar.gz`? Suppose I want to create a database `dbOne` and two collections `col1` and `col2` and insert some data into them. How would I do that? – Jordi Sep 06 '16 at 13:34
  • You can start from an empty volume as I described first, let mongo create the initial data and then create your seed data in it (the collections). Then you can use the technique I showed next, but in reverse, to copy the data from the volume to your local host (or S3 or wherever). eg: `docker run -d --name temp -v mongostore:/data alpine:3.4 sh -c 'tar czf /data.tar.gz /data && tail -f /dev/null'` followed by `docker cp temp:/data.tar.gz .` – Bernard Sep 06 '16 at 13:42
  • It's what's in the mongostore volume. The volume is mounted at /data on the container, and that's what's tar'd/gzip'd. Remember that it's the same volume that's mounted on the mongo container. It has the mongo database. – Bernard Sep 06 '16 at 13:52
  • Once `mongostore` is created, how can I insert data to it, or create a database without a mongo client? – Jordi Sep 06 '16 at 13:57
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/122755/discussion-between-alkaline-and-jordi). – Bernard Sep 06 '16 at 14:08
  • 1
    No need to do a `docker volume create` unless you need different options, it will make one for you on the `docker run`. And named volumes initialize with the contents of the mount point in the image when you `docker run` with an empty/new volume, making the initialization easy (this isn't the case with host volumes). – BMitch Sep 06 '16 at 18:12
  • @BMitch That's correct. No need for the extra volume create instruction. It just makes the steps clearer in my opinion (shows that the volume is really independent of the container). I'm also using different volume drivers so I'm just used to creating independently. Cheers. – Bernard Sep 06 '16 at 22:07
3

Here is how it's done with docker-compose. I use an older image of mongo but the docker-entrypoint.sh accepts *.js and *.sh files for all versions of the image.

docker-compose.yaml

version: '3'

services:
  mongo:
    container_name: mongo
    image: mongo:3.2.12
    ports:
      - "27017:27017"
    volumes:
      - mongo-data:/data/db:cached
      - ./deploy/local/mongo_fixtures:/fixtures
      - ./deploy/local/mongo_import.sh:/docker-entrypoint-initdb.d/mongo_import.sh

volumes:
  mongo-data:
    driver: local

mongo_import.sh:

#!/bin/bash
# Import from fixtures

mongoimport --db wcm-local --collection clients --file /fixtures/properties.json && \
mongoimport --db wcm-local --collection configs --file /fixtures/configs.json

My mongo_fixtures JSON files are the output of mongoexport, which has the following format (one document per line):

{"_id":"some_id","field":"value"}
{"_id":"another_id","field":"value"}

This should help those who want to do this without a custom Dockerfile, using the stock image directly with the init script mounted from your docker-compose file. Cheers!
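Because `mongoimport` is invoked here without `--jsonArray`, each fixture file needs one document per line as shown above. A small sketch of a pre-flight check; the function name `check_fixture` is hypothetical and the test is only a rough shape check, not JSON validation.

```sh
#!/bin/sh
# check_fixture FILE: succeed only if every non-empty line looks like a
# single JSON object, i.e. starts with "{" and ends with "}".
check_fixture() {
  [ -f "$1" ] || return 1
  # Drop blank lines, then look for any line that does not match the shape.
  ! grep -v -e '^[[:space:]]*$' "$1" | grep -q -v -e '^{.*}[[:space:]]*$'
}
```

A file exported with `mongoexport --jsonArray` would fail this check; in that case the import line would need `--jsonArray` as well.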

radtek
  • 1
    Thanks for this. I had to mount it like this: `- ./deploy/local/mongo_fixtures:/fixtures`, otherwise the files were not shared correctly – fredde Jun 16 '20 at 13:17
0

I've found a way that is somewhat easier for me.

Say you have a database in a docker container on your server, and you want to back it up, here’s what you could do.

What might differ from your setup to mine is the name of your mongo docker container [mongodb] (default when using elastic_spence). So make sure you start your container first with --name mongodb to match the following steps:

$ docker run \
 --rm \
 --link mongodb:mongo \
 -v /root:/backup \
 mongo \
 bash -c 'mongodump --out /backup --host $MONGO_PORT_27017_TCP_ADDR'

And to restore the database from a dump.

$ docker run \
 --rm \
 --link mongodb:mongo \
 -v /root:/backup \
 mongo \
 bash -c 'mongorestore /backup --host $MONGO_PORT_27017_TCP_ADDR'

If you need to download the dump from your server, you can use scp:

$ scp -r root@IP:/root/backup ./backup

Or upload it:

$ scp -r ./backup root@IP:/root/backup

P.S: Original source by Tim Brandin available at https://blog.studiointeract.com/mongodump-and-mongorestore-for-mongodb-in-a-docker-container-8ad0eb747c62

Thank you!

Musikolo