Crash Course
Conceptually you can think of a Docker container as a newly created VM containing the bare essentials of an OS. The Docker image is like a VM template, and containers are the live instances of that image. We specify how to build an image using a Dockerfile, much like a Vagrantfile. It describes the libraries, programs, and configuration needed to run whatever application we would like to run in a container.
Consider this simplified example from nginx:
# Choose base image (in this case ubuntu OS)
FROM dockerfile/ubuntu
# Install nginx
RUN apt-get update && apt-get install -y nginx
# Define the default command to run when the container starts,
# i.e. the nginx webserver (in the foreground, so the container stays alive)
CMD ["nginx", "-g", "daemon off;"]
# Expose ports, documenting that our webserver can be made accessible
# outside the container (actually published with -p at run time).
EXPOSE 80
EXPOSE 443
The Dockerfile is really simple: a quick installation and some minor configuration. The real nginx Dockerfile has a few more optimizations and configuration steps, like setting permissions, environment variables, etc.
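To make the image/container distinction concrete, here is how you would build and run that Dockerfile (the image and container names are just examples):

```shell
# Build an image from the Dockerfile in the current directory
docker build -t my-nginx .

# Start a container from that image, publishing port 80 on the host
docker run -d --name web -p 80:80 my-nginx
```

The image is built once; `docker run` can then stamp out as many containers from it as you like.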
Why Are Images Useful?
The usefulness of images/containers is that they can be shared around and deployed on any machine with a running Docker daemon. This is really useful for development workflow: instead of trying to replicate production, staging, and dev environments to reproduce bugs, we can save the container as an image and pass it around.
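As a sketch of "save the container as an image and pass it around" (the container and image names here are illustrative):

```shell
# Snapshot a running container's filesystem as a new image
docker commit web my-nginx:debug

# Export the image to a tarball, copy it anywhere...
docker save my-nginx:debug > my-nginx-debug.tar

# ...and load it on the other machine
docker load < my-nginx-debug.tar
```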
JVM stuff
Docker images are like building blocks: layers that are the same are shared between images, and only the bits that are new are added on (which means less disk space usage for us!). If you have multiple applications that require a JVM you would use a Java base image. It does mean multiple instances of the JVM are running, but that is a tradeoff/design decision you make when choosing Docker.
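A minimal sketch of what those JVM applications' Dockerfiles might look like (the base image tag and jar name are assumptions):

```dockerfile
# Every app built FROM the same Java base image shares its layers on disk
FROM openjdk:8-jre
COPY app.jar /app.jar
CMD ["java", "-jar", "/app.jar"]
```

Two apps built this way each run their own JVM process, but the hundreds of MB of base layers exist only once on the host.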
Data Containers
These are a bit confusing. They basically allow your data to become portable, just like your application containers. They aren't necessary; they're simply another design decision. You can still export DB data to CSV and use all the usual methods of moving it around from within your application container. I personally don't use data containers in my workflow, as I'm dealing with TBs of data and data portability is not a huge concern. I use volumes instead: you can tell Docker to use a host filesystem directory to store its data. This way the data is stored persistently on the host irrespective of the lifetime of the Docker container or image.
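A host-directory volume looks like this (the paths and the postgres image are just an example):

```shell
# Mount the host directory /data/pg into the container at the
# database's data directory; the files outlive the container.
docker run -d --name db -v /data/pg:/var/lib/postgresql/data postgres
```

Delete and recreate the container all you like; the data stays on the host in /data/pg.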
Build
We'll discuss this first, then the developer workflow will make more sense.
There are really two main ways of going about this:
If continuous integration is your goal, I find volumes are the way to go. Your Docker containers would use volumes to mount their application source code from the host filesystem. This way all you'd have to do is pull the source code, restart the container (to ensure the changes to the source code are picked up), then run your tests. The build process would really be no different from one without Docker. I prefer this approach because it's fast, and because the application's dependencies, environment, etc. often don't change, so rebuilding the image is overkill. Mounting source code also means you can make changes in place if times are desperate.
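A minimal sketch of that CI loop (the repo path, container name, and test script are all assumptions):

```shell
# Pull the latest source onto the host; the container mounts this directory
cd /srv/myapp && git pull

# Restart so the running processes pick up the new code
docker restart myapp

# Run the test suite inside the already-running container
docker exec myapp ./run-tests.sh
```

No image rebuild anywhere in the loop, which is where the speed comes from.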
The slower alternative, like the one you described, is to 'bake' the source code into the image at build time. You would pull new source code, build the image, (optionally) push it to a private Docker registry, deploy the container, and then run your tests. This has the advantage of being totally portable, but the turnaround time of rebuilding and distributing the image for every small code change can be painstaking.
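The bake-it-in loop, sketched with placeholder registry and image names:

```shell
# Build host: pull, build, push
git pull
docker build -t registry.example.com/myapp:latest .
docker push registry.example.com/myapp:latest   # optional: private registry

# Deploy host: pull the new image and run it
docker pull registry.example.com/myapp:latest
docker run -d --name myapp registry.example.com/myapp:latest
docker exec myapp ./run-tests.sh
```

Every step between `git pull` and the tests is extra turnaround compared to the volume approach, but the resulting image is fully self-contained.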
Workflow
Docker's purpose is to specify the environment applications run in. From this perspective developers should continue to work on application code as normal. If a developer would like to test code in a container, they'd build an image locally and deploy a container from it. If they wanted to test against a production or staging image, you could distribute that to them.
Lastly, the simplest pro tip for working with containers :)
To log in to a container and explore what's going on, you can run
docker exec -it container-name bash
Disclaimer
I'm aware of some oversimplifications in my explanations. My goal was to add as little confusion and as few new terms as possible. I find that extra detail only complicates things, taking away from the core ideas, use cases, etc., which the OP seemed most concerned with.