Docker Series - Docker part II

I’ll be more explicit about some commands that you may want to use in your Dockerfile and explain, through them, how Docker works. For an introduction to Docker, see Part I.

A docker container will keep running in the background as long as the initial command executed within the container is still running.

CMD and ENTRYPOINT

You can define the command that will be run inside the container.
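For example, a minimal Dockerfile might look like this (the base image is just an illustration; any image that provides java will do):

```dockerfile
FROM openjdk:8-jre          # any base image that provides java on the PATH
ENTRYPOINT ["java"]         # the fixed part of the command
CMD ["-version"]            # default arguments, overridable at docker run time
```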

So this setup defines that when we invoke docker run on the image with no extra arguments, the container simply runs java -version and exits.

The ENTRYPOINT part is not overridden by the arguments passed to docker run (it can only be replaced with the --entrypoint flag), while the CMD part supplies default arguments that can be overridden at runtime.

So to be useful, this container needs some extra parameters to output something other than the Java version. For example (the image and jar names below are hypothetical):
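```bash
# the arguments after the image name replace CMD, so the container runs: java -jar /opt/app/app.jar
docker run my-java-image -jar /opt/app/app.jar
```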

When building the image from the Dockerfile, the commands inside the Dockerfile are not run in a single (the same) container. After each command, Docker stops the container and writes the delta from the previous one as a new layer. It then starts a container from this newly saved layer in order to execute the next command in the file. This also explains why you’ll see no cd command by itself in a Dockerfile: its effect would not be saved, so cd commands are inlined with the actual commands.

This raises the question: how can you run multiple processes inside a container (let’s say Tomcat and SSH)? You’d want Tomcat to serve your app and SSH so you can log in and check, say, the Tomcat logs from inside the container.

One might think they could just start SSH as a background service in the container and then start Tomcat, with something like this (a sketch of the naive approach):
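```dockerfile
# naive attempt -- does NOT work: each instruction runs in its own intermediate container
RUN service ssh start
CMD ["service", "tomcat7", "start"]
```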

but this wouldn’t work, because by the time Docker processes the service tomcat7 start command it has already shut down the intermediate container in which sshd was started.

So it seems there is a big limitation: Docker effectively runs a single foreground process per container. The answer is a tool like Supervisor, which takes care of starting the other processes, redirecting their output to log files, and restarting them when they die.

Example Dockerfile (a sketch; the package names and paths assume an Ubuntu base image):
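```dockerfile
FROM ubuntu:14.04

RUN apt-get update && apt-get install -y openssh-server tomcat7 supervisor
RUN mkdir -p /var/run/sshd /var/log/supervisor

# supervisord will start and watch both sshd and Tomcat
ADD supervisord.conf /etc/supervisord.conf

EXPOSE 22 8080
CMD ["/usr/bin/supervisord", "-c", "/etc/supervisord.conf"]
```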

And here is a simple supervisord.conf example (program paths and environment assume the Ubuntu packages above):
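```ini
[supervisord]
nodaemon=true

[program:sshd]
command=/usr/sbin/sshd -D

[program:tomcat7]
environment=CATALINA_HOME="/usr/share/tomcat7",CATALINA_BASE="/var/lib/tomcat7"
command=/usr/share/tomcat7/bin/catalina.sh run
```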

EXPOSE

As far as the applications running inside the container are concerned, they run on the container’s private ports. You expose a port from the container so it can receive connections from other containers or from outside. When the container is started with the run option -p host_interface:host_port:container_exposed_port, that port is bound to an interface of the host. For example, we can expose Tomcat inside the container to receive connections directly from the internet, but SSH only for local connections (a sketch, assuming the ports from the Dockerfile above; the image name is illustrative):
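```bash
# Tomcat (8080 in the container) reachable from anywhere on host port 9000,
# SSH (22 in the container) only on the host's loopback interface
docker run -d -p 0.0.0.0:9000:8080 -p 127.0.0.1:2222:22 my-tomcat-image
```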

and you can access http://host_ip:9000 and the Tomcat inside the container will respond.

If you don’t explicitly map the private port, a random port on the default docker interface will be mapped.

Sidenote: when running, containers are assigned an internal IP address (something like 172.17.0.2 on the default bridge). You can find a container’s IP, and a lot of other info, by running docker inspect <container_id>; see the “Docker container IPs” section below for how to list the IPs of all running containers.

ADD

ADD copies files from the host filesystem into the temporary build container, so they end up saved inside the container image.

Example: copy the user’s public key into the container’s authorized_keys for easier SSH login through key authentication (a sketch; the public key file must first be copied into the build context):
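```dockerfile
RUN mkdir -p /root/.ssh
ADD id_rsa.pub /root/.ssh/authorized_keys
RUN chmod 700 /root/.ssh && chmod 600 /root/.ssh/authorized_keys
```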

It doesn’t matter if you change the files you ADDed after building the image: they are frozen inside the container image as they were at the time of its creation. For the container to see the new file version you need to rebuild the image. (For handling files that change often, see VOLUME below.)

Remember that when rebuilding an image, Docker caches the commands in the Dockerfile. Since you want the image to contain the new version of the files, the ADD command invalidates the cache for itself and all the commands that follow it; it’s therefore good practice to place the ADD instruction near the foot of the Dockerfile. As of version 0.8, Docker has improved the caching mechanism so that the cache for ADD is not invalidated if the added files did not change (the cache key takes the files’ timestamps into account).

VOLUME

A volume can be thought of as a directory external to the container that is mounted into it at runtime.

Volumes are immensely useful because:

  • They can be a good way to “configure” a container at runtime. Imagine, for example, passing a different configuration directory to each container (see the sketch after this list).

Through the docker run -v option you can actually bind-mount any folder inside the container; the folder doesn’t necessarily have to have been declared in the Dockerfile through the VOLUME directive. So does the VOLUME directive have any “real” use? Yes, it does: it marks that directory, and the content inside it, as not being committed to the image.

  • They can be a way to share data / deliverables between containers and also the host.
  • More convenient backups of data.
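A sketch of the per-container configuration use case (the image name and paths are hypothetical):

```bash
# each container gets its own config directory from the host, mounted at the same path inside
docker run -d -v /srv/app1/conf:/opt/app/conf my-app-image
docker run -d -v /srv/app2/conf:/opt/app/conf my-app-image
```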

Mounted volumes are not part of the container image. Changes to a data volume are made directly, without the overhead of a copy-on-write mechanism. This is good for very large files.

As a real-world example, I also mapped my project jar / war files from the host as a volume inside the container. That way, when I need to change something in the sources and rebuild, I only need to do it once on the host, and the containers get the new package version without having to be rebuilt. The alternative of writing them into the image (through either git pull or ADD) would mean a very tedious rebuild process and leftover images for each rebuild. Of course, once your files are in a stable, releasable state, writing them inside the image makes more sense.
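Something along these lines (the paths and image name are hypothetical):

```bash
# the host's build output directory is mounted as Tomcat's webapps dir inside the container
docker run -d -v ~/projects/myapp/target:/var/lib/tomcat7/webapps my-tomcat-image
```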

Mapping a /logs volume for each container on the host is, I think, a good idea, since it makes checking the log files and backing them up more convenient.

ENV
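The ENV instruction sets an environment variable inside the image, for example:

```dockerfile
ENV TOMCAT_HOME /usr/share/tomcat7
```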

Subsequent Dockerfile commands, as well as the processes running inside the container, will see this property.

With Docker you can also pass different environment properties at runtime (docker run -e KEY=value), which means added flexibility: you can read them directly in your program or in bash scripts and, for example, generate configuration files for the processes you start.

Something like the following startup-script sketch (the variable names, placeholder format, and paths are hypothetical):
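```bash
#!/bin/bash
# replace @PLACEHOLDER@ tokens in env.properties with the values of the
# corresponding environment variables passed to the container
sed -i "s/@DB_HOST@/${DB_HOST}/g" /opt/app/env.properties
sed -i "s/@DB_PORT@/${DB_PORT}/g" /opt/app/env.properties

# then start the actual application
exec /opt/app/start.sh
```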

This script uses sed to replace, in the env.properties file, constructs like the ones below with the corresponding ENV values:
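```properties
# env.properties (the placeholder format is an assumption, matching the script above)
db.host=@DB_HOST@
db.port=@DB_PORT@
```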

and the container can then be run with something like:
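```bash
# example values only
docker run -d -e DB_HOST=db.example.internal -e DB_PORT=5432 my-app-image
```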

Storing the Docker images in a custom location

I’m running a 64GB SSD in my laptop and I’d rather keep the images and containers on a 16GB stick. By default Docker keeps its data in /var/lib/docker. We can change that by editing the Docker config file /etc/default/docker to pass extra parameters to the docker daemon:
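```bash
# /etc/default/docker -- point the daemon at another data directory (the path is an example)
DOCKER_OPTS="-g /media/usb-stick/docker"
```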

and bind mount the location:
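```bash
# make the stick's directory available at the expected location (example paths)
sudo mount --bind /media/usb-stick/docker /var/lib/docker
```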

Restart the docker daemon with
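```bash
sudo service docker restart    # or: sudo service docker.io restart, depending on the package name
```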

Docker container IPs:
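```bash
# IP of a single container
docker inspect --format '{{ .NetworkSettings.IPAddress }}' <container_id>
```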

or to show all running containers’ IPs:
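```bash
docker ps -q | xargs docker inspect --format '{{ .Name }}: {{ .NetworkSettings.IPAddress }}'
```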

Run docker command without sudo

The Docker client (the docker command) communicates with the Docker daemon through a Unix socket (/var/run/docker.sock). You can change the ownership of that socket, but by default Docker will assign the socket to the docker group if that group exists, so it’s enough to add your user to that group:
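```bash
sudo groupadd docker            # only needed if the group doesn't already exist
sudo usermod -aG docker $USER   # log out and back in for the change to take effect
sudo service docker restart
```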

Docker Gotchas

**Containers are NOT ephemeral.** After a container has executed its process, or when you’ve stopped it explicitly, it still exists (that’s why you cannot delete the image it’s based on). The log files and the state at the moment it was stopped still exist inside the container; if you restart it, it will NOT start fresh from the “clean state” of the snapshot image, as some (myself included) initially thought.

Still, it can be helpful to keep separate volumes for the data and logs directories, so you can easily back them up or inspect them.

**NOT easy to set a static IP for a container.** Docker creates a special Linux bridge network called docker0 on startup. All containers are automatically connected to this bridge network, and the IP subnet for all containers is chosen by Docker. Currently it is not possible to directly influence the particular IP address of a Docker container, and restarted containers might get a different IP (as of version 0.7).

**Files like /etc/hosts and /etc/resolv.conf are read-only inside the container.** Combined with the issue above, this means it’s not easy to reference other containers by name without extra setup.

UPDATE: you can use **docker run --link** container_name:alias, or docker-compose, to link containers; then you can reference the linked host by its alias. Or use docker network to create separate networks for isolation.
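A sketch of linking (the container and image names are hypothetical):

```bash
# start a database container, then link it into the app container under the alias "db"
docker run -d --name app-db postgres
docker run -d --link app-db:db my-app-image
```

Inside the second container, the hostname db then resolves to the database container’s IP.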

Layer limitations

Dockerfile instructions are repeatable, but at present the AUFS limit of 42 layers means you’re encouraged to group similar commands where possible (i.e. combining separate apt-get install lines into one RUN command). The impact of this is that a single change to a long list of required apt-get packages means invalidating Docker’s build cache for that command and all those which follow it.
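For example, grouping installs into a single RUN (the package names are illustrative):

```dockerfile
RUN apt-get update && apt-get install -y \
    curl \
    git \
    openssh-server \
    supervisor \
    tomcat7
```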

Docker and Ansible

Sometimes you may require extensive configuration for each container. I’ve seen a lot of people who recommend Ansible as a provisioning tool. It’s written in Python and seems easier to grasp than other provisioning tools like Chef or Puppet. Some suggested starting with a basic container, configuring it through Ansible, and in the end committing the result and just forgetting about the Dockerfile. Here’s a very good starting point for it.

References

  • Docker guide
  • Automated deployment with Docker – lessons learnt