Docker Shallow Dive

2019/08/04

Categories: Architecture Tags: docker docker-engine

Do you need to drill down beyond the basic Docker commands? What actually happens behind “docker build…” or “docker run…”? Perhaps you do.

Docker is a platform first, container solution second.


We can trace the roots of Docker way back to a company called dotCloud and a Python tool called “dc”, which acted as a wrapper around LXC and AUFS.
As Docker expanded, it relied heavily on LXC for critical operations, and LXC’s fast-moving development kept introducing issues and breaking changes for Docker. The solution, “libcontainer” (which later evolved into runc), became a replacement for LXC under Docker’s own control, and “dc” eventually turned into Docker. In that early phase Docker grew into a monolith, implementing all sorts of services under one roof, from HTTP servers to orchestration services (so if Kubernetes wanted to integrate Docker, it would also pull in Docker’s orchestration solution, which was not optimal).
The natural follow-up was to refactor everything into separate tools and components! Around the same time the Open Container Initiative appeared - a standardization effort that gave us the Image spec and the Runtime spec. Fast forward and we have the Docker we know today. The stack is slim and optimized: the Docker client, a daemon implementing the REST API, containerd as the supervisor of lifecycle operations (start, stop and the like), and the OCI layer (runc) that interfaces with the host kernel.
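
If you want to see those pieces on your own machine, the CLI can report them (the exact output varies between Docker releases, so treat this as a rough probe):

$ docker version   # recent releases list the Engine together with containerd, runc and docker-init versions
$ docker info      # look for the "Runtimes" and containerd related entries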

So, to answer the question from above:
What actually happens when we type $ docker container run…? The client sends a REST POST call to the daemon. The daemon calls the containerd daemon, which starts a shim process in order to execute runc (the OCI layer), and runc actually ends up creating the container. For every new container, that flow repeats from containerd. The best part: you can restart the Docker daemon process without affecting the running containers (provided live restore is enabled) - what joy for production environments!
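
A quick way to observe this chain (assuming a Linux host where you can run ps; process names differ slightly between Docker versions):

$ docker container run -d --name web nginx
$ ps -e -o pid,ppid,comm | grep -E 'dockerd|containerd|shim'   # dockerd, containerd and one shim per running container; runc itself exits once the container is created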

While on the topic of containers, what are the other components in play? How do you actually get a constructed container? The prerequisite for a containerized application is a Dockerfile. It is a specification that tells the Docker builder how to package the application and wrap its dependencies and properties in one place. Based on the steps defined, you can fine-tune what happens at each step. The finished image is stored in a Docker registry, ready for use anytime, anywhere.
As an example, you can run npm install as root during the build and then switch to a different user to run the same Node.js app inside the container with controlled restrictions; the official Node.js Docker best-practice guides cover this in more detail.
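
A minimal sketch of such a Dockerfile, assuming a Node.js app with a package.json and a server.js listening on port 3000 (the official node images ship with an unprivileged “node” user):

FROM node:12-alpine
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install --production   # dependencies are installed as root
COPY . .
USER node                      # switch to the unprivileged user for runtime
EXPOSE 3000
CMD ["node", "server.js"]

Build it into an image and run the container:

$ docker build -t my-node-app .
$ docker run -d -p 3000:3000 my-node-app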



All right, keeping a shallow dive in mind, I’ll touch on one additional topic - Volumes.

To get straight to the point: a MySQL container needs attached storage to persist its data. If the container goes down, or you build a newer version of the image, you lose everything bound to the container’s writable layer unless you have a separate volume mounted. This gets even more important when working with Kubernetes and replica sets - but on that topic some other time.
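
A sketch of what that looks like in practice with the official mysql image (the volume name and password here are just placeholders):

$ docker volume create mysql-data
$ docker run -d --name db -e MYSQL_ROOT_PASSWORD=change-me -v mysql-data:/var/lib/mysql mysql:5.7

The container can now be removed and recreated, and the databases under /var/lib/mysql survive inside the mysql-data volume.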

As a quick test, try setting up Jenkins as a container: pull a Jenkins image from Docker Hub (or build your own on top of it), configure a job, then throw the container away and start a fresh one. What state do you get back? (Hint: nothing is saved and you’re back at the installation setup!)
In order to solve this, create a volume and start up the Jenkins container with a command similar to this:

$ docker volume create myvol
$ docker run -d -v myvol:/var/jenkins_home -p 8080:8080 name-of-jenkins-image

Next, go through the setup process once more, then recreate the container with the run command above. You’ll see that no matter how many times the container is recreated, the configuration is preserved and loaded, because you always mount the same volume (myvol) for the container to use. In this example, if you update some plugins, they are saved too!
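
If you want to double-check what actually lives in the volume, inspect it or mount it into a throwaway container (a quick sketch, not required for the Jenkins setup itself):

$ docker volume inspect myvol
$ docker run --rm -v myvol:/data alpine ls /data   # should show the Jenkins home: config.xml, jobs/, plugins/ and so on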

Up to this point I’ve covered the view on GNU/Linux based hosts (macOS should feel at home too). Windows is well on its way to adopting this approach, and future updates might consolidate the same build process. Keeping an eye on this area is crucial if you’re interested: it is an ever-evolving ecosystem, and personal blog posts often fail to follow up on the changes. So, by abstracting this topic a bit, I’m able to keep it relevant a bit longer :)

The subject of this post is fairly complex, and mapping out a simplified overview is already a challenge in itself. I would recommend isolating each component of interest and going through the official Docker documentation.

Keep curious and have fun!