A Not Very Short Introduction to Docker

These are the notes that accompany my presentation, Docker, the Future of DevOps. It turned out, quite fittingly, to be a whale-sized article :).


What Is Docker And Why Should You Care?

Contrary to many others I believe that saying that Docker is a lightweight virtual machine is a very good description. Another way to look at Docker is chroot on steroids. The last explanation probably doesn't help much unless you know what chroot is.

VM vs. Docker


The image describes the difference between a VM and Docker. Instead of a hypervisor with Guest OSes on top, Docker uses a Docker engine and containers on top. Does this really tell us anything? What is the difference between a "hypervisor" and the "Docker engine"? A nice way of illustrating this difference is through listing the running processes on the Host.

The following simplified process trees illustrate the difference.

On the Host running the VM, only one process is visible, even though many processes are running inside the VM.

On the Host running the Docker Engine, all the running processes are visible. The contained processes are running on the Host! They can be inspected and manipulated with normal commands like ps and kill.
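This can be checked by hand on a Linux Host where the Docker daemon runs natively. The following is only a sketch: the function name show_contained_process is made up for illustration, and it assumes the busybox image can be pulled.

```shell
# Hypothetical helper: start a container, then look for its process on the Host.
show_contained_process() {
    # Start a detached busybox container that just sleeps; docker prints its id.
    container_id=$(docker run -d busybox sleep 300) || return 1
    # The contained sleep shows up in the Host's ordinary process list.
    # The [s] keeps grep from matching its own command line.
    ps -eo pid,args | grep '[s]leep 300'
    # Clean up the throw-away container.
    docker rm -f "$container_id" >/dev/null
}
```

Calling show_contained_process needs a running daemon. On setups where Docker itself runs inside a helper VM, the process will show up in that VM rather than on your desktop.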

Now that everything is crystal clear, what does this mean? It means that Docker containers are smaller, faster, and more easily integrated with each other than VMs, as the table illustrates.


The size of a small virtual machine image with Core OS is about 1.2 GB. The size of a small container with busybox is 2.5 MB.

The startup time of a fast virtual machine is measured in minutes. The startup time of a container is often less than a second.

Integrating virtual machines running on the same host must be done by setting up the networking properly. Integrating containers is supported by Docker out of the box.

So, containers are lightweight, fast and easily integrated, but that is not all.

Docker is a Contract

Docker is also the contract between Developers and Operations. Developers and Operations often have very different attitudes when it comes to choosing tools and environments.

Developers want to use the next shiny thing, we want to use Node.js, Rust, Go, Microservices, Cassandra, Hadoop, blablabla, blablabla, …

Operations want to use the same as they used yesterday, what they used last year, because it is proven, it works!

(Yes, I know this is stereotypical, but there is some truth in it :)

But, this is where Docker shines. Operations are satisfied because they only have to care about one thing. They have to support deploying containers. Developers are also happy. They can develop with whatever the fad of the day is and then just stick it into a container and throw it over the wall to Operations. Yippie ki-yay!


But, it does not end here. Since Operations are, usually, better than development when it comes to optimizing for production, they can help developers build optimized containers that can be used for local development. Not a bad situation at all.

Better Utilization

A few years ago, before virtualization, when we needed to create a new service, we had to acquire an actual machine, hardware. It could take months, depending on the processes of the company you were working for. Once the server was in place we created the service, and most of the time it did not become the success we were hoping for. The machine was ticking along with a CPU utilization of 5%. Expensive!

Then, virtualization entered the arena and it was possible to spin up a new machine in minutes. It was also possible to run multiple virtual machines on the same hardware, so the utilization increased from 5%. But we still need to have a virtual machine per service, so we cannot utilize the machine as much as we would want.

Containerization is the next step in this process. Containers can be spun up in seconds and they can be deployed at a much more granular level than virtual machines.



It is indeed nice that Docker can help us speed up our slow virtual machines but why can't we just deploy all our services on the same machine?

You already know the answer, dependency hell. Installing multiple independent services on a single machine, real or virtual, is a recipe for disaster. Docker Inc. calls this the matrix of hell.


Docker eliminates the matrix of hell by keeping the dependencies contained inside the containers.



Speed is of course always nice, but being 100 times faster is not only nice, it changes what is possible. An increase this large enables entirely new possibilities. It is now possible to create throw-away environments. Need to change your entire development environment from Golang to Clojure? Fire up a container. Need to provide a production database for integration and performance testing? Fire up a container. Need to switch the entire production server from Apache to Nginx? Fire up a container!

How Does Docker Work?

Docker is implemented as a client-server system. The Docker daemon runs on the Host and it is accessed via a socket connection from the client. The client may, but does not have to, be on the same machine as the daemon. The Docker CLI client works the same way as any other client, but it is usually connected through a Unix domain socket instead of a TCP socket.

The daemon receives commands from the client and manages the containers on the Host where it is running.


Docker Concepts and Interactions

  • Host, the machine that is running the containers.
  • Image, a hierarchy of files, with meta-data for how to run a container.
  • Container, a contained running process, started from an image.
  • Registry, a repository of images.
  • Volume, storage outside the container.
  • Dockerfile, a script for creating images.


We can build an image from a Dockerfile. We can also create an image by committing a running container. The image can be tagged and it can be pushed to and pulled from a registry. A container is started from an image with run, or created with create and started later. A container can be stopped and started. It can be removed with rm.



An image is a file structure, with meta-data for how to run a container. The image is built on a union filesystem, a filesystem built out of layers. Every command in the Dockerfile creates a new layer in the filesystem.

When a container is started, all layers are merged together into what appears to the process as a single, unified file system. When files are removed in the union file system they are only marked as deleted. The files will still exist in the layer where they were last present.
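To see that a deleted file still takes up space, a small experiment along these lines should do. It is a sketch under assumptions: the directory name layers-demo and the file /big.bin are made up, and the script only writes the Dockerfile, leaving the build to you.

```shell
# Write a two-layer Dockerfile into a scratch directory.
mkdir -p layers-demo
cat > layers-demo/Dockerfile <<'EOF'
FROM busybox
# Layer 1: create a 10 MB file.
RUN dd if=/dev/zero of=/big.bin bs=1024 count=10240
# Layer 2: delete it. The file is only marked as deleted here;
# the earlier layer still contains all 10 MB.
RUN rm /big.bin
EOF

# With a Docker daemon available, compare the layer sizes with:
#   docker build -t layers-demo layers-demo
#   docker history layers-demo
```

docker history lists the size of each layer, so the layer created by dd should still show up with its full size even though the file is gone from the final file system.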

Image Sizes

Here are some data on commonly used images:

  • scratch – this is the ultimate base image and it has 0 files and 0 size.
  • busybox – a minimal Unix weighing in at 2.5 MB and around 10000 files.
  • debian:jessie – the latest Debian is 122 MB and around 18000 files.
  • ubuntu:14.04 – Ubuntu is 188 MB and has around 23000 files.

Creating images

Images can be created with docker commit container-id, docker import url-to-tar, or docker build -f Dockerfile .

It is possible to create images with docker commit, but it is kind of messy and hard to reproduce. It is better to create images with Dockerfiles, since they are clear and easily reproduced.

An image is built from a Dockerfile with docker build.

Since every command in the Dockerfile creates a new layer, it is often better to run similar commands together. Group the commands with && and split them over several lines with a trailing backslash for readability.
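A grouped RUN could look like the sketch below, where && chains the commands and a trailing backslash continues the line. The base image comes from the image list above and the postgresql package from the comments below; the directory name grouped-demo is made up, and an image tag this old may need to be swapped for a current one before the build succeeds.

```shell
# Write a Dockerfile whose install steps share a single layer.
mkdir -p grouped-demo
cat > grouped-demo/Dockerfile <<'EOF'
FROM debian:jessie
# One RUN instruction, hence one layer, for update, install and cleanup.
RUN apt-get update && \
    apt-get install -y postgresql && \
    rm -rf /var/lib/apt/lists/*
EOF

# With a Docker daemon available:
#   docker build -t grouped-demo grouped-demo
```

Because the cleanup happens in the same layer as the install, the package lists never end up in the image at all.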

The ordering of the lines in the Dockerfile is important as Docker caches the intermediate images, in order to speed up image building. Order your Dockerfile by putting the lines that change more often at the bottom of the file. ADD and COPY get special treatment from the cache and are re-run whenever an affected file changes even though the line does not change.
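As a sketch of cache-friendly ordering, the Dockerfile below installs packages first and copies the source last. The directory cache-demo, the /app path and main.py are hypothetical names chosen for the example.

```shell
# Write a Dockerfile ordered from "changes rarely" to "changes often".
mkdir -p cache-demo
cat > cache-demo/Dockerfile <<'EOF'
FROM debian:jessie
# Changes rarely, so this layer is usually served from the cache.
RUN apt-get update && \
    apt-get install -y python
# Changes on every edit, so it goes last: only this line and the ones after it are re-run.
COPY . /app
CMD ["python", "/app/main.py"]
EOF
```

Had COPY been placed before RUN, every source change would also have forced the package installation to run again.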

Dockerfile Commands

The Dockerfile supports 13 commands. Some of the commands are used when you build the image and some are used when you run a container from the image. Here is a table of the commands and when they are used.


BUILD Commands

  • FROM – The image the new image will be based on.
  • MAINTAINER – Name and email of the maintainer of this image.
  • COPY – Copy a file or a directory into the image.
  • ADD – Same as COPY, but handles URLs and unpacks tarballs automatically.
  • RUN – Run a command inside the container, such as apt-get install.
  • ONBUILD – Run commands when building an inherited Dockerfile.
  • .dockerignore – Not a command, but it controls what files are added to the
    build context. Should include .git and other files not needed when building
    the image.

RUN Commands

  • CMD – Default command to run when running the container. Can be overridden
    with command line parameters.
  • ENV – Set environment variable in the container.
  • EXPOSE – Expose ports from the container. Must be explicitly exposed by the
    run command to the Host with -p or -P.
  • VOLUME – Specify that a directory should be stored outside the union file
    system. If it is not set with docker run -v it will be created in
  • ENTRYPOINT – Specify a command that is not overridden by giving a new
    command with docker run image cmd. It is mostly used to give a default
    executable and use commands as parameters to it.
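The interplay between ENTRYPOINT and CMD is easiest to see in a tiny image. This is a sketch only; the directory and image name entrypoint-demo are made up.

```shell
# Write a Dockerfile with a fixed executable and a default argument.
mkdir -p entrypoint-demo
cat > entrypoint-demo/Dockerfile <<'EOF'
FROM busybox
# The executable stays the same for every docker run of this image.
ENTRYPOINT ["echo"]
# The default argument, replaced by anything given after the image name.
CMD ["hello"]
EOF

# With a Docker daemon available:
#   docker build -t entrypoint-demo entrypoint-demo
#   docker run --rm entrypoint-demo        # runs: echo hello
#   docker run --rm entrypoint-demo bye    # runs: echo bye
```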

Both BUILD and RUN Commands

  • USER – Set the user for RUN, CMD and ENTRYPOINT.
  • WORKDIR – Sets the working directory for RUN, CMD, ENTRYPOINT, ADD and

Running Containers


When a container is started, the process gets a new writable layer in the union file system where it can execute.

Since version 1.5, it is also possible to make this layer read-only, forcing us to use volumes for all file output such as logging, and temp-files.

docker run


As the list above describes, docker run is the command used to start new containers. Here are some common ways to run containers.

This is the way to run a container if you want to interact with it as a normal terminal program. If you want to pipe into the container, you should not use the -t option.

  • --interactive (-i) – send stdin to the process.
  • --tty (-t) – tell the process that a terminal is present. This affects how the process outputs data and how it treats signals such as Ctrl-C.
  • --rm – remove the container on exit.
  • ubuntu – use the ubuntu:latest image.
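Put together, the options above give commands like the ones wrapped in the functions below. The function names are made up, and the functions are only defined here, not called, since calling them needs a running Docker daemon.

```shell
# Interactive, throw-away shell in the ubuntu:latest image.
interactive_shell() {
    docker run --interactive --tty --rm ubuntu
}

# Without --tty, stdin can be piped into the container instead.
count_piped_lines() {
    printf 'one\ntwo\n' | docker run --interactive --rm ubuntu wc -l
}
```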

  • --detach (-d) – Run in detached mode, you can attach again with docker

docker run --env

  • --name – name the container, otherwise it gets a random name.
  • --env (-e) – Set the environment variable in the container.
  • --env-file – Set all environment variables in env-file
  • mysql – use the mysql:latest image.
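A sketch of the options above in use follows. The container name db matches the name used in the linking section; the password value and the file ./db.env are placeholders, and MYSQL_ROOT_PASSWORD is the variable the official mysql image reads as far as I know.

```shell
# Start a named, detached MySQL container with one variable set inline.
start_db() {
    docker run --name db --env MYSQL_ROOT_PASSWORD=secret --detach mysql
}

# Same container, but with the variables read from a file of KEY=value lines.
start_db_from_file() {
    docker run --name db --env-file ./db.env --detach mysql
}
```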

docker run --publish

The nginx image, for example, exposes port 80 and 443.
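Publishing those ports to the Host could be sketched like this. Host port 8080 is an arbitrary choice and the function names are made up.

```shell
# Map Host port 8080 to port 80 inside the container (-p is the short form).
publish_one_port() {
    docker run --detach --publish 8080:80 nginx
}

# Publish every EXPOSEd port to a random free port on the Host (-P is the short form).
publish_all_ports() {
    docker run --detach --publish-all nginx
}
```

After publish_all_ports, docker ps shows which Host ports were picked.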

Linking a container sets up networking from the linking container into the linked container. It does two things:

  • It updates /etc/hosts with the link name given to the container, db in the example above, making it possible to access the container by the name db. This is very good.
  • It creates environment variables for the EXPOSEd ports. This is practically useless since I can access the same port by using a hostname:port combination anyway.

The linked networking is not constrained by the ports EXPOSEd by the image. All ports are available to the linking container.
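A minimal sketch of a link, assuming a container named db is already running. The function name is made up, and note that newer Docker releases document links as a legacy feature.

```shell
# Run a throw-away busybox container linked to the running container named db.
# --link takes <container>:<alias>; the alias becomes a hostname in /etc/hosts.
ping_db_by_name() {
    docker run --rm --link db:db busybox ping -c 1 db
}
```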

docker run limits

It is also possible to limit how much access the container has to the Host's resources.

Setting CPU shares to 512 out of 1024 does not mean that the process gets access to half of the CPU. It means that it gets half as many shares as a container that is run without any limit. If we have two containers running with 1024 shares and one with 512 shares, the 512-container will get about one fifth of the CPU shares.
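The one-fifth figure is just the container's shares divided by the sum of all shares, which the sketch below works out. The function at the end shows how such limits might be passed to docker run; the 512m memory limit and the nginx image are arbitrary picks for the example.

```shell
# Shares of the three containers from the text.
shares_a=1024
shares_b=1024
shares_c=512
total=$((shares_a + shares_b + shares_c))

# Integer percentage of CPU time for the 512-share container when all three are busy.
percent_c=$((100 * shares_c / total))
echo "512-share container: ${percent_c}% of the CPU"
# prints: 512-share container: 20% of the CPU

# Hypothetical helper: start a container with CPU shares and a memory limit.
limited_container() {
    docker run --detach --cpu-shares 512 --memory 512m nginx
}
```

The shares only matter under contention. A container with 512 shares that is alone on the Host can still use the whole CPU.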

docker exec container

docker exec allows us to run commands inside already running containers. This is very good for debugging among other things.
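Two typical uses, sketched as functions that take the container name or id as their argument. The function names are made up, and both assume that the image actually contains bash and ps.

```shell
# Open an interactive shell inside a running container.
open_shell_in() {
    docker exec --interactive --tty "$1" bash
}

# Run a one-off command inside a running container.
list_processes_in() {
    docker exec "$1" ps aux
}
```

For example, open_shell_in db would open a shell in the container named db.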



Volumes

Volumes provide persistent storage outside the container. That means the data will not be saved if you commit the new image.

Since the directory of the host is not given, the volume is created in

The exact name of the directory can be found by running docker inspect container-id.

It is also possible to mount volumes from another container with --volumes-from.
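The three ways of getting a volume into a container could be sketched as follows. The container name data, the /data path and the function names are all made up for the example.

```shell
# Volume without a Host directory: Docker picks where to store it.
run_with_volume() {
    docker run --name data --volume /data busybox true
}

# Volume with an explicit Host directory, given as <host-dir>:<container-dir>.
run_with_host_dir() {
    docker run --rm --volume "$PWD/data:/data" busybox ls /data
}

# Reuse the volumes of the container named data in another container.
run_with_volumes_from() {
    docker run --rm --volumes-from data busybox ls /data
}
```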

Docker Registries

Docker Hub is the official repository for images. It supports public (free) and private (fee) repositories. Repositories can be tagged as official and this means that they are curated by the maintainers of the project (or someone connected with it).

Docker Hub also supports automatic builds of projects hosted on Github and Bitbucket. If automatic build is enabled an image will automatically be built every time you push to your source code repository.

If you don't want to use automatic builds, you can also docker push directly to Docker Hub. docker pull will pull images. docker run with an image that does not exist locally will automatically initiate a docker pull.

It is also possible to host your images elsewhere. Docker maintains code for docker-registry on Github. But, I have found it to be slow and buggy.

Quay, Tutum, and Google also provide hosting of private Docker images.

Inspecting Containers

A lot of commands are available for inspecting containers:

I will only elaborate on docker ps and docker inspect since they are the most important ones.

Tips and Tricks

Getting the id of a container is useful for scripting.

docker inspect can take a format string, a Go template, and it allows you to be more specific about what data you are interested in. Again, useful for scripting.
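Both tips can be sketched as small helper functions. The function names are made up; the template fields .Config.Image and .State.Running are the ones I would expect in docker inspect output, so check them against the JSON your version prints.

```shell
# Id of the most recently created container (short form: docker ps -l -q).
latest_container_id() {
    docker ps --latest --quiet
}

# Pick single fields out of docker inspect with a Go template.
container_image() {
    docker inspect --format '{{.Config.Image}}' "$1"
}

container_is_running() {
    docker inspect --format '{{.State.Running}}' "$1"
}
```

In a script they combine naturally, for example container_image "$(latest_container_id)".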

Use docker exec to interact with a running container.

Use volumes to avoid having to rebuild an image every time you run it. Every time the below Dockerfile is built it copies the current directory into the container.

To avoid the rebuild, build the image once and then mount the local directory when you run it.
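A sketch of that workflow, with made-up names: the image is called dev-image and the source is assumed to live in /app inside the container.

```shell
# Build the image once, from the Dockerfile in the current directory.
build_once() {
    docker build --tag dev-image .
}

# Run it with the current directory mounted over /app,
# so edits on the Host are visible without a rebuild.
run_with_source() {
    docker run --rm --volume "$PWD:/app" dev-image
}
```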



Security

You may have heard that it is not secure to use Docker. This is not untrue, but it does not have to be a problem.

The following security problems currently exist with Docker.

  • Image signatures are not properly verified.
  • If you have root in a container you can, potentially, get root on the entire box.

Security Remedies

  • Use trusted images from your private repositories.
  • Don't run containers as root, if possible.
  • Treat root in a container as root outside a container.

If you own all the containers running on the server, you don't have to worry about them interacting with each other maliciously.

Container "Options"


I put "options" in quotes since there are not really any options at the moment, but a lot of players want to get in the game. Ubuntu is working on something called LXD and Microsoft on something called Drawbridge. But, the one that seems most interesting is the one called Rocket.

Rocket is developed by Core OS, who is a big container (Docker) platform. The reason for developing it is that they feel that Docker Inc. are bloating Docker and also that they are moving into the same area as Core OS, which is container hosting in the cloud.

With this new container specification they are trying to remove some of the warts which Docker has for historical reasons and to provide a simple container with support for socket activation and security built in from the start.


When we split up our application into multiple different containers we get some new problems. How do we make the different parts talk to each other? On a single host? On multiple hosts?

Docker solves the problem of orchestration on a single host with links.

To simplify the linking of containers Docker provides a tool called docker-compose. It was previously called fig and was developed by another company which was recently acquired by Docker.



docker-compose declares the information for multiple containers in a single file, docker-compose.yml. Here is an example of a file that manages two containers, web and redis.
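A file for the two containers could look roughly like the one written below. It is a sketch in the current services: layout, which may differ from the format in use when this was written; the directory compose-demo, the port 5000 and the assumption that a Dockerfile for web sits next to the file are mine.

```shell
# Write a docker-compose.yml with a web container linked to a redis container.
mkdir -p compose-demo
cat > compose-demo/docker-compose.yml <<'EOF'
services:
  web:
    # Built from the Dockerfile in the same directory (assumed to exist).
    build: .
    ports:
      - "5000:5000"
    # Makes the redis container reachable from web under the name redis.
    links:
      - redis
  redis:
    # Pulled from the registry as redis:latest.
    image: redis
EOF
```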

To start the above containers, you can run the command docker-compose up.

It is also possible to start the containers in detached mode with docker-compose up -d, and you can find out what containers are running with docker-compose ps.

It is possible to run commands that work with a single container or commands that work with all containers at once.

As you can see from the above commands, scaling is supported. The application must be written in a way that can handle multiple containers. Load-balancing is not supported out of the box.
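Commands of this kind could look like the sketch below. The function names are made up and the service name web comes from the example file; older releases also had a separate docker-compose scale web=3 command, while the --scale flag shown here is the form I would expect on newer ones.

```shell
# Start every container declared in docker-compose.yml, detached.
start_all() {
    docker-compose up -d
}

# Show the logs of a single service.
web_logs() {
    docker-compose logs web
}

# Run the given number of web containers side by side.
scale_web() {
    docker-compose up -d --scale web="$1"
}
```

Note that a fixed Host port in the ports section clashes as soon as a second web container starts, which is one reason the application and its configuration must be prepared for scaling.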

Docker Hosting

A number of companies want to get in on the business of hosting Docker in the cloud. The image below shows a collection.


These providers try to solve different problems, from simple hosting to becoming a "cloud operating system". I will only elaborate on two of them.

Core OS


As the image shows, Core OS is a collection of services to enable hosting of multiple containers in a Core OS cluster.

  • The Core OS Linux distribution is a stripped down Linux. It uses 114MB of RAM on initial boot. It does not provide a package manager, since it uses Docker or their own Rocket container to run everything.
  • Core OS uses Docker (or Rocket) to install an application on a host.
  • It uses systemd as init-service since it has great performance, handles start-up dependencies well, has great logging, and supports socket-activation.
  • etcd is a distributed, consistent key value store for shared configuration and service discovery.
  • fleet is a cluster manager. It is an extension of systemd to work with multiple machines. It uses etcd to manage configuration and it is running on every Core OS machine.


Amazon

It is possible to host Docker containers on Amazon in two ways.

  • Elastic Beanstalk can deploy Docker containers. This works fine but I find it to be very slow. A new deploy takes several minutes and it does not feel right when a container can be started in seconds.
  • ECS, Elastic Container Service, is Amazon's upcoming container cluster solution. It is currently in preview 3 and it looks very promising. Just as with Amazon's other services, you interact with it through simple web service calls.


  • Docker is here to stay.
  • It fixes dependency hell.
  • Containers are fast!
  • Cluster solutions exist, but don't expect them to be seamless, yet!

This Post Has 33 Comments

  1. Thanks your share

    1. You’re welcome!

  2. very concise and useful, thank you

    1. You’re welcome :)

  3. Nice and great information.

    1. You’re welcome, I’m glad you liked it!

    1. Jaime, you are welcome, I’m glad you liked it :)

  4. Nice! This was very useful as a Docker newb.

  5. I summarized this blog , It is the best explanation why Docker and not VM. It has a brilliant simplicity that made the entire Silicon Valley go mad for Jayway’s blog


    Docker ‏@docker
    In this week’s #DockerWeekly, “Why @Docker is a winner vs #VMs” by @myinnervoice: http://my-inner-voice.blogspot.com/2015/08/why-docker-is-winner-versus-vms.html … #containers

  6. are these not good cluster solutions?
    1. datacentre OS
    2. kubernetes..

  7. Great information. Just right – enough technical meat to get a solid understanding of underpinnings and use by development or operational aspects. Thank you!

    1. You are welcome. I’m glad you liked it.

  8. Thanks for sharing. Please add the documentations for docker on azure…

  9. Thanks for the precise write-up, it is one stop to understand and work with docker.

  10. You may want to add ‘-y’ flag to ‘apt-get install postgresql’
    (this section is above “Build with”)

    1. Thanks, fixed it.

  11. Thanks, very good for a newbie! (I guess that almost all of us +400 students from Blekinge Tekniska Högskolas new course Cloud Computing have visited you! :-D )

    1. Jenny,
      Thanks for letting me know.
      I’m glad you like it.

  12. Excellent post. I love the analogy of Docker as a contract between Dev and Ops.

    1. Thanks Matt! I’m glad you like it.

  13. Hi, Nice post
    How can I add env variables on running container without stopping the existing container

    1. You can exec into the container and set the environment variables inside the container.

      docker exec -it adoring_joliot bash
      root@4be0f25f6e29:/# export dingo=tapir
      root@4be0f25f6e29:/# export TERM=term
      root@4be0f25f6e29:/# echo $TERM

      But this will not affect the process running inside the container. To be able to update the environment of a running process
      you have to resort to some Linux magic, such as this: http://stackoverflow.com/questions/205064/is-there-a-way-to-change-another-processs-environment-variables#answer-211064

  14. Absolutely fantastic write-up and The Best overall description I’ve ever seen on Docker. It goes through all the useful parts in a meaningful way with simple examples and accurate explanations on “why”.

    1. Thanks Chris,
      I’m glad you liked it. :)

  15. Nice! Thank you!

  16. It’s awesome. Detailed information. One of the best blogs on Docker, that I have read so far.
