Singularity Compatibility with Docker Containers

By Staff

Apr 4, 2018 | Blog

Many users first come to Singularity because they want to run an existing Docker container on a system where they can’t run Docker natively. Later on, they may build their own containers via Singularity definition files, often using a Docker image as a starting point. Most of the time things just work with singularity run docker://… and behave just as you expect, dropping you inside your container. However, there are some differences between Singularity and Docker that you may run into when using some of the more complicated containers available on Docker Hub and elsewhere. Let’s take a look at the issues, some of which are addressed in Singularity 2.5, which will be released soon!
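
As a baseline, the straightforward case mentioned above looks like this (an illustrative sketch; python:2.7 is just an arbitrary public image):

$ singularity run docker://python:2.7                      # drops you into the image's default command
$ singularity exec docker://python:2.7 python --version    # or run a specific command from the image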

Whiteout Files

Singularity images are a single file containing the state of a container’s file system. Docker, on the other hand, constructs containers at run-time from layers. Each layer is a tar archive, containing files that make up the container. A layer is created for each step in the Dockerfile that was used to build the container. The layers must be extracted in the right order to reproduce the container properly.

Some complexity comes in when existing files or directories are deleted during a Dockerfile step. A tar archive can’t represent deletion of a file or folder, so Docker uses special whiteout files, beginning with .wh., to show something needs to be deleted when a layer is processed.
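
If you’re curious, you can see these markers yourself by exporting an image with Docker and listing one of its layer archives. A rough sketch (the layer directory name and the whiteout entry shown are placeholders; real layer names are long content hashes, and many layers won’t contain any whiteouts):

$ docker save -o python.tar python:2.7         # export the image and its layers
$ tar xf python.tar                            # each layer unpacks as <layer-id>/layer.tar
$ tar tf <layer-id>/layer.tar | grep '\.wh\.'  # list any whiteout markers in one layer
  .wh.some-deleted-file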

Singularity hasn’t always handled these whiteout files correctly, which can lead to strange errors, like the one below, when you run complex Python-based containers:

PermissionError: [Errno 13] Permission denied: '/usr/local/miniconda/lib/python3.6/site-packages/.wh.conda-4.3.11-py3.6.egg-info'

The .wh. file shouldn’t be there – it’s a marker for deletion, not a real file for the container.

In Singularity 2.5 we have rewritten the code that extracts Docker layers to fix these issues. A release candidate will be available this week. In Singularity 3.0 we will be using the excellent opencontainers/image-tools Go package from the Open Container Initiative. Using official OCI-related libraries is one of the advantages of the switch to Go: we can track any changes to the OCI and Docker container specifications with far less effort and fewer issues.

USER in Dockerfiles

In a Dockerfile the USER command is available to make sure that your container, or a build step, is run as a particular user. A Dockerfile may create users inside the container with their own home directory. For example, the `finmag/finmag` container does this:

  ENV NB_USER finmaguser
  ENV NB_UID 1000
  ENV HOME /home/${NB_USER}
  ...
  USER root
  RUN chown -R ${NB_UID} ${HOME}
  RUN chown -R ${NB_UID} /finmag
  USER ${NB_USER}

Under Docker you would enter the container as the user finmaguser. With Singularity you can’t become a different user inside a container. Part of the security model that makes Singularity useful on shared systems, such as HPC clusters, is to ensure the user inside the container is the same user who started the container.
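
You can see this for yourself; whoever starts the container is the user inside it (the username shown here is just an example):

$ whoami
  dave
$ singularity exec docker://python:2.7 whoami
  dave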

Most of the time containers built with USER will still run under Singularity, as we make efforts to modify the permissions on files so they will be accessible to the user who runs the container. If the container expects to access files in that user’s $HOME directory, things may not start as expected. You can usually get around this by using singularity shell to enter the container, cd to the home directory, and call the relevant files directly. Our Slack channel or Google Groups list are good places to ask for help if needed.
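
For the finmag example above, a session might look roughly like this (prompt and output abbreviated; the start-up script name is hypothetical, so check the container’s documentation or Dockerfile for its real entry point):

$ singularity shell docker://finmag/finmag
Singularity> cd /home/finmaguser
Singularity> ./start.sh    # hypothetical start-up script; use whatever this container expects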

Environment & Home Directory

When you run a Singularity container your home directory and most environment variables from the host are passed into the container. This makes Singularity containers easy to integrate into an HPC workflow, mixing with traditional software installations, but it can give unexpected behavior compared with Docker.

If I run a Python container using Docker, I can see four installed pip packages:

$ docker run python:2.7 pip list
  ...
  pip (9.0.3)
  setuptools (39.0.1)
  virtualenv (15.2.0)
  wheel (0.30.0)

If I run it with Singularity I see a lot more. Where are they coming from?

$ singularity exec docker://python:2.7 pip list
  ...
  clair-singularity (0.1.0)
  coverage (4.5.1)
  funcsigs (1.0.2)
  more-itertools (4.1.0)
  pathspec (0.5.5)
  pip (9.0.3)
  pluggy (0.6.0)
  py (1.5.3)
  pytest (3.5.0)
  pytest-cov (2.5.1)
  PyYAML (3.12)
  requests-toolbelt (0.8.0)
  setuptools (39.0.1)
  virtualenv (15.2.0)
  wheel (0.30.0)
  yamllint (1.11.0)

The answer is that they come from my home directory, where I have installed extra packages. We can confirm this with the --contain flag, which tells Singularity not to pass $HOME and some other things through to the container:

$ singularity exec --contain docker://python:2.7 pip list
  ...
  pip (9.0.3)
  setuptools (39.0.1)
  virtualenv (15.2.0)
  wheel (0.30.0)

The --contain option is very useful when you are working with containers using e.g. Python, R, or Ruby and don’t want any local packages or settings to override the default behavior of the container, as seen when running it with Docker.

Another option, -e or --cleanenv, ensures that environment variables set in your terminal are not passed into the container; this is another potential source of differing behavior between Docker and Singularity. To see the effect, compare the output of:

$ singularity exec docker://alpine env

…with…

$ singularity exec --cleanenv docker://alpine env
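
A quick way to see exactly which variables --cleanenv strips out is to diff the two listings (a small sketch):

$ singularity exec docker://alpine env | sort > host-env.txt
$ singularity exec --cleanenv docker://alpine env | sort > clean-env.txt
$ diff host-env.txt clean-env.txt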

Namespaces

Our last gotcha for Docker compatibility is also due to the differences in isolation between Singularity and Docker. When you start a Docker container it is isolated from the host via its own:

  • Mount namespace (container has own filesystem)
  • PID namespace (container cannot see host processes)
  • IPC namespace (container processes are separate from host processes for inter-process communication)
  • Network namespace (container does not directly share host networking)
  • UTS namespace (container has a separate hostname which it can manipulate)

By default, Singularity starts containers with a separate mount namespace only. This gives the container its own filesystem, but doesn’t isolate processes and networking in the same way as Docker. Again, this ‘blurring’ between the container and host is an important feature of Singularity. Unfortunately, some Docker containers are written in a way that assumes they will have more isolation than Singularity provides by default, and some strange side-effects can occur. This is usually seen when running complex containers providing services which are run by an init system or process supervisor inside the container.
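
A quick way to see this difference is the hostname: Docker gives the container its own UTS namespace (so it reports its own hostname, by default the container ID), while Singularity leaves you on the host’s. The names below are placeholders:

$ hostname
  my-desktop
$ singularity exec docker://alpine hostname    # same UTS namespace as the host
  my-desktop
$ docker run --rm alpine hostname              # Docker: separate UTS namespace
  0a1b2c3d4e5f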

One concrete case: several containers on Docker Hub use the s6 init system, configured so that it kills all visible processes when the container is stopped. Because Singularity doesn’t use a separate PID namespace by default, s6 will try to kill every process on the host that it has permission to kill. If you are running the container under an HPC job scheduler you might not notice; but running on your desktop or laptop you definitely will!

Let’s take a look at how processes are seen inside a Singularity container; I can see everything running on my desktop PC:

$ singularity exec alpine.simg ps
 PID USER TIME COMMAND
 1 root 3:06 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
 2 root 0:00 [kthreadd]
 3 root 0:04 [ksoftirqd/0]
 5 root 0:00 [kworker/0:0H]
 7 root 0:01 [migration/0]
 8 root 0:00 [rcu_bh]
 ...

Luckily, you can get a separate PID namespace for a Singularity container if needed, via the --pid option, and the container can then only see its own processes:

$ singularity exec -p alpine.simg ps
  PID USER TIME COMMAND
  1 dave 0:00 ps

Other namespaces can also be requested for other levels of isolation:

  -i | --ipc      Run container in a new IPC namespace
  -n | --net      Run container in a new network namespace (loopback is the only active network device)
  -p | --pid      Run container in a new PID namespace
  -u | --userns   Run container in a new user namespace (allows Singularity to run completely unprivileged on recent kernels, but doesn't support all features)

Note that the network namespace implementation in Singularity currently only gives access to a localhost network; your container will not be able to reach the outside world. In Singularity 3.x we will be working toward comprehensive options to set up container networking, using the excellent libraries available for Go.
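
You can confirm the loopback-only behavior by listing the network devices visible inside the container (output abbreviated and illustrative):

$ singularity exec --net docker://alpine ip addr
  1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 ...
      inet 127.0.0.1/8 scope host lo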

Putting It All Together

If you really need to get close to the default isolation used by Docker, then Singularity offers a shortcut option, -C or --containall, that combines the effects of --contain (see above) with a clean environment and separate PID & IPC namespaces.
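
For example (a sketch; substitute the image you are troubleshooting; note that the package list matches the --contain output shown earlier):

$ singularity exec --containall docker://python:2.7 pip list
  pip (9.0.3)
  setuptools (39.0.1)
  virtualenv (15.2.0)
  wheel (0.30.0)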

If you are troubleshooting strange behavior from a Docker container, and don’t need your home directory available, --containall is a good place to start. Do bear in mind that the blurred lines between host and container are one of the advantages of Singularity, and just using --containall each time you run a Docker container will likely make it harder to work with!

Thank you for reading through the details, and please come back to visit us for some more juicy morsels. You can also find us in our Slack channel, Google Groups, or on GitHub for more conversation and detail.
