Why Containers Are Important in HPC
HPC stands for “high performance computing”. When you think about HPC, you might imagine a computer the size of a room. Although a supercomputer may look similar to a large virtualization deployment of the kind that is prevalent in cloud computing, the goals are actually quite different. So, what makes HPC different, and when is it useful?
To answer this, it’s useful to look at a practical example: weather forecasting.
Forecasting the weather involves processing and analyzing large amounts of data, with the ultimate goal of accurately predicting tomorrow’s weather. While it might be possible to predict the weather using a single computer, there is so much data to process that doing so may take days, or even weeks with the fastest computer we can find today. Obviously, it’s not that useful if forecasting tomorrow’s weather takes five days, because the whole point is to be able to plan for tomorrow. If we can’t find a faster computer, how can we forecast the weather fast enough? The obvious answer is to use more than one computer.
And this is really what HPC is about. When our problem can’t be solved by one computer, we can use multiple computers to solve the problem. At the end of the day, a supercomputer is really just a cluster of fast computers that are connected in a way that allows individual computers in the cluster to communicate efficiently with each other.
So, what’s the difference between virtualization and HPC? Usually, with virtualization we’re trying to take one computer, and make it look like five computers so that multiple people can use it. With HPC, we’re doing the opposite. We’re taking multiple computers, and using them together to solve a single problem.
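To make the "many computers, one problem" idea concrete, here is a minimal Python sketch. It splits a list of temperature readings into chunks, hands each chunk to a worker, and combines the partial results. The function names and the readings are made up for illustration, and a thread pool stands in for the nodes of a cluster; on a real supercomputer the workers would be separate machines, typically coordinated with something like MPI.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(cells, n_workers):
    """Divide cells into n_workers contiguous, nearly equal chunks."""
    base, extra = divmod(len(cells), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        # The first `extra` chunks get one additional element.
        size = base + (1 if i < extra else 0)
        chunks.append(cells[start:start + size])
        start += size
    return chunks


def process_chunk(chunk):
    """Hypothetical stand-in for real work: sum the readings in one chunk."""
    return sum(chunk)


def mean_temperature(cells, n_workers=4):
    """Compute the mean by combining partial sums from each worker."""
    chunks = split_into_chunks(cells, n_workers)
    # Threads are only a stand-in here for the nodes of a cluster.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(process_chunk, chunks))
    return sum(partial_sums) / len(cells)


if __name__ == "__main__":
    readings = [float(t) for t in range(10)]  # made-up data
    print(mean_temperature(readings, n_workers=3))
```

Each worker only ever sees its own chunk, and the final answer is assembled from the partial results. That split-then-combine shape is the part that carries over to real HPC workloads.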
Now that we know what HPC is, let’s dive into how to build software that runs on supercomputers.
While it’s possible to build and run software directly on the nodes of a supercomputer, it can often be easier to use something called software containers. By packaging up our application, along with all of its dependencies, into a container, the application becomes more portable. One of the most common challenges when writing software is that the libraries our code uses can behave differently from one version to the next. By using a container, we ensure the versions of the dependencies our code runs with are the same regardless of whether we are using a laptop or a supercomputer. This makes it much easier to get work done, and get the weather forecast out in time!
You may have heard of Docker, a container runtime that is quite popular. Docker is awesome, but unfortunately its default security model isn’t compatible with the requirements of most supercomputing environments. Fortunately, there is a container runtime that was purpose-built for supercomputing by the name of Singularity. And there’s even a free, open-source version of it called SingularityCE (Singularity Community Edition).
What makes Singularity different from Docker and other container runtimes? Besides having a security model that is compatible with supercomputing environments, it also uses a unique image format called the Singularity Image Format, or SIF for short. SIF was purpose-built by Sylabs to perform well on supercomputers and to offer some unique security features, which we’ll briefly mention later.
SIF images are single files. This is quite different from the way containers work in other container runtimes like Docker, but it’s an important part of what makes Singularity perform well at scale. Supercomputers often have high performance, parallel storage systems that each node in a cluster can access. When we need to start a single container on a hundred, a thousand, or even a hundred thousand compute nodes, just getting to the point where the workload is ready to run can take a lot of time. If we were to point a thousand instances of Docker at a single container registry, the registry would become the bottleneck. By packaging up our container in a SIF file, and placing it on parallel storage, we can dramatically reduce startup time, and start our processing faster.
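A rough back-of-the-envelope model shows why the registry becomes the bottleneck. The Python sketch below is deliberately simplistic, and every number in it is a made-up assumption rather than a measurement: it only compares one registry with a fixed amount of bandwidth against a parallel file system whose aggregate bandwidth is shared by the nodes. It ignores caching, layer unpacking, and everything else a real system does.

```python
def registry_startup_seconds(nodes, image_gb, registry_gbps):
    """Every node pulls its own copy through one registry.

    The registry's bandwidth (in gigabits per second) is shared by
    all nodes, so the total transfer time grows with the node count.
    """
    total_gigabits = nodes * image_gb * 8
    return total_gigabits / registry_gbps


def parallel_fs_startup_seconds(nodes, image_gb, fs_aggregate_gbps, node_link_gbps):
    """Every node reads the same SIF file from parallel storage.

    Each node is limited either by its own network link or by its
    fair share of the file system's aggregate bandwidth.
    """
    per_node_gbps = min(node_link_gbps, fs_aggregate_gbps / nodes)
    return image_gb * 8 / per_node_gbps


if __name__ == "__main__":
    # All figures below are hypothetical, chosen only for illustration.
    nodes, image_gb = 1000, 2
    print(registry_startup_seconds(nodes, image_gb, registry_gbps=10))
    print(parallel_fs_startup_seconds(nodes, image_gb,
                                      fs_aggregate_gbps=1000,
                                      node_link_gbps=25))
```

The exact figures don't matter. The point is that the first function scales with the number of nodes divided by a single server's bandwidth, while the second depends on the aggregate bandwidth that parallel storage is designed to provide.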
SIF also has some unique security properties. Images can be encrypted, and then decrypted on the fly by the Linux kernel while the container runs. SIF also allows security artifacts such as digital signatures and software bills of materials (SBOMs) to be stored directly within the SIF image. If we’re running containers on an air-gapped system, or we don’t have access to the relevant registry when we want to work with these security artifacts, we can still do so easily and efficiently.
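As a small illustration of working with signatures, the sketch below builds the `singularity sign` and `singularity verify` commands from Python. The file name `forecast.sif` is hypothetical, the helper names are ours, and actually running the commands assumes Singularity is installed and a signing key has already been set up. The `is_verified` helper also assumes the CLI exits with a non-zero status when verification fails.

```python
import subprocess


def sign_command(sif_path):
    """Command that adds a digital signature to a SIF image."""
    return ["singularity", "sign", sif_path]


def verify_command(sif_path):
    """Command that checks the signatures stored inside a SIF image."""
    return ["singularity", "verify", sif_path]


def is_verified(sif_path):
    """Run the verify command; assumes a non-zero exit means failure."""
    result = subprocess.run(verify_command(sif_path))
    return result.returncode == 0


if __name__ == "__main__":
    # Only print the commands; nothing is executed here.
    image = "forecast.sif"  # hypothetical image name
    print(" ".join(sign_command(image)))
    print(" ".join(verify_command(image)))
```

Because the signature travels inside the SIF file itself, the verify step needs only the image (and access to the relevant public key), not a connection back to the registry the image came from.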
At this point, you might be thinking “Well this all sounds promising, but I don’t have time to learn a new tool.” That’s totally fair! And that’s why the Singularity community has made it easy to get started with Singularity. Singularity is compatible with registries such as Docker Hub, which means that getting started can be as easy as executing your existing image with a single Singularity command. Sylabs also hosts a cloud service at cloud.sylabs.io which makes it easy to build, share and discover Singularity containers.
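Here is roughly what that single command looks like, wrapped in a small Python sketch. The helper names are ours and the `alpine` image is just an example. The `docker://` prefix tells Singularity to fetch the image from a Docker-style registry, and `singularity pull` saves it as a SIF file that could then be placed on parallel storage. The `run` helper assumes Singularity is installed and is not called by default.

```python
import subprocess


def docker_uri(image, tag="latest"):
    """Build a docker:// URI that Singularity can fetch from a registry."""
    return f"docker://{image}:{tag}"


def exec_command(image, command, tag="latest"):
    """Command that runs `command` inside an existing Docker Hub image."""
    return ["singularity", "exec", docker_uri(image, tag), *command]


def pull_command(sif_path, image, tag="latest"):
    """Command that converts a Docker Hub image into a single SIF file."""
    return ["singularity", "pull", sif_path, docker_uri(image, tag)]


def run(cmd):
    """Execute a command; requires Singularity to be installed."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands; call run(...) to actually execute them.
    print(" ".join(exec_command("alpine", ["cat", "/etc/os-release"])))
    print(" ".join(pull_command("alpine.sif", "alpine")))
```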
Hopefully this has been a useful overview of HPC, and how software containers are used in supercomputing environments. If you have any questions or comments about HPC, feel free to reach out via firstname.lastname@example.org.
Thanks, and see you next time!