2 posts tagged with "cuda"

4 min read
German P. Barletta

In the previous post we reviewed a simple Singularity definition file for a container that gave support for CUDA development.

We'll now see an example that uses the CUDA toolkit runtime and a conda environment or, more precisely, mambaforge, which includes mamba as a conda replacement and conda-forge as the default channel.

mamba works almost identically to conda and it's an order of magnitude faster. I think many of us would've dropped conda as a package and environment manager if it weren't for mamba's solver.

Writing the definition file

This is the Singularity definition file I used to containerize the locuaz optimization protocol. We'll skip the details about locuaz; suffice it to say that it's an antibody optimization protocol that carries a lot of dependencies. Some of them cannot be installed through pip and others cannot be installed through conda, so we end up using both, which is not ideal. This is precisely why I think containerizing it is a good challenge.

Let's check the .def definition file:

Bootstrap: docker

%post
    export LC_ALL=C
    export DEBIAN_FRONTEND=noninteractive
    apt update
    apt install -y wget libopenmpi-dev
    mkdir /opt/concept
    cd /opt/concept
    mv ../usr_deps.yaml ./
    bash -p /opt/mambaforge -b
    . /opt/mambaforge/bin/activate
    mamba env create -f usr_deps.yaml
    conda activate concept
    pip install locuaz --root-user-action=ignore

%environment
    export LC_ALL=C
    source /opt/mambaforge/bin/activate /opt/mambaforge/envs/concept

%files
    usr_deps.yaml opt/

Some explanations about the non-obvious lines:

  1. We install wget to download the mambaforge installer script and libopenmpi-dev since locuaz uses MPI to launch multiple GROMACS MD runs.
  2. We run the mambaforge installer with the -b flag to skip the license agreement question and -p to specify the install dir.
  3. We run . /opt/mambaforge/bin/activate to activate conda and then use the mamba executable to build the environment. The yaml file was included with the container.
  4. After creating and activating the environment, we install locuaz with --root-user-action=ignore, a flag that was added to pip for the specific case of containerized builds, where we usually are the root user. It silences the pip warning about installing with root privileges.

Now, the major pain point when including mambaforge is the activation of the environment.

You can't run source /root/.bashrc after installing mambaforge, since source is not available during the execution of %post. You can't run conda init or mamba init either, since they'll ask you to restart your shell.

The solution is to run the activation script in a way any UNIX system should support, that is, using the syntax . script. Then, in %environment, we get a full bash interpreter, so we can source the activation script and point it to the folder where our environment resides: source /opt/mambaforge/bin/activate /opt/mambaforge/envs/<your_environment>.
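
To see why the dot form is the safe choice, here is a minimal sketch that can be run in any POSIX shell, outside of a container. The temporary script and the DEMO_ENV variable are hypothetical stand-ins for /opt/mambaforge/bin/activate and whatever it exports; nothing here touches a real conda installation.

```shell
# Hypothetical stand-in for /opt/mambaforge/bin/activate:
# a script whose only job is to change the caller's environment.
activate_script="$(mktemp)"
printf 'export DEMO_ENV=concept\n' > "$activate_script"

# POSIX "dot" command: run the file in the *current* shell,
# so the exported variable is still set after it returns.
. "$activate_script"

echo "active env: $DEMO_ENV"   # prints: active env: concept
rm -f "$activate_script"
```

Both spellings execute the file in the current shell, which is what lets an activation script modify our environment. The dot is the one specified by POSIX, while source is an extension offered by bash and some other shells.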

Finally, we build it:

sudo singularity build locuaz.sif locuaz.def

Actually running it

There's another obstacle when running from a container: the binding of host directories. Singularity binds some host dirs by default, but if your containerized workflow needs additional access, you'll need to add them with the --bind flag.

In our case, it's GROMACS that needs many additional locations. This is how the singularity command ends up looking on my machine:

singularity exec --nv --bind /usr/local/gromacs,/lib/x86_64-linux-gnu,/usr/local/cuda-12.2/lib64,/etc/alternatives locuaz.sif locuaz daux/config_ligand.yaml 

Notice the --nv flag, needed to run with GPU support, and how we include a plethora of comma-separated directories after the --bind flag. After locuaz.sif, our actual container, we call the locuaz program with a configuration file as its argument.

The /usr/local/gromacs directory is where the GROMACS installation resides. The rest of the directories are the locations of the libraries that the gmx binary links to. These will be specific to each machine, but you'll easily find them by checking which libraries the gmx binary links to.

For example, on my machine:

ldd `which gmx`
    (0x00007ffc8817a000)
    => /usr/local/gromacs/lib/ (0x00007f74c0200000)
    => /lib/x86_64-linux-gnu/ (0x00007f74bfe00000)
    => /lib/x86_64-linux-gnu/ (0x00007f74c5417000)
    => /lib/x86_64-linux-gnu/ (0x00007f74bfa00000)
    => /usr/local/cuda-12.2/lib64/ (0x00007f74b4c00000)
    => /lib/x86_64-linux-gnu/ (0x00007f74b27b0000)
    => /lib/x86_64-linux-gnu/ (0x00007f74c532e000)
    => /lib/x86_64-linux-gnu/ (0x00007f74c52da000)
    => /usr/local/gromacs/lib/ (0x00007f74c017d000)
    /lib64/ (0x00007f74c5471000)
    => /lib/x86_64-linux-gnu/ (0x00007f74c52d5000)
    => /lib/x86_64-linux-gnu/ (0x00007f74c52d0000)
    => /lib/x86_64-linux-gnu/ (0x00007f74c0178000)
    => /lib/x86_64-linux-gnu/ (0x00007f74b2400000)
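
Collecting those directories by hand gets tedious, so here is a rough sketch that reduces ldd-style output to the comma-separated list that --bind expects. lib_dirs_from_ldd is a hypothetical helper name, and it assumes the usual "name => /path/to/lib (address)" layout that ldd prints on Linux. Entries without a resolved path are ignored.

```shell
# Read ldd-style output on stdin and print the unique directories that
# hold the resolved libraries, joined by commas, in order of appearance.
lib_dirs_from_ldd() {
    awk '$2 == "=>" && $3 ~ /^\// {
             dir = $3
             sub("/[^/]*$", "", dir)      # drop the file name, keep the directory
             if (dir != "" && !(dir in seen)) {
                 seen[dir] = 1
                 out = out (out ? "," : "") dir
             }
         }
         END { print out }'
}

# Intended usage on a host where gmx is installed (not executed here):
#   ldd "$(command -v gmx)" | lib_dirs_from_ldd
```

The result is only a starting point: the command above also binds the whole /usr/local/gromacs tree and /etc/alternatives, which this helper won't suggest.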

If you wish to make the run line shorter, you can store these locations in the dedicated environment variable $SINGULARITY_BIND:

export SINGULARITY_BIND="/usr/local/gromacs,/lib/x86_64-linux-gnu,/usr/local/cuda-12.2/lib64,/etc/alternatives/"

And then just run:

singularity exec --nv locuaz.sif locuaz daux/config_ligand.yaml
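
Since these locations change from machine to machine, I'd rather not hand Singularity a bind source that doesn't exist on the host. The following is a minimal sketch that builds the variable only from directories that are actually present. join_existing_dirs is a hypothetical helper, and the paths are the ones from my machine, so treat them as placeholders.

```shell
# Join the directories given as arguments with commas,
# skipping (and reporting on stderr) any that do not exist on this host.
join_existing_dirs() {
    joined=""
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            joined="${joined:+$joined,}$dir"
        else
            echo "skipping missing directory: $dir" >&2
        fi
    done
    printf '%s\n' "$joined"
}

# Paths taken from the example above; adjust them to your own machine.
SINGULARITY_BIND="$(join_existing_dirs /usr/local/gromacs /lib/x86_64-linux-gnu \
    /usr/local/cuda-12.2/lib64 /etc/alternatives)"
export SINGULARITY_BIND
echo "SINGULARITY_BIND=$SINGULARITY_BIND"
```

A skipped directory is worth a second look: it usually means the library lives somewhere else on that host, and ldd will tell you where.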

This does make everything a bit uglier for the user and I'd argue that creating a conda environment and running pip is not that much harder, but hey, some people love containers.

Hopefully in the future I'll get to benchmark the containerized version of the protocol, but I don't expect significant slowdowns.

2 min read
German P. Barletta

Here's a simple starting file based on a CUDA 11 Docker image for Ubuntu 22.04. It includes the file from cuda-samples/Samples/0_Introduction/vectorAdd and some header files from cuda-samples/Common. These sample files used to come with the CUDA toolkit, but now they have to be downloaded from a repo.

Bootstrap: docker

%post
    export DEBIAN_FRONTEND=noninteractive
    apt update
    cd /opt
    nvcc -ccbin g++ -m64 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -o local_vectorAdd -I./

%environment
    export LC_ALL=C

%files
    helper_cuda.h /opt/
    helper_string.h /opt/ /opt/

Some explanations about the non-obvious lines:

  1. export DEBIAN_FRONTEND=noninteractive is an almost mandatory line in all containers. It prevents the system from expecting user input, which in our case would halt the container build.
  2. export LC_ALL=C: so Perl doesn't complain about localization if we launch the container as a shell.
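  3. We're asking nvcc to generate SASS (native GPU code) for every architecture listed, from compute capability 5.0 up to 8.6. Since each -gencode option pairs a virtual architecture with code=sm_XX, the PTX is only an intermediate step and is not embedded in the binary.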
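
The long run of -gencode options follows a regular pattern, one pair per compute capability. As a small sketch of that pattern, the hypothetical helper below (gencode_flags is my own name, not an nvcc feature) rebuilds the same options from a list of capabilities, which can make the nvcc line easier to maintain.

```shell
# Print one "-gencode arch=compute_XX,code=sm_XX" option per
# compute capability passed as an argument, separated by spaces.
gencode_flags() {
    flags=""
    for sm in "$@"; do
        flags="${flags:+$flags }-gencode arch=compute_${sm},code=sm_${sm}"
    done
    printf '%s\n' "$flags"
}

# The architectures used in the definition file above.
gencode_flags 50 52 60 61 70 75 80 86
```

Inside %post, the output could be spliced into the nvcc call through an unquoted command substitution, so that the shell splits it back into separate arguments.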

After saving this into the definition file singu.def and building it on my machine:

$ sudo singularity build singu.sif singu.def

I uploaded it onto Leonardo, which is a Red Hat 8.6 system, and ran it:

(base) [pbarlett@lrdn3433 ~]$ singularity run --nv singu.sif 
INFO: Converting SIF file to temporary sandbox...
WARNING: underlay of /etc/localtime required more than 50 (76) bind mounts
WARNING: underlay of /usr/bin/nvidia-smi required more than 50 (387) bind mounts
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
INFO: Cleaning up image...

The GLIBC versions on my computer (2.35) and on Leonardo (2.28) also differ, so we can only hope we don't run into any issues later on.
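
If you want to make that comparison explicit before shipping an image, a small check like the following could help. It's only a sketch: version_older_than is a hypothetical helper that understands nothing beyond major.minor versions, and the two numbers are the ones quoted above, hard-coded rather than detected.

```shell
# Succeed when version $1 is older than version $2 (both "major.minor").
version_older_than() {
    a_major=${1%%.*}; a_minor=${1#*.}
    b_major=${2%%.*}; b_minor=${2#*.}
    [ "$a_major" -lt "$b_major" ] ||
        { [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -lt "$b_minor" ]; }
}

host_glibc=2.28    # Leonardo, as quoted in the post
build_glibc=2.35   # my computer, as quoted in the post

if version_older_than "$host_glibc" "$build_glibc"; then
    echo "host glibc ($host_glibc) is older than the build machine's ($build_glibc)"
fi
```

On a glibc-based system, getconf GNU_LIBC_VERSION reports the installed version, so the hard-coded values can be replaced with whatever it prints on each machine.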