NVidia CUDA inside a LXD container

Stéphane Graber

Stéphane Graber

on 28 March 2017

LXD logo

GPU inside a container

LXD supports GPU passthrough but this is implemented in a very different way than what you would expect from a virtual machine. With containers, rather than passing a raw PCI device and have the container deal with it (which it can’t), we instead have the host setup with all needed drivers and only pass the resulting device nodes to the container.

This post focuses on NVidia and the CUDA toolkit specifically, but LXD’s passthrough feature should work with all other GPUs too. NVidia is just what I happen to have around.

The test system used below is a virtual machine with two NVidia GT 730 cards attached to it. Those are very cheap, low performance GPUs, that have the advantage of existing in low-profile PCI cards that fit fine in one of my servers and don’t require extra power.
For production CUDA workloads, you’ll want something much better than this.

Host setup

Install the CUDA tools and drivers on the host:

wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo apt update
sudo apt install cuda

Then reboot the system to make sure everything is properly setup. After that, you should be able to confirm that your NVidia GPU is properly working with:

ubuntu@canonical-lxd:~$ nvidia-smi 
Tue Mar 21 21:28:34 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   26C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+

And can check that the CUDA tools work properly with:

ubuntu@canonical-lxd:~$ /usr/local/cuda-8.0/extras/demo_suite/bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GT 730
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3059.4

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3267.4

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			30805.1

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Container setup

First lets just create a regular Ubuntu 16.04 container:

ubuntu@canonical-lxd:~$ lxc launch ubuntu:16.04 cuda
Creating cuda
Starting cuda

Then install the CUDA demo tools in there:

lxc exec cuda -- wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
lxc exec cuda -- dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
lxc exec cuda -- apt update
lxc exec cuda -- apt install cuda-demo-suite-8-0 --no-install-recommends

At which point, you can run:

ubuntu@canonical-lxd:~$ lxc exec cuda -- nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Which is expected as LXD hasn’t been told to pass any GPU yet.

LXD GPU passthrough

LXD allows for pretty specific GPU passthrough, the details can be found here.
First let’s start with the most generic one, just allow access to all GPUs:

ubuntu@canonical-lxd:~$ lxc config device add cuda gpu gpu
Device gpu added to cuda
ubuntu@canonical-lxd:~$ lxc exec cuda -- nvidia-smi
Tue Mar 21 21:47:54 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   27C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove cuda gpu
Device gpu removed from cuda

Now just pass whichever is the first GPU:

ubuntu@canonical-lxd:~$ lxc config device add cuda gpu gpu id=0
Device gpu added to cuda
ubuntu@canonical-lxd:~$ lxc exec cuda -- nvidia-smi
Tue Mar 21 21:50:37 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove cuda gpu
Device gpu removed from cuda

You can also specify the GPU by vendorid and productid:

ubuntu@canonical-lxd:~$ lspci -nnn | grep NVIDIA
02:06.0 VGA compatible controller [0300]: NVIDIA Corporation GK208 [GeForce GT 730] [10de:1287] (rev a1)
02:07.0 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
02:08.0 VGA compatible controller [0300]: NVIDIA Corporation GK208 [GeForce GT 730] [10de:1287] (rev a1)
02:09.0 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
ubuntu@canonical-lxd:~$ lxc config device add cuda gpu gpu vendorid=10de productid=1287
Device gpu added to cuda
ubuntu@canonical-lxd:~$ lxc exec cuda -- nvidia-smi
Tue Mar 21 21:52:40 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   27C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove cuda gpu
Device gpu removed from cuda

Which adds them both as they are exactly the same model in my setup.

But for such cases, you can also select using the card’s PCI ID with:

ubuntu@canonical-lxd:~$ lxc config device add cuda gpu gpu pci=0000:02:08.0
Device gpu added to cuda
ubuntu@canonical-lxd:~$ lxc exec cuda -- nvidia-smi
Tue Mar 21 21:56:52 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   27C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove cuda gpu 
Device gpu removed from cuda

And lastly, lets confirm that we get the same result as on the host when running a CUDA workload:

ubuntu@canonical-lxd:~$ lxc config device add cuda gpu gpu
Device gpu added to cuda
ubuntu@canonical-lxd:~$ lxc exec cuda -- /usr/local/cuda-8.0/extras/demo_suite/bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GT 730
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3065.4

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3305.8

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			30825.7

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Conclusion

LXD makes it very easy to share one or multiple GPUs with your containers.
You can either dedicate specific GPUs to specific containers or just share them.

There is no of the overhead involved with usual PCI based passthrough and only a single instance of the driver is running with the containers acting just like normal host user processes would.

This does however require that your containers run a version of the CUDA tools which supports whatever version of the NVidia drivers is installed on the host.

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it
Original article

Ubuntu cloud

Ubuntu offers all the training, software infrastructure, tools, services and support you need for your public and private clouds.

Sign up for email updates

Choose the topics you're interested in

 

Related posts

LXD Clusters: A Primer

Since its inception, LXD has been striving to offer a fresh and intuitive user experience for machine containers. LXD instances can be managed over the network through a REST API and a single command line tool. For large scale LXD…

LXD weekly status #44

Introduction Another week of bugfixes for us as more and more people update to the 3.0 releases! Quite a bit of work went into improving the handling of the two database in LXD 3.0, making it easier for us to debug issues and provide fixes…

LXD weekly status #43

Introduction This week’s focus was on bugfixes with a good number of clustering related fixes and improvements as well as some tweaks and fixes to other recently added features. On the feature development front, the current focus is on…