Docker Container
Typical software stack
My Code
Tensorflow, PyTorch, Frameworks + Library Dependencies
Python
CPU ML libraries
Hardware Accelerator
AI accelerator ML libraries
AI accelerator drivers
OS
AI accelerator drivers (with matching versions)
OS Kernel
Host OS
Heterogeneous Hardware
Duplicating drivers = bloated VMs and containers
Hardware driver versions must match
Not portable (defeating the whole point of containers) and difficult to scale
Very brittle solution
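The version-matching constraint can be sketched as a quick host-side check. The `/proc/driver/nvidia/version` path and the `nvidia-smi` query are specific to the NVIDIA stack and only exist where that driver is installed, so they are shown as comments; the comparison helper itself is plain shell.

```shell
# versions_match: succeed only when two non-empty version strings are identical.
versions_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# On a host with the NVIDIA driver installed, feed it live values:
#   kernel_ver=$(sed -n 's/.*Kernel Module  *\([0-9.]*\).*/\1/p' /proc/driver/nvidia/version)
#   user_ver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
#   versions_match "$kernel_ver" "$user_ver" || echo "driver/library mismatch" >&2
versions_match "570.86.15" "570.86.15" && echo "match"   # prints "match"
```

A mismatch between the kernel module and the user-space library is exactly the brittleness described above: baking either half into the image breaks as soon as the host driver changes.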
runc/libcontainer/process_linux.go
func (p *initProcess) start() (retErr error) {
	ierr := parseSync(p.comm.syncSockParent, func(sync *syncT) error {
		switch sync.Type {
		case procHooks:
			if p.config.Config.HasHook(configs.Prestart, configs.CreateRuntime) {
				if err := hooks.Run(configs.Prestart, s); err != nil {
					return err
				}
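The runc excerpt shows where prestart/createRuntime hooks fire during container start. This is the mechanism the NVIDIA runtime relies on: it injects a prestart hook into the container's OCI spec and then delegates to runc, and the hook mounts the host driver files and device nodes into the container. A sketch of what such a hook entry looks like in the generated OCI `config.json` (the exact hook arguments vary by nvidia-container-toolkit version):

```json
{
  "hooks": {
    "prestart": [
      {
        "path": "/usr/bin/nvidia-container-runtime-hook",
        "args": ["nvidia-container-runtime-hook", "prestart"]
      }
    ]
  }
}
```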
/etc/docker/daemon.json
/etc/nvidia-container-runtime/config.toml
{
"runtimes" : {
"nvidia" : {
"args" : [],
"path" : " nvidia-container-runtime"
}
}
}
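Assuming the snippet above lives in `/etc/docker/daemon.json`, a small sanity check catches the most common mistakes (invalid JSON, stray whitespace in `path`) before restarting the daemon. `python3` is used here only as a portable JSON parser; the `systemctl`/`docker info` lines are commented because they require a live Docker host.

```shell
# check_daemon_json FILE: verify the nvidia runtime entry is well formed.
check_daemon_json() {
  python3 - "$1" <<'EOF'
import json, sys
cfg = json.load(open(sys.argv[1]))
rt = cfg["runtimes"]["nvidia"]
assert rt["path"] == rt["path"].strip(), "stray whitespace in path"
print("nvidia runtime ->", rt["path"])
EOF
}

# On a Docker host:
#   check_daemon_json /etc/docker/daemon.json \
#     && sudo systemctl restart docker \
#     && docker info | grep -A3 Runtimes
```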
docker run --rm -it --gpus all nvcr.io/nvidia/tritonserver:25.01-py3 bash
ls -Fl /dev | grep nvidia
crw-rw-rw- 1 root root 511, 0 Mar 3 03:09 nvidia-uvm
crw-rw-rw- 1 root root 511, 1 Mar 3 03:09 nvidia-uvm-tools
crw-rw-rw- 1 root root 195, 0 Mar 3 03:08 nvidia0
crw-rw-rw- 1 root root 195, 255 Mar 3 03:08 nvidiactl
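The listing shows character devices (the leading `c`) that the hook made available from the host: major 195 is NVIDIA's registered character-device major, while the `nvidia-uvm` major (511 here) is assigned dynamically. A minimal presence check (device paths assume the NVIDIA hook ran; the helper itself is plain shell):

```shell
# is_chardev PATH: succeed when PATH exists and is a character device.
is_chardev() { [ -c "$1" ]; }

# Inside a GPU container you would expect all of these to pass:
#   for d in /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm; do
#     is_chardev "$d" && echo "$d ok" || echo "$d missing" >&2
#   done
```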
nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15 Driver Version: 570.86.15 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3070 Off | 00000000:2B:00.0 On | N/A |
| 0% 50C P3 49W / 270W | 1256MiB / 8192MiB | 21% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+