Commit

Bump dcgm exporter version to correctly capture GPU utilization
Verdi March committed Mar 14, 2024
1 parent 946c45f commit 5a331d9
Showing 1 changed file with 3 additions and 3 deletions.
@@ -4,18 +4,18 @@
if nvidia-smi; then
echo "NVIDIA GPU found. Proceeding with script..."
# Set DCGM Exporter version
-DCGM_EXPORTER_VERSION=2.1.4-2.3.1
+DCGM_EXPORTER_VERSION=3.3.5-3.4.0-ubuntu22.04

# Run the DCGM Exporter Docker container
sudo docker run -d --rm \
--gpus all \
--net host \
--cap-add SYS_ADMIN \
-nvcr.io/nvidia/k8s/dcgm-exporter:${DCGM_EXPORTER_VERSION}-ubuntu20.04 \
+nvcr.io/nvidia/k8s/dcgm-exporter:${DCGM_EXPORTER_VERSION} \
-f /etc/dcgm-exporter/dcp-metrics-included.csv || { echo "Failed to run DCGM Exporter Docker container"; exit 1; }

echo "Running DCGM exporter in a Docker container on port 9400..."
else
echo "NVIDIA GPU not found. DCGM Exporter was not installed. If this is a controller node, you can safely ignore this warning. Exiting gracefully..."
exit 0
-fi
+fi
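
Since the point of this bump is to capture GPU utilization correctly, a quick check after the container starts may be useful. The following is a minimal sketch, not part of the commit: the helper name `gpu_util_from_metrics` is hypothetical, and it assumes the exporter serves Prometheus-format text at `/metrics` on port 9400 and that the default counters file exposes a `DCGM_FI_DEV_GPU_UTIL` gauge with a `gpu` label and no trailing timestamp.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the original script): read Prometheus-format
# metrics text on stdin and print one "gpu=<index> util=<value>" line per
# DCGM_FI_DEV_GPU_UTIL sample.
gpu_util_from_metrics() {
  # Lines starting with "#" (HELP/TYPE) never match the pattern, so they are skipped.
  # The "[{ ]" guard avoids matching longer metric names with the same prefix.
  awk '/^DCGM_FI_DEV_GPU_UTIL[{ ]/ {
         idx = "?"
         # Pull the value out of the gpu="..." label, if present.
         if (match($0, /gpu="[^"]*"/)) idx = substr($0, RSTART + 5, RLENGTH - 6)
         # Assumes the sample value is the last whitespace-separated field.
         print "gpu=" idx " util=" $NF
       }'
}

# Usage against a live exporter (assumes port 9400 and that curl is installed):
#   curl -s localhost:9400/metrics | gpu_util_from_metrics
```

If the function prints nothing while a workload is running on the GPU, the counters file passed with `-f` may not include the utilization metric, or the exporter may not have started.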