Specify additional steps to utilize GPU for Linux users #2299

Merged
merged 15 commits into from
Sep 5, 2024
151 changes: 149 additions & 2 deletions site/en/install/pip.md
@@ -193,7 +193,55 @@ The following NVIDIA® software are only required for GPU support.
nvidia-smi
```

### 3. Install TensorFlow
### 3. Install Miniconda

You can skip this section if you have already installed `Miniconda` (referred to as *option #1* in the following steps) or if you prefer to use Python’s built-in `venv` module (referred to as *option #2* in the following steps) instead.

[Miniconda](https://docs.conda.io/en/latest/miniconda.html){:.external}
is the recommended approach for installing TensorFlow with GPU support.
It creates a separate environment to avoid changing any installed
software on your system. This is also the easiest way to install the
required software, especially for the GPU setup.

Follow the instructions in the conda user guide to install Miniconda:
[Miniconda Installation Guide](https://conda.io/projects/conda/en/latest/user-guide/install/linux.html){:.external}.

### 4. Create a virtual environment

* ***Option #1: Miniconda***

Create a new conda environment named `tf` with the following command.

```bash
conda create --name tf python=3.11
```
You can activate and deactivate it with the following commands.

```bash
conda activate tf
conda deactivate
```

* ***Option #2: venv***

The [venv](https://docs.python.org/3/library/venv.html){:.external} module supports creating lightweight “virtual environments”, each with their own independent set of Python packages installed in their site directories.

Navigate to your desired virtual environments directory and create a new venv environment named `tf` with the following command.

```bash
python3 -m venv tf
```

You can activate and deactivate it with the following commands.

```bash
source tf/bin/activate
deactivate
```

Can you remove deactivate?

Contributor Author

> Can you remove deactivate?

@learning-to-play removed deactivate as advised. Furthermore, I could remove the instruction to create a symlink to ptxas, since it is ultimately not needed for TensorFlow version 2.17.0.rc0 but only for TensorFlow version 2.16.1. Awaiting your comments.

I want to make sure that I understand the situation correctly. Which of the following two situations is correct?

Contributor Author

@sgkouzias sgkouzias Jun 19, 2024

@learning-to-play the only difference is that on version 2.17.0.rc0 you need to create the symlinks to NVIDIA libs in order to utilize GPUs, while on version 2.16.1 you should, in addition to creating symlinks to NVIDIA libs, create a symlink to ptxas as well. Consequently, the command `pip install tensorflow[and-cuda]` alone fails to work with GPUs on both versions.

Make sure that the virtual environment is activated for the rest of the installation.
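
One way to confirm that an environment is active before continuing is to inspect the variables each tool sets on activation. The following is a minimal sketch, assuming that `CONDA_PREFIX` (conda) and `VIRTUAL_ENV` (venv) are set only while an environment is active; the function name `active_env_kind` is hypothetical and not part of either tool. Note that conda's `base` environment also sets `CONDA_PREFIX`, so a result of `conda` does not by itself prove that the `tf` environment is the active one.

```bash
# Hypothetical helper: report which kind of environment is active, based on
# the variables that venv and conda set on activation.
active_env_kind() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "venv"
  elif [ -n "${CONDA_PREFIX:-}" ]; then
    echo "conda"
  else
    echo "none"
  fi
}

active_env_kind
```

If this prints `none`, activate the environment again before moving on to the next step.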

### 5. Install TensorFlow

TensorFlow requires a recent version of pip, so upgrade your pip
installation to be sure you're running the latest version.
@@ -211,7 +259,106 @@ The following NVIDIA® software are only required for GPU support.
pip install tensorflow
```

### 4. Verify the installation
Note: Do not install TensorFlow with `conda`. It may not have the latest stable version. `pip` is recommended since TensorFlow is only officially released to PyPI.
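
A common pitfall at this step is running a `pip` that belongs to the system Python rather than to the active environment. The sketch below is one possible check, not part of the official instructions: the helper name `pip_is_from_env` is hypothetical, and it assumes that the environment directory is available in `VIRTUAL_ENV` or `CONDA_PREFIX`.

```bash
# Hypothetical helper: succeed only when the `pip` found first on PATH
# lives inside the given environment directory.
pip_is_from_env() {
  env_dir="$1"
  [ -n "$env_dir" ] || return 1
  pip_path=$(command -v pip) || return 1
  case "$pip_path" in
    "$env_dir"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: one of these variables points at the active environment.
current_env="${VIRTUAL_ENV:-${CONDA_PREFIX:-}}"
if pip_is_from_env "$current_env"; then
  echo "pip resolves inside $current_env"
else
  echo "pip does not resolve inside an active environment"
fi
```

If the second message appears, activate the environment before installing so that the package ends up in the environment rather than in the system Python.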

### 6. Set environment variables

You can skip this section if you only run TensorFlow on the CPU.

* ***Option #1: Miniconda***

Locate the directory of the conda environment by running the following in your terminal:
`echo $CONDA_PREFIX`

Enter that directory and create these subdirectories and files:

```bash
cd $CONDA_PREFIX
mkdir -p ./etc/conda/activate.d
mkdir -p ./etc/conda/deactivate.d
touch ./etc/conda/activate.d/env_vars.sh
touch ./etc/conda/deactivate.d/env_vars.sh
```
Edit `./etc/conda/activate.d/env_vars.sh` as follows:

```bash
#!/bin/sh

# Store original LD_LIBRARY_PATH
export ORIGINAL_LD_LIBRARY_PATH="${LD_LIBRARY_PATH}"

# Get the CUDNN directory
CUDNN_DIR=$(dirname $(dirname $(python -c "import nvidia.cudnn; print(nvidia.cudnn.__file__)")))

# Set LD_LIBRARY_PATH to include CUDNN directory
# (find already emits a trailing ":" after each path, so no extra separator is added)
export LD_LIBRARY_PATH=$(find ${CUDNN_DIR}/*/lib/ -type d -printf "%p:")${LD_LIBRARY_PATH}

# Get the ptxas directory
PTXAS_DIR=$(dirname $(dirname $(python -c "import nvidia.cuda_nvcc; print(nvidia.cuda_nvcc.__file__)")))

# Set PATH to include the directory containing ptxas
# (again, find supplies the ":" separator, which avoids an empty "::" entry)
export PATH=$(find ${PTXAS_DIR}/*/bin/ -type d -printf "%p:")${PATH}
```
Edit `./etc/conda/deactivate.d/env_vars.sh` as follows:

```bash
#!/bin/sh

# Restore original LD_LIBRARY_PATH
export LD_LIBRARY_PATH="${ORIGINAL_LD_LIBRARY_PATH}"

# Unset environment variables
unset CUDNN_DIR
unset PTXAS_DIR
```
* ***Option #2: venv***

Locate the directory of the venv environment by running the following in your terminal:
`echo $VIRTUAL_ENV`

Enter that directory and add the following lines at the end of the activate script `./bin/activate`:

```bash
# Store original LD_LIBRARY_PATH
export ORIGINAL_LD_LIBRARY_PATH=$LD_LIBRARY_PATH

# Get the CUDNN directory
CUDNN_DIR=$(dirname $(dirname $(python -c "import nvidia.cudnn; print(nvidia.cudnn.__file__)")))

# Set LD_LIBRARY_PATH to include CUDNN directory
# (find already emits a trailing ":" after each path, so no extra separator is added)
export LD_LIBRARY_PATH=$(find ${CUDNN_DIR}/*/lib/ -type d -printf "%p:")${LD_LIBRARY_PATH}

# Get the ptxas directory
PTXAS_DIR=$(dirname $(dirname $(python -c "import nvidia.cuda_nvcc; print(nvidia.cuda_nvcc.__file__)")))

# Set PATH to include the directory containing ptxas
# (again, find supplies the ":" separator, which avoids an empty "::" entry)
export PATH=$(find ${PTXAS_DIR}/*/bin/ -type d -printf "%p:")${PATH}
```

Add the following lines at the end of the `deactivate` function in the activate script so that the NVIDIA environment variables are set only while the virtual environment is active:

```bash
deactivate () {
    # ...
    # Remove the ptxas directory from PATH if it was added on activation
    if [ -n "${PTXAS_DIR:-}" ]; then
        # [^:]* matches the package subdirectory; a shell glob (*) does not
        # work inside a sed pattern
        PATH=$(echo "$PATH" | sed -e "s|${PTXAS_DIR}/[^:]*/bin/:||g")
    fi

    # Restore original LD_LIBRARY_PATH (also when the saved value was empty)
    if [ -n "${ORIGINAL_LD_LIBRARY_PATH+set}" ]; then
        export LD_LIBRARY_PATH=$ORIGINAL_LD_LIBRARY_PATH
        unset ORIGINAL_LD_LIBRARY_PATH
    fi

    # Unset environment variables
    unset CUDNN_DIR
    unset PTXAS_DIR
}
```
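
The scripts above follow a save, prepend, restore pattern for path-like variables. The sketch below illustrates that pattern in isolation, under stated assumptions: the helper names `join_lib_dirs` and `prepend_paths` and the variable `DEMO_LIBRARY_PATH` are hypothetical, a temporary directory stands in for the `nvidia` package directory, and a shell glob is used in place of `find -printf` so that the example does not depend on GNU findutils.

```bash
# Hypothetical helper: join every existing "$1"/*/lib directory into a
# colon-separated list with no leading or trailing colon.
join_lib_dirs() {
  out=""
  for d in "$1"/*/lib; do
    [ -d "$d" ] || continue
    out="${out:+$out:}$d"
  done
  echo "$out"
}

# Hypothetical helper: prepend list $1 to value $2, inserting a colon only
# when both are non-empty, so no empty entry is created.
prepend_paths() {
  if [ -n "$1" ] && [ -n "$2" ]; then
    echo "$1:$2"
  else
    echo "$1$2"
  fi
}

# Demo: a throwaway directory stands in for site-packages/nvidia.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/cudnn/lib" "$demo_root/cublas/lib"

# Save, prepend, then restore a demo variable (not the real LD_LIBRARY_PATH).
DEMO_LIBRARY_PATH="/usr/local/lib"
saved="$DEMO_LIBRARY_PATH"
DEMO_LIBRARY_PATH=$(prepend_paths "$(join_lib_dirs "$demo_root")" "$saved")
echo "$DEMO_LIBRARY_PATH"
DEMO_LIBRARY_PATH="$saved"

rm -rf "$demo_root"
```

Avoiding empty entries matters because an empty element in `PATH` or `LD_LIBRARY_PATH` is interpreted as the current directory.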

### 7. Verify the installation

Verify the CPU setup:
