Installing TensorFlow with GPU support using Conda

The easiest way to install TensorFlow with GPU support is to use Conda. This yields a Python environment, located in your $WORK directory, with a GPU-enabled version of the TensorFlow package.

First, open an interactive shell on a node with a GPU. The following command requests one GPU on the 2080-galvani partition and starts a bash session there:

srun --partition=2080-galvani --gres=gpu:1 --pty bash

You are now in a shell on a compute node. Run the following three commands there. The first creates a new Conda environment with Python 3.11 (the most recent version that supports TensorFlow with GPU). The second activates that environment so that you can run commands within it. The third installs the TensorFlow package with CUDA/GPU support.

conda create -p $WORK/.conda/py-311-tf python=3.11
conda activate $WORK/.conda/py-311-tf
python3 -m pip install 'tensorflow[and-cuda]'

If you need other Python packages in your TensorFlow environment, you can install them with python3 -m pip install $PACKAGE_NAME. To find out which packages are available, search the PyPI website.

To check that the installation worked, run the following two commands in the same terminal, right after the three commands above.

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
# Should print some error messages about factories and TensorRT, then
# [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
# Same or similar messages and then:
# [...] Created device /job:localhost/replica:0/task:0/device:GPU:0 [...]
# tf.Tensor([...])
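
If you would rather keep a reusable check than retype the one-liners, the two commands above can be combined into a small script. This is only a minimal sketch: the file name check_gpu.py and the function name visible_gpus are made up for illustration, and the script uses nothing beyond the TensorFlow calls already shown above. It returns an empty list when TensorFlow is not installed in the active environment, so it can also be run outside the environment without crashing.

```python
# check_gpu.py (hypothetical file name, chosen for this example)
import importlib.util


def visible_gpus():
    """Return the names of the GPUs TensorFlow can see.

    Returns an empty list if TensorFlow is not installed in the
    active environment or if no GPU is visible.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return []
    # Imported lazily so that a missing package is reported softly above.
    import tensorflow as tf

    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus:
        import tensorflow as tf

        print("GPUs visible to TensorFlow:", gpus)
        # Same small computation as the second one-liner above.
        print(tf.reduce_sum(tf.random.normal([1000, 1000])))
    else:
        print("No GPU visible. Is the environment active, and are you on a GPU node?")
```

With the environment active on a GPU node, python3 check_gpu.py should list at least one device before printing the tensor.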

Using TensorFlow with GPU support

To use TensorFlow with GPU support in a batch job, add the following line to your sbatch script, before the line that starts your Python program:

conda activate $WORK/.conda/py-311-tf

This activates the Conda environment for TensorFlow (plus whatever other packages you have installed in that environment).
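
If a batch job fails with an import error, one possible cause is that the environment was not actually active inside the job. As a rough sketch, the Python program your job runs could begin with a check like the one below. The function name in_conda_env is hypothetical. The check assumes that conda activate sets the CONDA_PREFIX environment variable and that sys.prefix points to the environment of the running interpreter.

```python
import os
import sys


def in_conda_env(expected_prefix):
    """Return True if the running interpreter belongs to the Conda
    environment at expected_prefix and that environment is active."""
    # Expand variables such as $WORK and resolve symlinks before comparing.
    expected = os.path.realpath(os.path.expandvars(expected_prefix))
    active = os.environ.get("CONDA_PREFIX")
    if active is None:
        return False
    return (os.path.realpath(active) == expected
            and os.path.realpath(sys.prefix) == expected)


if __name__ == "__main__":
    env_path = "$WORK/.conda/py-311-tf"  # environment path used in this guide
    if in_conda_env(env_path):
        print("Running inside", sys.prefix)
    else:
        print("Warning:", env_path, "is not the active environment", file=sys.stderr)
```

The warning goes to standard error, so it ends up in the job's error log rather than mixed into the program output.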