CUDA
This tutorial provides guidelines for setting up a CUDA environment on a Linux machine, using as little sudo as possible.
Setting up a CUDA environment for C++ development can be challenging for beginners. Yet it is also rewarding, as it allows you to do lots of cool things, such as customizing PyTorch and TensorFlow.
The rest of this page is a step-by-step guide for setting up a CUDA environment on a Linux machine with an NVIDIA GPU. Note: the installation will fail if you don't have a supported compiler. Make sure gcc, g++, ninja, and cmake are installed before you install the CUDA driver.
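A quick way to confirm these prerequisites (the package names below are a Debian/Ubuntu example; adjust them for your distribution):
# Check that the required build tools are available
gcc --version
g++ --version
cmake --version
ninja --version
# If any are missing, install them, e.g. on Debian/Ubuntu:
# sudo apt-get install -y build-essential cmake ninja-build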
Install CUDA Driver (Requires Sudo)
First, download the driver run file from the CUDA Toolkit Downloads page and install the driver as shown below.
You need sudo privileges to install the driver.
# Download the installation scripts
wget https://developer.download.nvidia.com/compute/cuda/12.6.2/local_installers/cuda_12.6.2_560.35.03_linux.run
# Install Driver
sudo sh cuda_12.6.2_560.35.03_linux.run --silent --driver
# Optional: install the driver and toolkit together
# sudo sh cuda_12.6.2_560.35.03_linux.run --silent --driver --toolkit
sudo reboot
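After the machine comes back up, you can verify that the driver loaded correctly:
# Should list the driver version and every visible GPU
nvidia-smi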
Install CUDA Toolkit
Option I: CUDA Toolkit
I will use the CUDA 12.6.2 version as an example here.
# Download the installation scripts
wget https://developer.download.nvidia.com/compute/cuda/12.6.2/local_installers/cuda_12.6.2_560.35.03_linux.run
# Install the CUDA toolkit into a user directory (no sudo)
CUDA_HOME=~/nvidia/cuda-12.6 # change this to the directory where you want to install the toolkit
sh cuda_12.6.2_560.35.03_linux.run --silent --toolkit --toolkitpath=${CUDA_HOME}
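If the installer succeeds, the toolkit binaries end up under the path you chose; a quick check:
# nvcc should now exist under the chosen toolkit path
ls ${CUDA_HOME}/bin/nvcc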
echo "export CUDA_HOME=${CUDA_HOME}" >> ~/.bashrc
echo 'export CPATH="${CUDA_HOME}"/include:"${CPATH}"' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH="${CUDA_HOME}"/lib64:"${LD_LIBRARY_PATH}"' >> ~/.bashrc
echo 'export PATH="${CUDA_HOME}"/bin:"${PATH}"' >> ~/.bashrc
source ~/.bashrc
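After reloading the shell configuration, it is worth checking that the toolkit is picked up from the new location:
# Both should point at the toolkit installed above
which nvcc
nvcc --version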
Option II: HPC SDK
NVIDIA HPC SDK includes (almost) everything you need for CUDA program development. We will use it to install the CUDA Toolkit along with NCCL, cuBLAS, and other libraries.
I will use the 24.9 version as an example, and you can choose any version you want. You need to use a different URL to download the SDK if your CPU is not from Intel or AMD. Check the NVIDIA HPC SDK page for the details.
# x86
wget https://developer.download.nvidia.com/hpc-sdk/24.9/nvhpc_2024_249_Linux_x86_64_cuda_12.6.tar.gz
tar xpzf nvhpc_2024_249_Linux_x86_64_cuda_12.6.tar.gz
nvhpc_2024_249_Linux_x86_64_cuda_12.6/install
# ARM
wget https://developer.download.nvidia.com/hpc-sdk/24.9/nvhpc_2024_249_Linux_aarch64_cuda_12.6.tar.gz
tar xpzf nvhpc_2024_249_Linux_aarch64_cuda_12.6.tar.gz
nvhpc_2024_249_Linux_aarch64_cuda_12.6/install
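The installer asks for an installation directory; the rest of this tutorial assumes the default location /opt/nvidia/hpc_sdk. Once the installation finishes, you can confirm the layout (the version numbers are the ones used in this example):
# The SDK is laid out as <install dir>/<OS>_<arch>/<version>
ls /opt/nvidia/hpc_sdk/Linux_$(uname -m)/24.9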
Setting Up Environment Variables for NVHPC
Assuming you want to use CUDA Toolkit version 12.6, the following snippet contains the environment variables you need to set. Add them to your ~/.bashrc file and restart your terminal (or source the file) for the variables to take effect.
# # >>> START CUDA ENV (HPC SDK) >>>
HPC_SDK_VERSION=24.9 # Set this to the HPC SDK version you installed
CUDA_VERSION=12.6 # Set this to the CUDA Toolkit version you installed
ISA=$(uname -m) # machine architecture, e.g. x86_64 or aarch64
PLATFORM=$(uname)_${ISA} # e.g. Linux_x86_64, matching the HPC SDK directory layout
export NVHPC=/opt/nvidia/hpc_sdk # SET THIS TO YOUR INSTALLED PATH
export CUDA_COMM_LIBS=${NVHPC}/${PLATFORM}/${HPC_SDK_VERSION}/comm_libs/${CUDA_VERSION}
export CUDA_MATH_LIBS=${NVHPC}/${PLATFORM}/${HPC_SDK_VERSION}/math_libs/${CUDA_VERSION}
# CUDA TOOLKIT
export CUDA_HOME=${NVHPC}/${PLATFORM}/${HPC_SDK_VERSION}/cuda/${CUDA_VERSION}
export NVHPC_ROOT=${CUDA_HOME}
export CUDA_TOOLKIT_ROOT_DIR=${CUDA_HOME}
export CUDA_TOOLKIT_ROOT=${CUDA_HOME}
export CUDA_PATH=${CUDA_HOME}
export PATH=${CUDA_HOME}/bin:${PATH}
export CPATH=${CUDA_HOME}/include:${CPATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/extras/CUPTI/lib64:${LD_LIBRARY_PATH}
# MATH_LIBS (cuBLAS)
export CPATH=${CUDA_MATH_LIBS}/include:$CPATH
export LD_LIBRARY_PATH=${CUDA_MATH_LIBS}/lib64/:${LD_LIBRARY_PATH}
# NCCL Configuration
export USE_NCCL=1
export USE_SYSTEM_NCCL=1
export NCCL_HOME=${CUDA_COMM_LIBS}/nccl
export NCCL_PREFIX=${NCCL_HOME}
export NCCL_ROOT=${NCCL_HOME}
export NCCL_INCLUDE_DIR=${NCCL_HOME}/include
export NCCL_LIB_DIR=${NCCL_HOME}/lib
# NVShmem Configuration
export NVSHMEM_HOME=${CUDA_COMM_LIBS}/nvshmem
export NVSHMEM_PREFIX=${NVSHMEM_HOME}
export NVSHMEM_ROOT=${NVSHMEM_HOME}
export NVSHMEM_INCLUDE_DIR=${NVSHMEM_HOME}/include
export NVSHMEM_LIB_DIR=${NVSHMEM_HOME}/lib
# Compilers (not recommended; may cause compilation errors)
# CUDA_COMPILER_DIR=${NVHPC}/${PLATFORM}/${HPC_SDK_VERSION}/compilers/
# export MANPATH=$MANPATH:${NVHPC}/${PLATFORM}/${HPC_SDK_VERSION}/compilers/man
# export PATH=$PATH:$CUDA_COMPILER_DIR/bin
# export PATH=$PATH:$CUDA_COMPILER_DIR/bin/mpi/bin
# export PATH=$PATH:$CUDA_COMPILER_DIR/extras/qd/bin
# export CPATH=$CUDA_COMPILER_DIR/include:$CPATH
# export LD_LIBRARY_PATH=$CUDA_COMPILER_DIR/lib/:${LD_LIBRARY_PATH}
# export CC=$CUDA_COMPILER_DIR/bin/nvc
# export CXX=$CUDA_COMPILER_DIR/bin/nvc++
# export FC=$CUDA_COMPILER_DIR/bin/nvfortran
# export F90=$CUDA_COMPILER_DIR/bin/nvfortran
# export F77=$CUDA_COMPILER_DIR/bin/nvfortran
# export CPP=cpp
echo "Setting NVHPC HOME=$NVHPC"
# # >>> END CUDA ENV (HPC SDK) >>>
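Once these variables are in place and your shell has been restarted (or ~/.bashrc re-sourced), a quick sanity check that the toolkit and communication libraries resolve to the HPC SDK installation:
# nvcc should come from the HPC SDK's bundled CUDA toolkit
which nvcc
nvcc --version
# NCCL and NVSHMEM headers should be where the variables point
ls ${NCCL_INCLUDE_DIR}/nccl.h
ls ${NVSHMEM_INCLUDE_DIR}/nvshmem.h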
Some libraries (e.g., Caffe2) will look for dependent CUDA libraries inside the CUDA_TOOLKIT_ROOT directory. Let's create symbolic links for these math libraries so that they can be found correctly. Note that you do not need to run these commands if you installed the toolkit through the CUDA Toolkit run file (Option I).
# cusparse, cublas, etc.
ln -s ${CUDA_MATH_LIBS}/include/* ${CUDA_TOOLKIT_ROOT}/include/
ln -s ${CUDA_MATH_LIBS}/lib64/* ${CUDA_TOOLKIT_ROOT}/lib64/
ln -s ${CUDA_MATH_LIBS}/lib64/stubs/* ${CUDA_TOOLKIT_ROOT}/lib64/stubs/
# nccl
# ln -s ${CUDA_COMM_LIBS}/nccl/include/* ${CUDA_TOOLKIT_ROOT}/include/
# ln -s ${CUDA_COMM_LIBS}/nccl/lib/* ${CUDA_TOOLKIT_ROOT}/lib64/
# nvshmem
# ln -s ${CUDA_COMM_LIBS}/nvshmem/include/* ${CUDA_TOOLKIT_ROOT}/include/
# ln -s ${CUDA_COMM_LIBS}/nvshmem/lib/* ${CUDA_TOOLKIT_ROOT}/lib64/
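You can confirm the links resolve as expected, for example for cuBLAS:
# These should now point into the HPC SDK math_libs directory
ls -l ${CUDA_TOOLKIT_ROOT}/include/cublas_v2.h
ls -l ${CUDA_TOOLKIT_ROOT}/lib64/libcublas.so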
cuDNN (Optional)
cuDNN does not come with the CUDA Toolkit or the HPC SDK, and it needs to be installed separately. You can download the library from this website.
If your GPU is Ampere or Hopper architecture, follow the instructions on this website to install the latest version.
# x86
cd /tmp # or any download directory you prefer
wget https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/cudnn-linux-x86_64-9.4.0.58_cuda12-archive.tar.xz
tar -xvf cudnn-linux-x86_64-9.4.0.58_cuda12-archive.tar.xz
# Copy the files into the CUDA toolkit directory
cp cudnn-*-archive/include/cudnn*.h ${CUDA_HOME}/include
cp -P cudnn-*-archive/lib/libcudnn* ${CUDA_HOME}/lib64
chmod a+r ${CUDA_HOME}/include/cudnn*.h ${CUDA_HOME}/lib64/libcudnn*
# Arm64
wget https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-sbsa/cudnn-linux-sbsa-9.4.0.58_cuda12-archive.tar.xz
tar -xvf cudnn-linux-sbsa-9.4.0.58_cuda12-archive.tar.xz
# Copy the files into the CUDA toolkit directory
cp cudnn-*-archive/include/cudnn*.h ${CUDA_HOME}/include
cp -P cudnn-*-archive/lib/libcudnn* ${CUDA_HOME}/lib64
chmod a+r ${CUDA_HOME}/include/cudnn*.h ${CUDA_HOME}/lib64/libcudnn*
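Regardless of which option you used to install the toolkit, you can confirm the whole setup works end to end by compiling and running a small CUDA test program (the file name check_cuda.cu is just an example) and, if you installed cuDNN, printing its version from the header:
# Write a minimal CUDA test program
cat > /tmp/check_cuda.cu <<'EOF'
#include <cstdio>
#include <cuda_runtime.h>

__global__ void hello() {
    printf("Hello from GPU thread %d\n", threadIdx.x);
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess || count == 0) {
        printf("No CUDA device found: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Found %d CUDA device(s)\n", count);
    hello<<<1, 4>>>();
    cudaDeviceSynchronize();
    return 0;
}
EOF
# Compile with the toolkit set up above and run the result
nvcc /tmp/check_cuda.cu -o /tmp/check_cuda && /tmp/check_cuda
# If cuDNN was installed, this prints its major version
grep -m1 CUDNN_MAJOR ${CUDA_HOME}/include/cudnn_version.h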