PyTorch

Cannot install PyTorch? Why not compile it from source?

In most cases, you can install PyTorch via Pip or Conda in one line. However, you might want to compile it from source if:

  1. You want to change the source code.

  2. You want to enable more features of PyTorch (e.g., MPI support).

  3. Your server has ARM CPUs and you cannot install via Pip / Conda.

Let's dive into the details.

CUDA

The most important preparation is to set up the CUDA environment correctly if you want to use NVIDIA GPUs. You can follow the guidelines on this page to set up the CUDA environment. The commands below assume that CUDA_HOME points to the root of your CUDA installation.
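
Before building anything, it can help to confirm that the CUDA toolkit is where the later steps expect it. The following is a minimal sketch, not part of the original guide: `check_cuda_home` is a hypothetical helper name, and the `/usr/local/cuda` fallback is only a common default that may not match your system.

```shell
# Hypothetical helper: succeeds if the given directory looks like a CUDA toolkit root.
check_cuda_home() {
    if [ -x "$1/bin/nvcc" ]; then
        echo "nvcc found in $1/bin"
    else
        echo "nvcc not found in $1/bin" >&2
        return 1
    fi
}

# Assumption: /usr/local/cuda is only a common default, adjust it to your system
export CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
check_cuda_home "${CUDA_HOME}" || echo "Fix CUDA_HOME before continuing"
```

If the check fails, point CUDA_HOME at the directory that contains bin/nvcc before moving on.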

Enable MPI Backend for PyTorch (Optional)

The prebuilt PyTorch packages do not include the MPI backend. To compile PyTorch with MPI support, you need an MPI implementation installed (of course). Yet, the default MPI installation on your system might not be CUDA-aware, so you might also want to build one that is.

I will use OpenMPI as an example here to show how you can compile MPI from source with CUDA-aware communication enabled. All the libraries will be installed in the ~/local directory, so let's create it first.

# create installation path
export MPI_HOME=${HOME}/local # SET THIS TO WHERE YOU WANT TO INSTALL MPI
mkdir -p ${MPI_HOME}
cd /tmp

1. GDRCOPY (optional, requires sudo)

If your NVIDIA GPUs support GPUDirect RDMA (V100, A100, H100, etc.), using gdrcopy is recommended to achieve the best performance for MPI. Loading its kernel module requires sudo.

# download gdrcopy
wget https://github.com/NVIDIA/gdrcopy/archive/refs/tags/v2.4.1.tar.gz
# decompress gdrcopy
tar -xvf v2.4.1.tar.gz

# compile and install gdrcopy
cd gdrcopy-2.4.1/ 
make prefix=$MPI_HOME CUDA=$CUDA_HOME all install
sudo ./insmod.sh
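
To confirm that the kernel module was actually loaded, you can look for gdrdrv in the kernel's module list. This is a small sketch under the assumption that your system exposes /proc/modules (the file that lsmod reads); `gdrdrv_loaded` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if gdrdrv appears in the given module list.
# Defaults to /proc/modules, which is the file lsmod reads.
gdrdrv_loaded() {
    grep -q '^gdrdrv ' "${1:-/proc/modules}" 2>/dev/null
}

if gdrdrv_loaded; then
    echo "gdrdrv is loaded"
else
    echo "gdrdrv is not loaded; rerun: sudo ./insmod.sh"
fi
```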

2. UCX

# download ucx (go back to /tmp first so it does not land inside the gdrcopy source tree)
cd /tmp
wget https://github.com/openucx/ucx/releases/download/v1.17.0/ucx-1.17.0.tar.gz

# decompress ucx
tar -xvf ucx-1.17.0.tar.gz
cd ucx-1.17.0

# Option 1: configure ucx with gdrcopy and cuda support (uncomment if you installed gdrcopy in step 1)
# ./contrib/configure-release --prefix=${MPI_HOME} --with-gdrcopy=${MPI_HOME} --with-cuda=${CUDA_HOME}

# Option 2: configure ucx with cuda support only (if you cannot use gdrcopy)
./contrib/configure-release --prefix=${MPI_HOME} --with-cuda=${CUDA_HOME}

# Compile and install (a bare "make -j" starts an unlimited number of jobs and can exhaust memory)
make -j"$(nproc)" && make install
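
After installation, the ucx_info tool that ships with UCX can show whether the build picked up CUDA. The sketch below is an illustration rather than a definitive check: `has_cuda_transport` is a hypothetical helper, the `~/local` fallback mirrors the install path used above, and the cuda_copy transport is normally only listed on a machine that actually has a GPU and driver available.

```shell
# Hypothetical helper: reads `ucx_info -d` output on stdin and succeeds
# if the cuda_copy transport is listed.
has_cuda_transport() {
    grep -q 'Transport: *cuda_copy'
}

UCX_INFO="${MPI_HOME:-${HOME}/local}/bin/ucx_info"
if [ -x "${UCX_INFO}" ]; then
    # Print the UCX version and the configure flags it was built with
    "${UCX_INFO}" -v
    if "${UCX_INFO}" -d | has_cuda_transport; then
        echo "UCX lists the cuda_copy transport"
    else
        echo "UCX does not list cuda_copy; check the --with-cuda flag"
    fi
else
    echo "ucx_info not found at ${UCX_INFO}"
fi
```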

3. OpenMPI

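# download OpenMPI (from /tmp, not from inside the ucx source tree)
cd /tmp
wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.5.tar.gz
# decompress OpenMPI
tar -xvf openmpi-5.0.5.tar.gz
cd openmpi-5.0.5
# configure OpenMPI with cuda support, using the ucx build from step 2
./configure --with-cuda=${CUDA_HOME} --with-ucx=${MPI_HOME} --prefix=${MPI_HOME}
# compile and install
make -j"$(nproc)" && make install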
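
To check whether the resulting OpenMPI build is CUDA-aware, you can query ompi_info for the mpi_built_with_cuda_support parameter. A minimal sketch, assuming the `~/local` install path from above; `built_with_cuda` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ompi_info --parsable --all` output on stdin and
# succeeds if the build reports CUDA support.
built_with_cuda() {
    grep -q 'mpi_built_with_cuda_support:value:true'
}

OMPI_INFO="${MPI_HOME:-${HOME}/local}/bin/ompi_info"
if [ -x "${OMPI_INFO}" ]; then
    if "${OMPI_INFO}" --parsable --all | built_with_cuda; then
        echo "OpenMPI was built with CUDA support"
    else
        echo "OpenMPI was built without CUDA support; check the --with-cuda flag"
    fi
else
    echo "ompi_info not found at ${OMPI_INFO}"
fi
```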

4. Setting Environment Variables

Add the environment variables to your ~/.bashrc file and reload it.

# append the following lines to the ~/.bashrc file, then reload it (or restart the terminal)
echo "export MPI_HOME=${MPI_HOME}" >> ~/.bashrc
echo 'export CPATH="${MPI_HOME}/include:${CPATH}"' >> ~/.bashrc
# OpenMPI and UCX install their libraries to lib by default; lib64 is kept for systems that use it
echo 'export LD_LIBRARY_PATH="${MPI_HOME}/lib:${MPI_HOME}/lib64:${LD_LIBRARY_PATH}"' >> ~/.bashrc
echo 'export PATH="${MPI_HOME}/bin:${PATH}"' >> ~/.bashrc
source ~/.bashrc
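
Once the variables are active, it is worth checking that the shell picks up the freshly built tools rather than a system-wide MPI. The following is a rough sketch; `resolves_under` is a hypothetical helper, and the list of tools is just an example.

```shell
# Hypothetical helper: succeeds if command $1 resolves to a path under prefix $2.
resolves_under() {
    case "$(command -v "$1" 2>/dev/null)" in
        "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

MPI_HOME="${MPI_HOME:-${HOME}/local}"
for tool in mpirun mpicc ucx_info; do
    if resolves_under "${tool}" "${MPI_HOME}"; then
        echo "${tool}: OK"
    else
        echo "${tool}: not found under ${MPI_HOME}, check your PATH"
    fi
done
```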

Conda

Installing Conda is straightforward. Find the download link that works for you on this page.

# For x86 PC:
CONDA_INSTALL_DIR=~/miniconda3 # change this to your preferred directory
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -u -p $CONDA_INSTALL_DIR
source $CONDA_INSTALL_DIR/bin/activate
conda init bash
source ~/.bashrc

# For ARM (aarch64), run these commands instead:
CONDA_INSTALL_DIR=~/miniconda3 # change this to your preferred directory
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-aarch64.sh
bash Miniconda3-latest-Linux-aarch64.sh -b -u -p $CONDA_INSTALL_DIR
source $CONDA_INSTALL_DIR/bin/activate
conda init bash
source ~/.bashrc
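
If you are not sure which of the two installers applies to your machine, `uname -m` reports the CPU architecture. The sketch below only prints the matching download link and does not install anything; `miniconda_installer` is a hypothetical helper name.

```shell
# Hypothetical helper: maps a `uname -m` value to the Miniconda installer name.
miniconda_installer() {
    case "$1" in
        x86_64) echo "Miniconda3-latest-Linux-x86_64.sh" ;;
        aarch64) echo "Miniconda3-latest-Linux-aarch64.sh" ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Print the download link for the current machine
if INSTALLER="$(miniconda_installer "$(uname -m)")"; then
    echo "https://repo.anaconda.com/miniconda/${INSTALLER}"
fi
```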

PyTorch

The final step is to compile PyTorch from source.

# Download PyTorch
cd /tmp
wget https://github.com/pytorch/pytorch/releases/download/v2.4.1/pytorch-v2.4.1.tar.gz
tar -xvf pytorch-v2.4.1.tar.gz
cd pytorch-v2.4.1

# Optional: create a conda environment to install PyTorch into
conda create -n torch python=3.12
conda activate torch

# Install dependencies
pip install -r requirements.txt

# Optional: install openblas
conda install -c anaconda openblas

# Optional: build with the new C++11 ABI
export _GLIBCXX_USE_CXX11_ABI=1

# Tell CMake where to find dependencies (the active conda environment)
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}

# Enable CUDA and NCCL support
export USE_CUDA=1
export USE_NCCL=1

# Optional: enable the MPI backend (requires the MPI installation above)
# export USE_MPI=1

# Optional: enable cuDNN if you have installed it
# export USE_CUDNN=1

# Optional: run only the CMake configuration step to check the detected settings before building
python setup.py build --cmake-only

# Build and install
python setup.py install
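
After the build finishes, a quick import test shows which backends the new installation reports. This is a minimal sketch, assuming `python` is the interpreter of the environment you installed into; `check_torch` is a hypothetical helper name. It changes to the home directory first, because importing torch from inside the source tree would pick up the local torch/ directory instead of the installed package.

```shell
# Hypothetical helper: prints the PyTorch version and the backends this build reports.
check_torch() (
    # Leave the source tree so Python imports the installed package
    cd "${HOME}" || exit 1
    if ! python -c "import torch" 2>/dev/null; then
        echo "torch is not importable in this environment"
        exit 1
    fi
    python - <<'EOF'
import torch
import torch.distributed as dist

print("torch", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("NCCL available:", dist.is_nccl_available())
print("MPI available:", dist.is_mpi_available())
EOF
)

check_torch || echo "PyTorch check failed"
```

If you exported USE_MPI=1 before building, the last line should report that MPI is available.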

That's it :)
