PyTorch

Cannot install PyTorch? Why not compile it from source?

In most cases, you can install PyTorch with a single Pip or Conda command. However, you might want to compile it from source if:

  1. You want to change the source code.

  2. You want to enable more features of PyTorch (e.g., MPI support).

  3. Your server has ARM CPUs and you cannot install via Pip / Conda.

Let's dive into the details.

CUDA

The most important preparation is to set up the CUDA environment correctly if you want to use Nvidia GPUs. You can follow the guidelines on this page to set up the CUDA environment.
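
As a reference, a typical setup looks roughly like the following, assuming CUDA is installed under /usr/local/cuda; adjust the path to match your installation.

# assuming CUDA is installed under /usr/local/cuda; adjust to your installation
export CUDA_HOME=/usr/local/cuda
export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}

# verify that the toolkit and the driver are visible
nvcc --version
nvidia-smi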

Enable MPI Backend for PyTorch (Optional)

PyTorch does not ship with MPI support by default. To compile PyTorch with MPI support, you need an MPI compiler (of course). However, the default MPI installation might not be CUDA-aware, so you might also want to enable that.

I will use OpenMPI as an example to show how you can compile MPI from source with CUDA-aware communication enabled. All the libraries will be installed in the ~/local directory, so let's create it first.

# create installation path
export MPI_HOME=${HOME}/local # SET THIS TO WHERE YOU WANT TO INSTALL MPI
mkdir -p ${MPI_HOME}
cd /tmp

1. GDRCOPY (optional, requires sudo)

If your Nvidia GPUs support GPUDirect RDMA (V100, A100, H100, etc.), it is recommended to build gdrcopy to achieve the best performance for MPI; a minimal sketch is shown below.
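
A minimal build sketch, assuming the sources from the NVIDIA gdrcopy GitHub repository and CUDA under /usr/local/cuda; check the gdrcopy README for the exact kernel-module step on your distribution.

# build and install gdrcopy into ${MPI_HOME} (assumes CUDA at /usr/local/cuda)
git clone https://github.com/NVIDIA/gdrcopy.git
cd gdrcopy
make prefix=${MPI_HOME} CUDA=/usr/local/cuda all install
# loading the gdrdrv kernel module requires sudo
sudo ./insmod.sh
cd ..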

2. UCX
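
A sketch of a CUDA-aware UCX build, assuming release 1.15.0 (any recent release should work the same way) and CUDA under /usr/local/cuda.

# download and build UCX with CUDA and gdrcopy support
wget https://github.com/openucx/ucx/releases/download/v1.15.0/ucx-1.15.0.tar.gz
tar -xzf ucx-1.15.0.tar.gz
cd ucx-1.15.0
# drop --with-gdrcopy if you skipped step 1
./configure --prefix=${MPI_HOME} \
    --with-cuda=/usr/local/cuda \
    --with-gdrcopy=${MPI_HOME}
make -j$(nproc) && make install
cd ..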

3. OpenMPI
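
A sketch assuming OpenMPI 4.1.6 (substitute the version you want), built against the UCX and CUDA paths from the previous steps.

# download and build OpenMPI with CUDA and UCX support
wget https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-4.1.6.tar.gz
tar -xzf openmpi-4.1.6.tar.gz
cd openmpi-4.1.6
./configure --prefix=${MPI_HOME} \
    --with-cuda=/usr/local/cuda \
    --with-ucx=${MPI_HOME}
make -j$(nproc) install
cd ..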

4. Setting Environment Variables

Add the environment variables to your ~/.bashrc file and enable them.
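
For example, something along these lines, assuming the ${MPI_HOME} used above.

# add these lines to ~/.bashrc
export MPI_HOME=${HOME}/local
export PATH=${MPI_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${MPI_HOME}/lib:${LD_LIBRARY_PATH}

# then reload the shell configuration
source ~/.bashrc

# check that the freshly built OpenMPI is on your PATH and is CUDA-aware
which mpicc
ompi_info --parsable --all | grep mpi_built_with_cuda_support:value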

Conda

Installing Conda is straightforward. Find the download link that works for you on this page.
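
For example, a Miniconda install on x86-64 Linux looks roughly like this; pick the installer that matches your OS and CPU architecture.

# download and run the Miniconda installer (x86-64 Linux shown here)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh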

PyTorch

The final step is to compile PyTorch itself from source.
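
A sketch of the usual source build, assuming a Conda environment is active and the mpicc from the steps above is on your PATH; see the official "building from source" instructions for the full prerequisite list, which can vary between PyTorch versions.

# fetch the sources with all submodules
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch

# install the build-time Python dependencies
pip install -r requirements.txt

# build and install; USE_MPI=1 enables the MPI backend,
# and the mpicc found on your PATH determines which MPI is used
USE_MPI=1 python setup.py develop

# verify that the MPI backend is available
python -c "import torch.distributed as dist; print(dist.is_mpi_available())"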

That's it :)
