Installing the NVIDIA GPU Driver and CUDA Toolkit on a P1 BMS

Scenarios

After you create a GPU-accelerated P1 BMS (physical.p1.large flavor), install the NVIDIA GPU driver and CUDA Toolkit on it to enable computing acceleration.

Prerequisites

CentOS 7.4

  1. Log in to the target BMS and run the following command to switch to user root:

    su root

  2. (Optional) If any of the gcc, gcc-c++, make, and kernel-devel dependency packages are not installed, run the following commands to install them:

    yum install gcc

    yum install gcc-c++

    yum install make

    yum install kernel-devel-$(uname -r)
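To decide whether this optional step is needed, the installed packages can be queried first. The following is a minimal sketch, assuming an RPM-based system such as CentOS 7.4. The function name missing_packages and the QUERY_CMD override variable are hypothetical helpers introduced for illustration; rpm -q is the standard package query command.

```shell
#!/bin/sh
# Print each package from the argument list that the query command reports
# as not installed. QUERY_CMD is a hypothetical override hook for testing;
# it defaults to "rpm -q".
missing_packages() {
    for pkg in "$@"; do
        ${QUERY_CMD:-rpm -q} "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}

# List the dependency packages that still have to be installed.
missing_packages gcc gcc-c++ make "kernel-devel-$(uname -r)"
```

Any package name printed by the sketch still needs a yum install; no output means all four dependencies are already present.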

  3. (Optional) Add the Nouveau driver to the blacklist.

    If the Nouveau driver is installed and loaded, blacklist it as follows to avoid conflicts with the NVIDIA GPU driver:

    1. Add the line blacklist nouveau to the end of the /etc/modprobe.d/blacklist.conf file.

    2. Run the following commands to back up and rebuild the initramfs:

      mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak

      dracut -v /boot/initramfs-$(uname -r).img $(uname -r)

    3. Run the reboot command to restart the BMS.
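The check and the blacklist edit described in this step can be sketched in shell. This is an illustration under stated assumptions, not part of the original procedure: the function names nouveau_loaded and blacklist_nouveau are hypothetical, and the sketch only reports the module state. It does not modify /etc, rebuild the initramfs, or reboot.

```shell
#!/bin/sh
# Append "blacklist nouveau" to the given file unless that exact line
# is already present, so the step can be repeated safely.
blacklist_nouveau() {
    conf_file=$1
    grep -qx 'blacklist nouveau' "$conf_file" 2>/dev/null ||
        printf 'blacklist nouveau\n' >> "$conf_file"
}

# Succeed if the nouveau module is currently loaded. /proc/modules holds
# the same data that lsmod prints.
nouveau_loaded() {
    grep -q '^nouveau ' /proc/modules 2>/dev/null
}

if nouveau_loaded; then
    echo "nouveau is loaded: blacklist it, rebuild the initramfs, then reboot"
else
    echo "nouveau is not loaded"
fi

# As user root, the blacklist edit itself would be:
#   blacklist_nouveau /etc/modprobe.d/blacklist.conf
```

Checking for the line before appending keeps blacklist.conf free of duplicate entries if the procedure is run more than once.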

  4. (Optional) If the X service is running, run the systemctl set-default multi-user.target command and restart the BMS to enter multi-user mode.

  5. (Optional) Install the NVIDIA GPU driver.

    Perform this step only if you want to install a specific version of the NVIDIA GPU driver rather than the version contained in the CUDA Toolkit.

    1. Download the NVIDIA GPU driver installation package NVIDIA-Linux-x86_64-xxx.yy.run from https://www.nvidia.com/Download/index.aspx?lang=en, and upload it to the /tmp directory on the BMS.

      Figure 1 Searching for the NVIDIA GPU driver package (CentOS 7.4)

    2. In the /tmp directory, run the following command to install the NVIDIA GPU driver:

      sh ./NVIDIA-Linux-x86_64-xxx.yy.run

    3. Run the following command to delete the installation package:

      rm -f NVIDIA-Linux-x86_64-xxx.yy.run
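After the driver installation, a quick sanity check is possible with nvidia-smi, the management utility that ships with the NVIDIA driver. This check is not part of the original procedure. It is a minimal sketch, and the function name driver_tool_available is a hypothetical helper.

```shell
#!/bin/sh
# Succeed if an nvidia-smi executable is reachable through PATH.
driver_tool_available() {
    command -v nvidia-smi >/dev/null 2>&1
}

if driver_tool_available; then
    # Prints the driver version and the GPUs the driver can see.
    nvidia-smi
else
    echo "nvidia-smi not found: the driver is not installed or not on PATH"
fi
```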

  6. Install the CUDA Toolkit.

    1. Download the CUDA Toolkit installation package cuda_a.b.cc_xxx.yy_linux.run from https://developer.nvidia.com/cuda-downloads, and upload it to the /tmp directory on the BMS.

    2. Run the following command to make the installation package executable:

      chmod +x cuda_a.b.cc_xxx.yy_linux.run

    3. Run the following command to install the CUDA Toolkit:

      ./cuda_a.b.cc_xxx.yy_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/

    4. Run the following command to delete the installation package:

      rm -f cuda_a.b.cc_xxx.yy_linux.run

    5. Run the following commands to build and run the deviceQueryDrv sample, which checks whether the installation is successful:

      cd /usr/local/cuda/samples/1_Utilities/deviceQueryDrv/

      make

      ./deviceQueryDrv

      If the command output contains "Result = PASS", the CUDA Toolkit and the NVIDIA GPU driver have been installed successfully.
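The pass check above can also be scripted instead of read by eye. The following is a minimal sketch; the function name device_query_passed is a hypothetical helper, and the only assumption about the sample is the "Result = PASS" marker quoted above.

```shell
#!/bin/sh
# Succeed only if the text on standard input contains the pass marker.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Usage on the BMS, from the deviceQueryDrv sample directory:
#   ./deviceQueryDrv | device_query_passed && echo "installation verified"
```

Because the function returns an exit status rather than printing text, it can gate later steps in an automated provisioning script.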