Installing the NVIDIA GPU Driver and CUDA Toolkit on a P1 ECS

Scenarios

After a P1 ECS is created, the NVIDIA driver must be installed on it for computing acceleration. For other ECS types, see the usage notes in section ECS Types to determine whether a driver is required.

The procedure for installing the NVIDIA driver varies by OS. This section describes the procedure for Ubuntu 16.04, CentOS 7.4, and Debian 9.0.

Prerequisites

  • An EIP has been bound to the target ECS.
  • You have obtained the driver installation packages required for the OS. For details, see Table 1.
Table 1 NVIDIA drivers

OS | Driver | How to Obtain
Ubuntu 16.04 | GPU driver installation package NVIDIA-Linux-x86_64-375.66.run | http://www.nvidia.com/download/driverResults.aspx/118955/en-us
Ubuntu 16.04 | CUDA Toolkit installation package cuda_8.0.61_375.26_linux.run | https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run
CentOS 7.4 | GPU driver installation package NVIDIA-Linux-x86_64-375.66.run | http://www.nvidia.com/download/driverResults.aspx/118955/en-us
CentOS 7.4 | CUDA Toolkit installation package cuda_8.0.61_375.26_linux.run | https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run
Debian 9.0 | GPU driver installation package NVIDIA-Linux-x86_64-384.81.run | http://www.nvidia.com/download/driverResults.aspx/124722/en-us
Debian 9.0 | CUDA Toolkit installation package cuda_9.0.176_384.81_linux.run | https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run
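
The mapping in Table 1 can be expressed as a small lookup, which may help when the same script is run on several images. The following is a minimal sketch, not a vendor-provided tool: the function name recommended_packages is hypothetical, and it assumes the OS is identified by the ID and VERSION_ID values from /etc/os-release (for example, "centos" and "7"). Note that the CentOS pattern matches any 7.x release, although only CentOS 7.4 is listed in Table 1.

```shell
# Sketch only: recommended_packages is a hypothetical helper, not an NVIDIA or vendor tool.
# It prints the GPU driver and CUDA Toolkit package names that Table 1 lists for an OS.
# Arguments: $1 = ID and $2 = VERSION_ID, as found in /etc/os-release (assumption).
recommended_packages() {
    case "$1-$2" in
        ubuntu-16.04|centos-7*)
            echo "NVIDIA-Linux-x86_64-375.66.run"
            echo "cuda_8.0.61_375.26_linux.run"
            ;;
        debian-9*)
            echo "NVIDIA-Linux-x86_64-384.81.run"
            echo "cuda_9.0.176_384.81_linux.run"
            ;;
        *)
            # Table 1 has no entry for this OS, so report that and fail.
            echo "No verified packages listed for $1 $2" >&2
            return 1
            ;;
    esac
}
```

A possible invocation on the ECS, after defining the function:

    . /etc/os-release && recommended_packages "$ID" "$VERSION_ID"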

Ubuntu 16.04 64-bit

  1. Log in to the target ECS and run the following command to switch to user root:

    sudo su

  2. (Optional) Install GCC, g++, and make.

    Perform this step only if GCC, g++, and make have not been installed.

    apt-get install gcc

    apt-get install g++

    apt-get install make

  3. (Optional) Disable the Nouveau driver.

    Perform this step if the Nouveau driver has been installed on the target ECS. This prevents conflict with the NVIDIA driver installation.

    1. Run the following command to check whether the Nouveau driver is running on the target ECS:

      lsmod | grep nouveau

      • If the command returns any output, the Nouveau driver is running. Go to 3.b.
      • If the command returns no output, go to 4.
    2. Add the following statements to the end of the /etc/modprobe.d/blacklist-nouveau.conf file (if the file is unavailable, create one):

      blacklist nouveau

      options nouveau modeset=0

    3. Run the following command to regenerate the initramfs:

      update-initramfs -u

    4. Run the following command to restart the ECS:

      reboot

  4. (Optional) Disable the X service.

    If you have logged in to the ECS using the GUI, disable the X service before installing the NVIDIA driver.

    1. Run the following command to switch to multi-user mode:

      systemctl set-default multi-user.target

    2. Run the following command to restart the ECS:

      reboot

  5. (Optional) Install the GPU driver.

    You can either use the GPU driver provided in the CUDA Toolkit installation package or download the required GPU driver. Unless otherwise specified, you are advised to install GPU driver NVIDIA-Linux-x86_64-375.66.run provided in Prerequisites, which has been fully verified.

    The following section describes general operations for downloading and installing the GPU driver.
    1. Upload the GPU driver installation package NVIDIA-Linux-x86_64-xxx.yy.run to the /tmp directory on the ECS.

      To download the GPU driver, visit http://www.nvidia.com/Download/index.aspx?lang=en.

      Figure 1 Downloading the GPU driver
    2. Run the following command to install the GPU driver:

      sh ./NVIDIA-Linux-x86_64-xxx.yy.run

    3. Run the following command to delete the installation package:

      rm -f NVIDIA-Linux-x86_64-xxx.yy.run

  6. Install the CUDA Toolkit.

    Unless otherwise specified, you are advised to install CUDA Toolkit cuda_8.0.61_375.26_linux.run provided in Prerequisites, which has been fully verified.

    The following section describes general operations for downloading and installing the CUDA Toolkit.
    1. Upload the CUDA Toolkit installation package cuda_a.b.cc_xxx.yy_linux.run to the /tmp directory on the ECS.

      To download the CUDA Toolkit, visit https://developer.nvidia.com/cuda-downloads.

    2. Run the following command to change the permission:

      chmod +x cuda_a.b.cc_xxx.yy_linux.run

    3. Run the following command to install the CUDA Toolkit:

      ./cuda_a.b.cc_xxx.yy_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/

    4. Run the following command to delete the installation package:

      rm -f cuda_a.b.cc_xxx.yy_linux.run

    5. Run the following commands to check whether the installation is successful:

      cd /usr/local/cuda/samples/1_Utilities/deviceQueryDrv/

      make

      ./deviceQueryDrv

      If the command output contains "Result = PASS", both the CUDA Toolkit and the GPU driver have been installed successfully.

      ./deviceQueryDrv Starting...  
         
       CUDA Device Query (Driver API) statically linked version   
       Detected 1 CUDA Capable device(s)  
         
       Device 0: "Tesla P100-PCIE-16GB"  
         CUDA Driver Version:                           8.0  
         CUDA Capability Major/Minor version number:    6.0  
         Total amount of global memory:                 16276 MBytes (17066885120 bytes)  
         (56) Multiprocessors, ( 64) CUDA Cores/MP:     3584 CUDA Cores  
         GPU Max Clock rate:                            1329 MHz (1.33 GHz)  
         Memory Clock rate:                             715 Mhz  
         Memory Bus Width:                              4096-bit  
         L2 Cache Size:                                 4194304 bytes  
         Max Texture Dimension Sizes                    1D=(131072) 2D=(131072, 65536) 3D=(16384, 16384, 16384)  
         Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers  
         Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers  
         Total amount of constant memory:               65536 bytes  
         Total amount of shared memory per block:       49152 bytes  
         Total number of registers available per block: 65536  
         Warp size:                                     32  
         Maximum number of threads per multiprocessor:  2048  
         Maximum number of threads per block:           1024  
         Max dimension size of a thread block (x,y,z): (1024, 1024, 64)  
         Max dimension size of a grid size (x,y,z):    (2147483647, 65535, 65535)  
         Texture alignment:                             512 bytes  
         Maximum memory pitch:                          2147483647 bytes  
         Concurrent copy and kernel execution:          Yes with 2 copy engine(s)  
         Run time limit on kernels:                     No  
         Integrated GPU sharing Host Memory:            No  
         Support host page-locked memory mapping:       Yes  
         Concurrent kernel execution:                   Yes  
         Alignment requirement for Surfaces:            Yes  
         Device has ECC support:                        Enabled  
         Device supports Unified Addressing (UVA):      Yes  
         Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 6  
         Compute Mode:  
            < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >  
       Result = PASS 
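
Step 3 above (disabling the Nouveau driver) is easy to get wrong when repeated by hand, because the two statements may be appended more than once. The following is a minimal sketch of an idempotent version. The helper names blacklist_nouveau and nouveau_loaded are hypothetical, and the sketch assumes the configuration file, if it already exists, ends with a newline.

```shell
# Sketch only: both helper names are hypothetical, not part of any NVIDIA or distro tooling.

# Succeeds (exit status 0) if the lsmod output supplied on stdin lists the nouveau module.
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Appends the two blacklist statements from step 3 to the file given as $1,
# creating the file if needed and skipping statements that are already present.
blacklist_nouveau() {
    conf_file="$1"
    touch "$conf_file" || return 1
    for entry in "blacklist nouveau" "options nouveau modeset=0"; do
        # -x matches the whole line, -F treats the entry as a fixed string.
        if ! grep -qxF "$entry" "$conf_file"; then
            printf '%s\n' "$entry" >> "$conf_file"
        fi
    done
}
```

A possible invocation as user root, followed by the initramfs and restart commands from steps 3.c and 3.d:

    lsmod | nouveau_loaded && blacklist_nouveau /etc/modprobe.d/blacklist-nouveau.conf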

CentOS 7.4

  1. Log in to the target ECS and run the following command to switch to user root:

    sudo su

  2. (Optional) Install GCC, g++, make, and kernel-devel.

    Perform this step only if GCC, g++, make, and kernel-devel have not been installed.

    yum install gcc

    yum install gcc-c++

    yum install make

    yum install kernel-devel-`uname -r`

  3. (Optional) Disable the Nouveau driver.

    Perform this step if the Nouveau driver has been installed on the target ECS. This prevents conflict with the NVIDIA driver installation.

    1. Run the following command to check whether the Nouveau driver is running on the target ECS:

      lsmod | grep nouveau

      • If the command returns any output, the Nouveau driver is running. Go to 3.b.
      • If the command returns no output, go to 4.
    2. Add the following statements to the end of the /etc/modprobe.d/blacklist-nouveau.conf file (if the file is unavailable, create one):

      blacklist nouveau

      options nouveau modeset=0

    3. Run the following command to regenerate the initramfs:

      dracut --force

    4. Run the following command to restart the ECS:

      reboot

  4. (Optional) Disable the X service.

    If you have logged in to the ECS using the GUI, disable the X service before installing the NVIDIA driver.

    1. Run the following command to switch to multi-user mode:

      systemctl set-default multi-user.target

    2. Run the following command to restart the ECS:

      reboot

  5. (Optional) Install the GPU driver.

    You can either use the GPU driver provided in the CUDA Toolkit installation package or download the required GPU driver. Unless otherwise specified, you are advised to install GPU driver NVIDIA-Linux-x86_64-375.66.run provided in Prerequisites, which has been fully verified.

    The following section describes general operations for downloading and installing the GPU driver.
    1. Upload the GPU driver installation package NVIDIA-Linux-x86_64-xxx.yy.run to the /tmp directory on the ECS.

      To download the GPU driver, visit http://www.nvidia.com/Download/index.aspx?lang=en.

      Figure 2 Downloading the GPU driver
    2. Run the following command to install the GPU driver:

      sh ./NVIDIA-Linux-x86_64-xxx.yy.run

    3. Run the following command to delete the installation package:

      rm -f NVIDIA-Linux-x86_64-xxx.yy.run

  6. Install the CUDA Toolkit.

    Unless otherwise specified, you are advised to install CUDA Toolkit cuda_8.0.61_375.26_linux.run provided in Prerequisites, which has been fully verified.

    The following section describes general operations for downloading and installing the CUDA Toolkit.
    1. Upload the CUDA Toolkit installation package cuda_a.b.cc_xxx.yy_linux.run to the /tmp directory on the ECS.

      To download the CUDA Toolkit, visit https://developer.nvidia.com/cuda-downloads.

    2. Run the following command to change the permission:

      chmod +x cuda_a.b.cc_xxx.yy_linux.run

    3. Run the following command to install the CUDA Toolkit:

      ./cuda_a.b.cc_xxx.yy_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/

    4. Run the following command to delete the installation package:

      rm -f cuda_a.b.cc_xxx.yy_linux.run

    5. Run the following commands to check whether the installation is successful:

      cd /usr/local/cuda/samples/1_Utilities/deviceQueryDrv/

      make

      ./deviceQueryDrv

      If the command output contains "Result = PASS", both the CUDA Toolkit and the GPU driver have been installed successfully.

      ./deviceQueryDrv Starting...  
         
       CUDA Device Query (Driver API) statically linked version   
       Detected 1 CUDA Capable device(s)  
         
       Device 0: "Tesla P100-PCIE-16GB"  
         CUDA Driver Version:                           8.0  
         CUDA Capability Major/Minor version number:    6.0  
         Total amount of global memory:                 16276 MBytes (17066885120 bytes)  
         (56) Multiprocessors, ( 64) CUDA Cores/MP:     3584 CUDA Cores  
         GPU Max Clock rate:                            1329 MHz (1.33 GHz)  
         Memory Clock rate:                             715 Mhz  
         Memory Bus Width:                              4096-bit  
         L2 Cache Size:                                 4194304 bytes  
         Max Texture Dimension Sizes                    1D=(131072) 2D=(131072, 65536) 3D=(16384, 16384, 16384)  
         Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers  
         Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers  
         Total amount of constant memory:               65536 bytes  
         Total amount of shared memory per block:       49152 bytes  
         Total number of registers available per block: 65536  
         Warp size:                                     32  
         Maximum number of threads per multiprocessor:  2048  
         Maximum number of threads per block:           1024  
         Max dimension size of a thread block (x,y,z): (1024, 1024, 64)  
         Max dimension size of a grid size (x,y,z):    (2147483647, 65535, 65535)  
         Texture alignment:                             512 bytes  
         Maximum memory pitch:                          2147483647 bytes  
         Concurrent copy and kernel execution:          Yes with 2 copy engine(s)  
         Run time limit on kernels:                     No  
         Integrated GPU sharing Host Memory:            No  
         Support host page-locked memory mapping:       Yes  
         Concurrent kernel execution:                   Yes  
         Alignment requirement for Surfaces:            Yes  
         Device has ECC support:                        Enabled  
         Device supports Unified Addressing (UVA):      Yes  
         Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 6  
         Compute Mode:  
            < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >  
       Result = PASS 
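
The final check in step 6 relies on reading the deviceQueryDrv output by eye. When the installation is scripted, the same check can be automated. The following is a minimal sketch under that assumption. The helper names device_query_passed and detected_device_count are hypothetical, and both read the deviceQueryDrv output from standard input in the format shown above.

```shell
# Sketch only: both helper names are hypothetical; they read deviceQueryDrv output on stdin.

# Succeeds (exit status 0) if the output contains the "Result = PASS" line.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Prints the number from the "Detected N CUDA Capable device(s)" line, if present.
detected_device_count() {
    sed -n 's/^[[:space:]]*Detected \([0-9][0-9]*\) CUDA Capable device(s).*/\1/p'
}
```

A possible invocation from the deviceQueryDrv sample directory after running make:

    ./deviceQueryDrv | device_query_passed && echo "driver and toolkit OK"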

Debian 9.0

  1. Log in to the target ECS and run the following command to switch to user root:

    sudo su

  2. (Optional) Install GCC, g++, make, and the kernel headers, which the NVIDIA driver installation depends on.

    Perform this step only if these packages have not been installed.

    apt-get install gcc

    apt-get install g++

    apt-get install make

    apt-get install linux-headers-$(uname -r)

  3. (Optional) Disable the Nouveau driver.

    Perform this step if the Nouveau driver has been installed on the target ECS. This prevents conflict with the NVIDIA driver installation.

    1. Run the following command to check whether the Nouveau driver is running on the target ECS:

      lsmod | grep nouveau

      • If the command returns any output, the Nouveau driver is running. Go to 3.b.
      • If the command returns no output, go to 4.
    2. Add the following statements to the end of the /etc/modprobe.d/blacklist-nouveau.conf file (if the file is unavailable, create one):

      blacklist nouveau

      options nouveau modeset=0

    3. Run the following command to regenerate the initramfs:

      update-initramfs -u

    4. Run the following command to restart the ECS:

      reboot

  4. (Optional) Disable the X service.

    If you have logged in to the ECS using the GUI, disable the X service before installing the NVIDIA driver.

    1. Run the following command to switch to multi-user mode:

      systemctl set-default multi-user.target

    2. Run the following command to restart the ECS:

      reboot

  5. (Optional) Install the GPU driver.

    You can either use the GPU driver provided in the CUDA Toolkit installation package or download the required GPU driver. Unless otherwise specified, you are advised to install GPU driver NVIDIA-Linux-x86_64-384.81.run provided in Prerequisites, which has been fully verified.

    The following section describes general operations for downloading and installing the GPU driver.
    1. Upload the GPU driver installation package NVIDIA-Linux-x86_64-xxx.yy.run to the /tmp directory on the ECS.

      To download the GPU driver, visit http://www.nvidia.com/Download/index.aspx?lang=en.

      Figure 3 Downloading the GPU driver
    2. Run the following command to install the GPU driver:

      sh ./NVIDIA-Linux-x86_64-xxx.yy.run

    3. Run the following command to delete the installation package:

      rm -f NVIDIA-Linux-x86_64-xxx.yy.run

  6. Install the CUDA Toolkit.

    The GCC version in Debian 9.0 requires CUDA Toolkit 9.0 or later. Unless otherwise specified, you are advised to install CUDA Toolkit cuda_9.0.176_384.81_linux.run provided in Prerequisites, which has been fully verified.

    The following section describes general operations for downloading and installing the CUDA Toolkit.
    1. Upload the CUDA Toolkit installation package cuda_a.b.cc_xxx.yy_linux.run to the /tmp directory on the ECS.

      To download the CUDA Toolkit, visit https://developer.nvidia.com/cuda-downloads.

    2. Run the following command to change the permission:

      chmod +x cuda_a.b.cc_xxx.yy_linux.run

    3. Run the following command to install the CUDA Toolkit:

      ./cuda_a.b.cc_xxx.yy_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/

    4. Run the following command to delete the installation package:

      rm -f cuda_a.b.cc_xxx.yy_linux.run

    5. Run the following commands to check whether the installation is successful:

      cd /usr/local/cuda/samples/1_Utilities/deviceQueryDrv/

      make

      ./deviceQueryDrv

      If the command output contains "Result = PASS", both the CUDA Toolkit and the GPU driver have been installed successfully.

      ./deviceQueryDrv Starting...
       
      CUDA Device Query (Driver API) statically linked version 
      Detected 1 CUDA Capable device(s)
       
      Device 0: "Tesla P100-PCIE-16GB"
        CUDA Driver Version:                           9.0
        CUDA Capability Major/Minor version number:    6.0
        Total amount of global memory:                 16276 MBytes (17066885120 bytes)
        (56) Multiprocessors, ( 64) CUDA Cores/MP:     3584 CUDA Cores
        GPU Max Clock rate:                            1329 MHz (1.33 GHz)
        Memory Clock rate:                             715 Mhz
        Memory Bus Width:                              4096-bit
        L2 Cache Size:                                 4194304 bytes
        Max Texture Dimension Sizes                    1D=(131072) 2D=(131072, 65536) 3D=(16384, 16384, 16384)
        Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
        Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
        Total amount of constant memory:               65536 bytes
        Total amount of shared memory per block:       49152 bytes
        Total number of registers available per block: 65536
        Warp size:                                     32
        Maximum number of threads per multiprocessor:  2048
        Maximum number of threads per block:           1024
        Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
        Max dimension size of a grid size (x,y,z):    (2147483647, 65535, 65535)
        Texture alignment:                             512 bytes
        Maximum memory pitch:                          2147483647 bytes
        Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
        Run time limit on kernels:                     No
        Integrated GPU sharing Host Memory:            No
        Support host page-locked memory mapping:       Yes
        Concurrent kernel execution:                   Yes
        Alignment requirement for Surfaces:            Yes
        Device has ECC support:                        Enabled
        Device supports Unified Addressing (UVA):      Yes
        Supports Cooperative Kernel Launch:            Yes
        Supports MultiDevice Co-op Kernel Launch:      Yes
        Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 6
        Compute Mode:
           < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
      Result = PASS
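
Because Debian 9.0 needs CUDA Toolkit 9.0 or later, it can be useful to check the package name before running the installer. The following is a minimal sketch that reads the versions from a file name of the form cuda_a.b.cc_xxx.yy_linux.run used in this section. All three helper names are hypothetical, and the sketch assumes that the xxx.yy field is the version of the GPU driver provided in the package.

```shell
# Sketch only: all helper names are hypothetical. They parse file names of the
# form cuda_a.b.cc_xxx.yy_linux.run, as used in this section.

# Prints the CUDA Toolkit version (a.b.cc), or nothing if the name does not match.
cuda_version_from_runfile() {
    basename "$1" | sed -n 's/^cuda_\([0-9][0-9.]*\)_[0-9][0-9.]*_linux\.run$/\1/p'
}

# Prints the version of the driver provided in the package (xxx.yy), or nothing.
bundled_driver_from_runfile() {
    basename "$1" | sed -n 's/^cuda_[0-9][0-9.]*_\([0-9][0-9.]*\)_linux\.run$/\1/p'
}

# Succeeds (exit status 0) only if the package is CUDA Toolkit 9.0 or later.
runfile_ok_for_debian9() {
    version=$(cuda_version_from_runfile "$1")
    [ -n "$version" ] || return 1
    major=${version%%.*}    # text before the first dot, for example "9"
    [ "$major" -ge 9 ]
}
```

A possible invocation before step 6.c:

    runfile_ok_for_debian9 /tmp/cuda_9.0.176_384.81_linux.run && echo "version OK for Debian 9.0"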