Installing the NVIDIA GPU Driver and CUDA Toolkit on a P2 ECS

Scenarios

If you create a P2 ECS from a private image, make sure that the NVIDIA driver was installed when the private image was created. If it was not, install the driver after the P2 ECS is created so that computing acceleration is available. For other ECS types, determine whether a driver is required according to the usage notes for those ECSs in section ECS Types.

The procedure for installing the NVIDIA driver varies depending on the OS. This section describes the procedures for Ubuntu 16.04 64bit and EulerOS 2.2 64bit.
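
One way to confirm which procedure applies is to read the ID field of /etc/os-release on the ECS. The following is a minimal sketch and not part of the verified procedure: the function names os_id and procedure_for are hypothetical, and the ID value euleros for EulerOS is an assumption that should be checked on the target image.

```shell
#!/bin/sh
# Print the ID field of an os-release file (default: /etc/os-release).
# The pattern ^ID= does not match ID_LIKE= or VERSION_ID=.
os_id() {
    file="${1:-/etc/os-release}"
    sed -n 's/^ID=//p' "$file" | tr -d '"'
}

# Map an OS ID to the heading of the matching procedure in this section.
# Returns 1 for any OS that this section does not cover.
procedure_for() {
    case "$1" in
        ubuntu)  echo "Ubuntu 16.04 64bit" ;;
        euleros) echo "EulerOS 2.2 64bit" ;;   # assumed ID value for EulerOS
        *)       echo "unsupported"; return 1 ;;
    esac
}
```

For example, running procedure_for "$(os_id)" on the ECS prints the heading of the procedure to follow, or "unsupported" if the OS is not covered here.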

Prerequisites

  • An EIP has been bound to the target ECS.
  • You have obtained the driver installation packages required for the OS. For details, see Table 1.

Table 1 NVIDIA drivers

OS                                 Driver        Installation Package            How to Obtain
Ubuntu 16.04 or EulerOS 2.2 64bit  GPU driver    NVIDIA-Linux-x86_64-384.81.run  http://www.nvidia.com/download/driverResults.aspx/124722/en-us
Ubuntu 16.04 or EulerOS 2.2 64bit  CUDA Toolkit  cuda_9.0.176_384.81_linux.run   https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run
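
Both procedures in this section expect the packages in Table 1 to be uploaded to the /tmp directory on the ECS. As a rough sketch, a helper like the following could report which packages are still missing before you start. The function name missing_packages and the PACKAGES variable are hypothetical; only the file names come from Table 1.

```shell
#!/bin/sh
# Installation package file names as listed in Table 1.
PACKAGES="NVIDIA-Linux-x86_64-384.81.run cuda_9.0.176_384.81_linux.run"

# Print every Table 1 package that is not present in the given
# directory (default: /tmp). Prints nothing if all packages are there.
missing_packages() {
    dir="${1:-/tmp}"
    for pkg in $PACKAGES; do
        [ -f "$dir/$pkg" ] || printf '%s\n' "$pkg"
    done
}
```

Running missing_packages with no argument checks /tmp. Empty output means that both packages have been uploaded.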

Ubuntu 16.04 64bit

  1. Log in to the P2 ECS.
  2. Run the following command to switch to user root:

    sudo su

  3. (Optional) Install GCC, g++, and make.

    Perform this step only if GCC, g++, and make have not been installed.

    apt-get update

    apt-get install gcc

    apt-get install g++

    apt-get install make

  4. (Optional) Disable the Nouveau driver.

    Perform this step if the Nouveau driver is installed on the target ECS. Disabling it prevents a conflict with the NVIDIA driver during installation.

    1. Run the following command to check whether the Nouveau driver is loaded on the target ECS:

      lsmod | grep nouveau

      • If the command returns any output, the Nouveau driver is loaded. Go to 4.2.
      • If the command returns no output, go to 5.
    2. Add the following statements to the end of the /etc/modprobe.d/blacklist-nouveau.conf file (if the file does not exist, create it):

      blacklist nouveau

      options nouveau modeset=0

    3. Run the following command to regenerate the initramfs:

      update-initramfs -u

    4. Run the following command to restart the ECS:

      reboot

  5. (Optional) Disable the X service.

    If you log in to the ECS using a GUI, disable the X service before installing the NVIDIA driver.

    1. Run the following command to switch to multi-user mode:

      systemctl set-default multi-user.target

    2. Run the following command to restart the ECS:

      reboot

  6. (Optional) Install the GPU driver.

    You can either use the GPU driver provided in the CUDA Toolkit installation package or download the required GPU driver. Unless otherwise specified, you are advised to install GPU driver NVIDIA-Linux-x86_64-384.81.run provided in Prerequisites, which has been fully verified.

    1. Upload the GPU driver installation package NVIDIA-Linux-x86_64-384.81.run to the /tmp directory on the ECS.
    2. In the /tmp directory, run the following command to install the GPU driver:

      sh ./NVIDIA-Linux-x86_64-384.81.run

    3. Run the following command to delete the installation package:

      rm -rf NVIDIA-Linux-x86_64-384.81.run

  7. Install the CUDA Toolkit.

    Unless otherwise specified, you are advised to install CUDA Toolkit cuda_9.0.176_384.81_linux.run provided in Prerequisites, which has been fully verified.

    1. Upload the CUDA Toolkit installation package cuda_9.0.176_384.81_linux.run to the /tmp directory on the ECS.
    2. In the /tmp directory, run the following command to make the installation package executable:

      chmod +x cuda_9.0.176_384.81_linux.run

    3. Run the following command to install the CUDA Toolkit:

      ./cuda_9.0.176_384.81_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/

      root@user-OpenStack-Nova:/tmp# ./cuda_9.0.176_384.81_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/
       Missing recommended library: libGLU.so
       Missing recommended library: libX11.so
       Missing recommended library: libXi.so
       Missing recommended library: libXmu.so
       Missing recommended library: libGL.so

       Copying samples to /root/NVIDIA_CUDA-9.0_Samples now...
       Finished copying samples.
    4. Run the following command to delete the installation package:

      rm -rf cuda_9.0.176_384.81_linux.run

    5. Run the following commands to check whether the installation is successful:

      cd /usr/local/cuda/samples/1_Utilities/deviceQueryDrv/

      make

      ./deviceQueryDrv

      If the command output contains "Result = PASS", both the CUDA Toolkit and the GPU driver have been installed successfully.

      ./deviceQueryDrv Starting...
      CUDA Device Query (Driver API) statically linked version 
      Detected 1 CUDA Capable device(s)
      Device 0: "Tesla V100-PCIE-16GB"
        CUDA Driver Version:                           9.0
        CUDA Capability Major/Minor version number:    7.0
        Total amount of global memory:                 16152 MBytes (16936927232 bytes)
        (80) Multiprocessors, ( 64) CUDA Cores/MP:     5120 CUDA Cores
        GPU Max Clock rate:                            1380 MHz (1.38 GHz)
        Memory Clock rate:                             877 Mhz
        Memory Bus Width:                              4096-bit
        L2 Cache Size:                                 6291456 bytes
        Max Texture Dimension Sizes                    1D=(131072) 2D=(131072, 65536) 3D=(16384, 16384, 16384)
        Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
        Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
        Total amount of constant memory:               65536 bytes
        Total amount of shared memory per block:       49152 bytes
        Total number of registers available per block: 65536
        Warp size:                                     32
        Maximum number of threads per multiprocessor:  2048
        Maximum number of threads per block:           1024
        Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
        Max dimension size of a grid size (x,y,z):    (2147483647, 65535, 65535)
        Texture alignment:                             512 bytes
        Maximum memory pitch:                          2147483647 bytes
        Concurrent copy and kernel execution:          Yes with 7 copy engine(s)
        Run time limit on kernels:                     No
        Integrated GPU sharing Host Memory:            No
        Support host page-locked memory mapping:       Yes
        Concurrent kernel execution:                   Yes
        Alignment requirement for Surfaces:            Yes
        Device has ECC support:                        Enabled
        Device supports Unified Addressing (UVA):      Yes
        Supports Cooperative Kernel Launch:            Yes
        Supports MultiDevice Co-op Kernel Launch:      Yes
        Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 6
        Compute Mode:
           < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
      Result = PASS 
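
The "Result = PASS" check in the last step can also be scripted, for example when several ECSs are created from the same image. The sketch below relies only on the output format shown above. The helper names device_query_passed and device_names are hypothetical.

```shell
#!/bin/sh
# Succeed if the text on standard input contains the marker that
# deviceQueryDrv prints after it has queried a GPU successfully.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Print the quoted device names from deviceQueryDrv output on standard
# input, for example: Device 0: "Tesla V100-PCIE-16GB"
device_names() {
    sed -n 's/^ *Device [0-9]*: "\(.*\)".*$/\1/p'
}
```

A possible use is ./deviceQueryDrv | device_query_passed && echo "GPU driver and CUDA Toolkit verified", which prints the message only when the PASS marker is present.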

EulerOS 2.2 64bit

  1. Log in to the P2 ECS.
  2. Run the following command to switch to user root:

    sudo su

  3. (Optional) Install GCC, g++, make, and kernel-devel.

    Perform this step only if GCC, g++, make, and kernel-devel have not been installed.

    yum install gcc

    yum install gcc-c++

    yum install make

    yum install kernel-devel-$(uname -r)

  4. (Optional) Disable the Nouveau driver.

    Perform this step if the Nouveau driver is installed on the target ECS. Disabling it prevents a conflict with the NVIDIA driver during installation.

    1. Run the following command to check whether the Nouveau driver is loaded on the target ECS:

      lsmod | grep nouveau

      • If the command returns any output, the Nouveau driver is loaded. Go to 4.2.
      • If the command returns no output, go to 5.
    2. Add the following statements to the end of the /etc/modprobe.d/blacklist-nouveau.conf file (if the file does not exist, create it):

      blacklist nouveau

      options nouveau modeset=0

    3. Run the following command to regenerate the initramfs:

      dracut --force

    4. Run the following command to restart the ECS:

      reboot

  5. (Optional) Disable the X service.

    If you log in to the ECS using a GUI, disable the X service before installing the NVIDIA driver.

    1. Run the following command to switch to multi-user mode:

      systemctl set-default multi-user.target

    2. Run the following command to restart the ECS:

      reboot

  6. (Optional) Install the GPU driver.

    You can either use the GPU driver provided in the CUDA Toolkit installation package or download the required GPU driver. Unless otherwise specified, you are advised to install GPU driver NVIDIA-Linux-x86_64-384.81.run provided in Prerequisites, which has been fully verified.

    1. Upload the GPU driver installation package NVIDIA-Linux-x86_64-384.81.run to the /tmp directory on the ECS.
    2. In the /tmp directory, run the following command to install the GPU driver:

      sh ./NVIDIA-Linux-x86_64-384.81.run

    3. Run the following command to delete the installation package:

      rm -rf NVIDIA-Linux-x86_64-384.81.run

  7. Install the CUDA Toolkit.

    Unless otherwise specified, you are advised to install the CUDA Toolkit cuda_9.0.176_384.81_linux.run provided in Prerequisites, which has been fully verified.

    1. Upload the CUDA Toolkit installation package cuda_9.0.176_384.81_linux.run to the /tmp directory on the ECS.
    2. In the /tmp directory, run the following command to make the installation package executable:

      chmod +x cuda_9.0.176_384.81_linux.run

    3. Run the following command to install the CUDA Toolkit:

      ./cuda_9.0.176_384.81_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/

      root@user-OpenStack-Nova:/tmp# ./cuda_9.0.176_384.81_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/
        Missing recommended library: libGLU.so
        Missing recommended library: libX11.so
        Missing recommended library: libXi.so
        Missing recommended library: libXmu.so
        Missing recommended library: libGL.so

        Copying samples to /root/NVIDIA_CUDA-9.0_Samples now...
        Finished copying samples.
    4. Run the following command to delete the installation package:

      rm -rf cuda_9.0.176_384.81_linux.run

    5. Run the following commands to check whether the installation is successful:

      cd /usr/local/cuda/samples/1_Utilities/deviceQueryDrv/

      make

      ./deviceQueryDrv

      If the command output contains "Result = PASS", both the CUDA Toolkit and the GPU driver have been installed successfully.

      ./deviceQueryDrv Starting... 
       CUDA Device Query (Driver API) statically linked version  
       Detected 1 CUDA Capable device(s) 
       Device 0: "Tesla V100-PCIE-16GB" 
         CUDA Driver Version:                           9.0 
         CUDA Capability Major/Minor version number:    7.0 
         Total amount of global memory:                 16152 MBytes (16936927232 bytes) 
         (80) Multiprocessors, ( 64) CUDA Cores/MP:     5120 CUDA Cores 
         GPU Max Clock rate:                            1380 MHz (1.38 GHz) 
         Memory Clock rate:                             877 Mhz 
         Memory Bus Width:                              4096-bit 
         L2 Cache Size:                                 6291456 bytes 
         Max Texture Dimension Sizes                    1D=(131072) 2D=(131072, 65536) 3D=(16384, 16384, 16384) 
         Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers 
         Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers 
         Total amount of constant memory:               65536 bytes 
         Total amount of shared memory per block:       49152 bytes 
         Total number of registers available per block: 65536 
         Warp size:                                     32 
         Maximum number of threads per multiprocessor:  2048 
         Maximum number of threads per block:           1024 
         Max dimension size of a thread block (x,y,z): (1024, 1024, 64) 
         Max dimension size of a grid size (x,y,z):    (2147483647, 65535, 65535) 
         Texture alignment:                             512 bytes 
         Maximum memory pitch:                          2147483647 bytes 
         Concurrent copy and kernel execution:          Yes with 7 copy engine(s) 
         Run time limit on kernels:                     No 
         Integrated GPU sharing Host Memory:            No 
         Support host page-locked memory mapping:       Yes 
         Concurrent kernel execution:                   Yes 
         Alignment requirement for Surfaces:            Yes 
         Device has ECC support:                        Enabled 
         Device supports Unified Addressing (UVA):      Yes 
         Supports Cooperative Kernel Launch:            Yes 
         Supports MultiDevice Co-op Kernel Launch:      Yes 
         Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 6 
         Compute Mode: 
            < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) > 
       Result = PASS
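
Step 4 of both procedures appends two statements to /etc/modprobe.d/blacklist-nouveau.conf. If that step is repeated, the statements can end up duplicated in the file. The following is a minimal sketch of one way to make the edit safe to rerun. The function names append_once and blacklist_nouveau are hypothetical, and the sketch must be run as user root to write to the default path.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already there.
# The file is created first if it does not exist.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Add the two Nouveau statements from step 4 to the given file
# (default: /etc/modprobe.d/blacklist-nouveau.conf).
blacklist_nouveau() {
    conf="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    append_once "blacklist nouveau" "$conf"
    append_once "options nouveau modeset=0" "$conf"
}
```

After calling blacklist_nouveau, regenerate the initramfs and restart the ECS as described in step 4 of the procedure for your OS.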