NVIDIA Quick Install Guides: Streamlining Your vGPU Setup

Setting up NVIDIA Virtual GPU (vGPU) technology can significantly enhance the performance of virtualized environments, especially for graphics-intensive applications. This guide provides a streamlined approach to installing and configuring NVIDIA vGPU, ensuring a quick and efficient setup process. Whether you are deploying on VMware vSphere or XenServer, these quick install guides will walk you through the essential steps.

Before diving into the installation, it’s crucial to ensure your system meets the necessary prerequisites. This typically means having a supported hypervisor (VMware vSphere or XenServer), the NVIDIA vGPU software package downloaded from the NVIDIA Licensing Portal, and a supported Windows guest operating system configured within your chosen hypervisor.

Understanding GPU Display Modes for vGPU Deployment

Certain NVIDIA GPU boards come with factory settings that might not be immediately compatible with NVIDIA vGPU software. These GPUs often support multiple display modes, such as display-off and display-enabled modes. For optimal NVIDIA vGPU software deployments, it’s generally recommended to use the display-off mode.

The following table lists GPUs that support multiple display modes and their factory default settings:

GPU                           Mode as Supplied from the Factory
----------------------------  ---------------------------------
NVIDIA A40                    Display-off
NVIDIA L40                    Display-off
NVIDIA L40S                   Display-off
NVIDIA L20                    Display-off
NVIDIA L20 liquid cooled      Display-off
NVIDIA RTX 5000 Ada           Display enabled
NVIDIA RTX 6000 Ada           Display enabled
NVIDIA RTX A5000              Display enabled
NVIDIA RTX A5500              Display enabled
NVIDIA RTX A6000              Display enabled

It’s important to note that even if a GPU is factory-set to display-off mode (like the NVIDIA A40), it might have been switched to display-enabled mode if its settings were previously altered.

To adjust the display mode for GPUs supporting this feature, you’ll need the displaymodeselector tool. You can request this tool from the NVIDIA Display Mode Selector Tool page on the NVIDIA Developer website.

Important Note: The displaymodeselector tool is exclusively compatible with the GPUs listed in the table above. Other GPUs supporting NVIDIA vGPU software generally do not require display mode switching, unless explicitly specified.
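If you need to switch a supported GPU between modes, the tool is run from the host with elevated privileges. The command-line options shown below are assumptions for illustration only; confirm the exact syntax against the documentation bundled with the displaymodeselector tool:

    # Illustrative sketch only -- the option names are assumptions, not confirmed syntax.
    # List the display modes the installed GPU supports (assumed option).
    ./displaymodeselector --listgpumodes
    # Switch the GPU to the display-off mode recommended for vGPU deployments (assumed option).
    ./displaymodeselector --gpumode physical_display_disabled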

Quick Installation of NVIDIA Virtual GPU Manager

The NVIDIA Virtual GPU Manager is a critical component that must be installed on your hypervisor before you can configure guests for NVIDIA vGPU. The installation process varies depending on the hypervisor platform you are using. Below are quick guides for VMware vSphere and XenServer.

For more detailed instructions, always refer to the specific NVIDIA vGPU installation guide for your hypervisor version.

Quick Guide: Installing NVIDIA Virtual GPU Manager on VMware vSphere

The NVIDIA Virtual GPU Manager for VMware vSphere operates on the ESXi host and is distributed as a ZIP archive containing several software components.

Prerequisites:

  • Ensure you have downloaded the NVIDIA vGPU software ZIP archive from the NVIDIA Licensing Portal.
  • Extract the NVIDIA Virtual GPU Manager software components from the downloaded ZIP archive.

Installation Steps:

  1. Copy Component Files: Transfer the NVIDIA Virtual GPU Manager component files to your ESXi host’s datastore.

  2. Enter Maintenance Mode: Put the ESXi host into maintenance mode using the following command:

    esxcli system maintenanceMode set --enable true
  3. Install NVIDIA vGPU Hypervisor Host Driver: Install the host driver component using the esxcli command:

    esxcli software vib install -d /vmfs/volumes/datastore/host-driver-component.zip
  4. Install NVIDIA GPU Management Daemon: Install the GPU Management daemon component:

    esxcli software vib install -d /vmfs/volumes/datastore/gpu-management-daemon-component.zip

    Replace datastore, host-driver-component.zip, and gpu-management-daemon-component.zip with your actual datastore name and component file names.

  5. Exit Maintenance Mode: Disable maintenance mode:

    esxcli system maintenanceMode set --enable false
  6. Reboot ESXi Host: Reboot the ESXi host for the changes to take effect:

    reboot
  7. Verify Daemon Status: Check if the NVIDIA GPU Management daemon has started successfully:

    /etc/init.d/nvdGpuMgmtDaemon status
  8. Verify GPU Communication: Confirm that the NVIDIA kernel driver is communicating with the physical GPUs by running:

    nvidia-smi

    Successful execution will list all GPUs in your system.
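As an optional follow-up to these steps, you can confirm from the ESXi shell that the NVIDIA components are registered on the host. The host name, datastore, and file names below are placeholders matching the examples above:

    # Copy the components to the host datastore over SSH (requires SSH/SCP to be enabled on the host).
    scp host-driver-component.zip gpu-management-daemon-component.zip root@esxi-host:/vmfs/volumes/datastore/
    # After installation and reboot, list the installed NVIDIA VIBs.
    esxcli software vib list | grep -i nvidia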

Quick Guide: Installing NVIDIA Virtual GPU Manager on XenServer

For XenServer, the NVIDIA Virtual GPU Manager is provided as an RPM Package Manager (RPM) file and is installed from the XenServer Control Domain (dom0) shell.

Installation Steps:

  1. Copy RPM File: Copy the NVIDIA Virtual GPU Manager RPM file to the XenServer dom0 shell.

  2. Install RPM Package: Install the package using the rpm command:

    [root@xenserver ~]# rpm -iv NVIDIA-*.rpm

    Replace NVIDIA-*.rpm with the actual RPM file name.

  3. Reboot XenServer: Reboot the XenServer platform:

    [root@xenserver ~]# shutdown -r now
  4. Verify Installation: After reboot, verify the NVIDIA Virtual GPU Manager installation by checking for the NVIDIA kernel driver in the loaded kernel modules:

    [root@xenserver ~]# lsmod |grep nvidia

    A successful installation will show nvidia modules listed.
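For reference, a successful check looks roughly like the sketch below; the module size and use count are placeholders, and nvidia-smi can be run from dom0 as an additional confirmation:

    [root@xenserver ~]# lsmod | grep nvidia
    nvidia              <size>  <use count>
    [root@xenserver ~]# nvidia-smi       # should list the physical GPUs in the host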

Managing ECC Memory for NVIDIA vGPU

Error Correcting Code (ECC) memory enhances data integrity by detecting and correcting memory errors. While beneficial, ECC memory support with NVIDIA vGPU depends on factors like GPU model, vGPU type, and hypervisor software version.

ECC memory is generally supported with C-series and Q-series vGPUs on compatible GPUs, but not with A-series and B-series vGPUs. A-series and B-series vGPUs can still reside on a physical GPU that has ECC memory enabled, but because these vGPU types do not support ECC, enabling it for them might only introduce overhead.

Enabling ECC memory on a physical GPU has the following effects:

  • ECC capability is exposed on all supported vGPUs on that physical GPU.
  • VMs supporting ECC memory will have it enabled (with the option to disable it within the VM).
  • ECC can be enabled or disabled per VM. Changing ECC in a VM does not affect the frame buffer available to vGPUs.

GPUs based on Pascal architecture and later generally support ECC memory with NVIDIA vGPU. You can use nvidia-smi -q to check the ECC status of a GPU.
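For example, the ECC-related portion of the nvidia-smi -q output looks roughly like this; the exact labels and values vary by driver version and GPU:

    nvidia-smi -q
    ...
        Ecc Mode
            Current                       : Enabled
            Pending                       : Enabled
    ...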

Older GPUs such as the Tesla M60 and M6 support ECC memory in non-virtualized environments, but NVIDIA vGPU does not support ECC on these GPUs. In graphics mode, ECC is disabled by default on these GPUs.

Compatibility issues may arise if your hypervisor software version doesn’t support ECC with NVIDIA vGPU. In such cases, if ECC is enabled, NVIDIA vGPU might fail to start. You’ll need to disable ECC memory on all GPUs if you are using NVIDIA vGPU in an unsupported environment.

Quick Guide: Disabling ECC Memory

Disable ECC memory if it’s unsuitable for your workloads or if your hypervisor software version does not support ECC memory with NVIDIA vGPU.

Steps to Disable ECC:

  1. Check ECC Status: Use nvidia-smi -q to check the current ECC mode (Enabled or Disabled).

    nvidia-smi -q

    Example output showing ECC enabled.

  2. Disable ECC: Use nvidia-smi -e 0 to disable ECC. To disable for all GPUs, run:

    nvidia-smi -e 0

    To disable ECC for a specific GPU (for example, the GPU at PCI bus ID 0000:02:00.0), use:

    nvidia-smi -i 0000:02:00.0 -e 0
  3. Reboot: Reboot the host or VM for changes to apply.

  4. Verify Disabled ECC: Re-run nvidia-smi -q to confirm ECC is now disabled.

    nvidia-smi -q

    Example output showing ECC disabled.
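Keep in mind that the new ECC setting takes effect only after the reboot in step 3. Until then, nvidia-smi -q reports it under Pending while Current still shows the old mode, roughly as follows (illustrative output):

    nvidia-smi -e 0
    nvidia-smi -q
    ...
        Ecc Mode
            Current                       : Enabled     # old mode, still active
            Pending                       : Disabled    # applied after the reboot
    ...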

Quick Guide: Enabling ECC Memory

Enable ECC memory if it’s suitable for your workloads and supported by your environment.

Steps to Enable ECC:

  1. Check ECC Status: Use nvidia-smi -q to check the current ECC mode.

    nvidia-smi -q

    Example output showing ECC disabled.

  2. Enable ECC: Use nvidia-smi -e 1 to enable ECC. For all GPUs:

    nvidia-smi -e 1

    For a specific GPU (e.g., 0000:02:00.0):

    nvidia-smi -i 0000:02:00.0 -e 1
  3. Reboot: Reboot the host or VM.

  4. Verify Enabled ECC: Re-run nvidia-smi -q to confirm ECC is enabled.

    nvidia-smi -q

    Example output showing ECC enabled.

Attaching an NVIDIA vGPU Profile to a VM

To leverage the power of NVIDIA vGPU, you need to attach a vGPU profile to your virtual machines. The process varies depending on the hypervisor.

Quick Guide: Changing Default Graphics Type in VMware vSphere

After installing the vGPU Manager VIB on VMware vSphere, the default graphics type is set to “Shared”. For vGPU support, you must change this to “Shared Direct”.

Caution: Change the default graphics type before configuring any vGPU-enabled VMs. Note that VM console output in the vSphere Web Client is not available for vGPU-enabled VMs.

Steps to Change Graphics Type:

  1. Login to vCenter: Access vCenter Server using the vSphere Web Client.

  2. Navigate to Host Graphics: Select your ESXi host, go to the Configure tab, then Graphics, and click Host Graphics.

  3. Edit Host Graphics: On the Host Graphics tab, click Edit.

  4. Select Shared Direct: In the “Edit Host Graphics Settings” dialog, choose Shared Direct and click OK.

  5. Verify Graphics Devices: Go to the Graphics Devices tab and ensure the configured type for each physical GPU intended for vGPU is “Shared Direct”. If any show “Shared”, edit them:

    1. Select the GPU and click the Edit icon.
    2. Choose Shared Direct and click OK.
  6. Restart Services: Restart the ESXi host, or restart the Xorg service and nv-hostengine without rebooting (see the consolidated example after these steps). To restart the services:

    1. For vSphere releases before 7.0 Update 1: stop the Xorg service.
    2. Stop nv-hostengine: nv-hostengine -t
    3. Wait about 1 second to allow nv-hostengine to stop.
    4. Start nv-hostengine: nv-hostengine -d
    5. For vSphere releases before 7.0 Update 1: start the Xorg service: /etc/init.d/xorg start
  7. Confirm Active Type: In the Graphics Devices tab, verify that both the “Active type” and “Configured type” are “Shared Direct” for each physical GPU.
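If you restart the services instead of rebooting the host, the sequence on the ESXi shell is sketched below. The Xorg lines apply only to vSphere releases before 7.0 Update 1, and the xorg stop invocation is an assumption based on the start command shown above; adjust it to your ESXi build if needed:

    /etc/init.d/xorg stop      # only for vSphere before 7.0 Update 1 (assumed counterpart of the start command)
    nv-hostengine -t           # terminate nv-hostengine
    sleep 1                    # wait about 1 second for it to stop
    nv-hostengine -d           # start nv-hostengine as a daemon
    /etc/init.d/xorg start     # only for vSphere before 7.0 Update 1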

Quick Guide: Configuring a vSphere VM with NVIDIA vGPU

Caution: VM console in vSphere Web Client is unavailable for vGPU VMs. Ensure you have an alternative access method like Omnissa Horizon or VNC before proceeding.

Configuration Steps:

The steps vary slightly between vSphere 8 and vSphere 7.

Configuring a vSphere 8 VM with NVIDIA vGPU

  1. Open VM Settings: In vCenter Web UI, right-click the VM and select Edit Settings.

  2. Add PCI Device: In the “Edit Settings” window, from the ADD NEW DEVICE menu, choose PCI Device.

  3. Select vGPU Type: In the “Device Selection” window, choose the desired vGPU type and click SELECT.

    Note: vCS vGPU types are not supported on VMware vSphere.

  4. Confirm Settings: Back in the “Edit Settings” window, click OK.

Configuring a vSphere 7 VM with NVIDIA vGPU

Repeat these steps for each vGPU if adding multiple to a single VM.

  1. Open VM Settings: In vCenter Web UI, right-click the VM and select Edit Settings.

  2. Go to Virtual Hardware: Click the Virtual Hardware tab.

  3. Add Shared PCI Device: In the “New device” list, select Shared PCI Device and click Add. The “PCI device” field should auto-populate with “NVIDIA GRID vGPU”.

  4. Select GPU Profile: From the “GPU Profile” dropdown, choose the desired vGPU type and click OK.

    Note: vCS vGPU types are not supported on VMware vSphere.

  5. Reserve VM Memory: Ensure all VM memory is reserved:

    1. Edit VM settings.
    2. Expand “Memory” section.
    3. Click Reserve all guest memory (All locked).

Quick Guide: Configuring a XenServer VM with Virtual GPU

  1. Power Off VM: Ensure the VM is powered off.

  2. Open VM Properties: Right-click the VM in XenCenter, select Properties, and then select the GPU property.

  3. Select GPU Type: Choose the desired GPU type from the “GPU type” dropdown list.

  4. Start VM: Start the VM from XenCenter or using xe vm-start in the dom0 shell.
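If you prefer the command line to XenCenter, the same assignment can typically be made with the xe CLI from the dom0 shell. This is a sketch with placeholder UUIDs that you would look up on your own pool:

    [root@xenserver ~]# xe vgpu-type-list                  # find the UUID of the desired vGPU type
    [root@xenserver ~]# xe gpu-group-list                  # find the UUID of the GPU group to use
    [root@xenserver ~]# xe vgpu-create vm-uuid=<vm-uuid> gpu-group-uuid=<gpu-group-uuid> vgpu-type-uuid=<vgpu-type-uuid>
    [root@xenserver ~]# xe vm-start uuid=<vm-uuid>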

Installing NVIDIA vGPU Software Graphics Driver in Windows

After creating and booting a Windows VM with a vGPU profile, you need to install the NVIDIA vGPU software graphics driver to fully enable GPU functionality.

Installation Steps:

  1. Copy Driver Package: Transfer the NVIDIA Windows driver package to the guest VM.

  2. Execute Installer: Run the driver package to start the installation.

  3. Accept License Agreement: Click through and accept the license agreement.

  4. Select Express Installation: Choose Express Installation and click NEXT.

  5. Restart if Prompted: If prompted, choose Restart Now to reboot the VM, or reboot later.

  6. Verify Driver Installation: After reboot, verify the driver:

    1. Right-click on the desktop.
    2. Select NVIDIA Control Panel.
    3. Go to Help > System Information.

    The NVIDIA Control Panel will display the vGPU being used, its capabilities, and the installed NVIDIA driver version.
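You can also confirm the driver from a command prompt inside the guest by running nvidia-smi, which is installed with the Windows driver (on recent drivers it is on the system PATH; on older drivers it may reside under C:\Program Files\NVIDIA Corporation\NVSMI):

    nvidia-smi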

Configuring a Licensed Client for NVIDIA vGPU

To utilize NVIDIA vGPU software, a licensed client needs to obtain a license from an NVIDIA License System (NLS) service instance (either CLS or DLS).

The configuration process is similar for both CLS and DLS instances but depends on the client OS.

Quick Guide: Configuring a Licensed Client on Windows with Default Settings

Steps for Windows Client Configuration:

  1. Copy Client Configuration Token: Copy the client configuration token to the %SystemDrive%:\Program Files\NVIDIA Corporation\vGPU Licensing\ClientConfigToken folder.
  2. Restart NvDisplayContainer Service: Restart the NvDisplayContainer service.

The NVIDIA service on the client should now automatically acquire a license from the CLS or DLS instance.

Verifying NVIDIA vGPU Software License Status

After client configuration, verify the license status to ensure successful licensing.

Verification Steps:

Run nvidia-smi -q or nvidia-smi --query from the licensed client VM, not the hypervisor host.

nvidia-smi -q

If licensed, the output will include “License Status : Licensed” along with the license expiration date.
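The relevant portion of the output looks roughly like the following; the product name shown is illustrative, and the expiry reflects your own license:

    vGPU Software Licensed Product
        Product Name                  : NVIDIA RTX Virtual Workstation
        License Status                : Licensed (Expiry: <date and time>)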


This comprehensive guide provides quick install steps for NVIDIA vGPU, streamlining the setup process and enabling you to efficiently leverage NVIDIA vGPU technology in your virtualized environments. Remember to always consult the official NVIDIA documentation for the most detailed and up-to-date information.
