Systems with multiple GPUs and CPUs are becoming common in a variety of industries as developers rely on more parallelism in applications like AI computing. These include 4-GPU and 8-GPU system configurations that use PCIe as the system interconnect to solve very large, complex problems. But PCIe bandwidth is increasingly becoming the bottleneck at the multi-GPU system level, driving the need for a faster, more scalable multiprocessor interconnect.
NVIDIA® NVLink™ technology addresses this interconnect issue by providing higher bandwidth, more links, and improved scalability for multi-GPU and multi-GPU/CPU system configurations. A single NVIDIA Tesla® V100 GPU supports up to six NVLink links, each moving data at 25 GB/sec in each direction, for a total bandwidth of 300 GB/sec (6 links x 25 GB/sec x 2 directions), or 10X the bandwidth of PCIe Gen 3. The earlier NVLink implementation in Tesla P100 supported up to four links at 20 GB/sec per direction, allowing a gang with an aggregate maximum theoretical bidirectional bandwidth of 160 GB/sec.
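For a concrete sense of how software sees these links, the sketch below uses the CUDA runtime's peer-to-peer query calls to check whether one GPU can reach another directly and how the runtime ranks that path. The device indices 0 and 1 are assumptions for a dual-GPU system; this is only an illustrative sketch, not an NVIDIA sample.

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Ask the runtime whether GPU 0 can access GPU 1's memory as a peer.
    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, /*device=*/0, /*peerDevice=*/1);

    // Relative rank of the GPU 0 -> GPU 1 path; lower values indicate a
    // better-performing link between the two devices.
    int perfRank = 0;
    cudaDeviceGetP2PAttribute(&perfRank, cudaDevP2PAttrPerformanceRank, 0, 1);

    printf("GPU0 -> GPU1 peer access: %s, performance rank: %d\n",
           canAccess ? "yes" : "no", perfRank);
    return 0;
}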
New Levels of GPU-to-GPU Acceleration
First introduced with the NVIDIA Pascal™ architecture, NVLink in Tesla V100 raises the signaling rate from 20 to 25 GB/sec per link in each direction. It can be used for GPU-to-CPU or GPU-to-GPU communication, as in the NVIDIA DGX-1 server with Tesla V100.
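As a minimal sketch of such GPU-to-GPU communication, a CUDA application can enable peer access and copy directly between device memories; the transfer then travels over NVLink where the GPUs are linked, and over PCIe otherwise. The snippet assumes a system with at least two GPUs, and the buffer size and device numbering are arbitrary choices for illustration.

#include <cstdio>
#include <cuda_runtime.h>

#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    printf("CUDA error: %s\n", cudaGetErrorString(e)); return 1; } } while (0)

int main() {
    const size_t bytes = 256 << 20;   // 256 MiB test buffer (arbitrary size)
    void *src = nullptr, *dst = nullptr;

    // Allocate a buffer on each GPU.
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&src, bytes));
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&dst, bytes));

    // Allow the current device (GPU 1) to map GPU 0's memory directly.
    CHECK(cudaDeviceEnablePeerAccess(0, 0));

    // Direct device-to-device copy without staging through host memory.
    CHECK(cudaMemcpyPeer(dst, 1, src, 0, bytes));
    CHECK(cudaDeviceSynchronize());

    printf("Copied %zu bytes from GPU 0 to GPU 1\n", bytes);
    CHECK(cudaFree(dst));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(src));
    return 0;
}

On a multi-GPU server, nvidia-smi topo -m reports which GPU pairs are actually connected by NVLink rather than PCIe.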
Figure: NVLink delivers up to 46% speedup vs. PCIe. Server configuration: dual Xeon E5-2699 v4 (2.6 GHz) with 8x Tesla V100, PCIe or NVLink; ResNet-50 training for 90 epochs on the 1.28M-image ImageNet dataset.
NVIDIA NVLink can bring up to 31% more performance to an otherwise identically configured server. Its dramatically higher bandwidth and reduced latency will enable even larger deep learning workloads to scale in performance as they continue to grow.