NVIDIA TESLA TURING SOLUTIONS


NVIDIA Tesla Turing: Optimized for Datacenter Applications


NVIDIA GPUs have become the industry-standard solution for deep learning training, and GPU-based inference is rapidly being adopted as well. Many of the world’s leading enterprises now employ NVIDIA GPUs to run inference applications both in the data center and on edge devices. Enterprises that have traditionally run inference on CPUs are switching to NVIDIA GPUs and seeing large performance gains with minimal effort. The NVIDIA Tesla T4, the first Turing-based GPU, delivers breakthrough performance with flexible multi-precision capabilities, from FP32 and FP16 down to INT8 and INT4.
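
As a minimal illustration of reduced-precision inference on a T4 (or any Tensor Core GPU), the hedged PyTorch sketch below runs a placeholder ResNet-50 under automatic mixed precision. The model, batch size, and use of PyTorch are illustrative assumptions rather than NVIDIA-prescribed tooling; INT8/INT4 deployment typically goes through a separate path such as TensorRT calibration.

    import torch
    import torchvision.models as models

    # Assumes a CUDA-capable GPU such as the Tesla T4 is visible to PyTorch.
    device = torch.device("cuda")

    # Placeholder model and input batch; substitute your own trained network.
    model = models.resnet50(weights=None).eval().to(device)
    batch = torch.randn(8, 3, 224, 224, device=device)

    # Autocast runs matrix multiplies and convolutions in FP16 on the Turing
    # Tensor Cores while keeping numerically sensitive ops in FP32.
    with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
        logits = model(batch)

    print(logits.shape)  # torch.Size([8, 1000])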

Tesla T4 Delivers up to 40X Higher Inference Performance than CPUs


The NVIDIA Tesla T4 GPU includes 2,560 CUDA Cores and 320 Tensor Cores, delivering up to 130 TOPS (tera operations per second) of INT8 and up to 260 TOPS of INT4 inference performance. Powered by the new Turing Tensor Cores, the Tesla T4 delivers up to 40X higher inference performance than CPU-based inferencing.
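
These core counts can be sanity-checked on an installed card. The sketch below assumes PyTorch is available and uses the fact that each Turing SM carries 64 FP32 CUDA cores and 8 Tensor Cores.

    import torch

    # On a Tesla T4 the runtime reports 40 streaming multiprocessors (SMs).
    props = torch.cuda.get_device_properties(0)
    print(props.name, "-", props.multi_processor_count, "SMs")

    # The CUDA runtime does not report core counts directly; for Turing,
    # each SM has 64 FP32 CUDA cores and 8 Tensor Cores, so the totals
    # follow from the SM count (40 x 64 = 2,560 and 40 x 8 = 320 on the T4).
    print("CUDA cores: ", props.multi_processor_count * 64)
    print("Tensor Cores:", props.multi_processor_count * 8)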

Tesla T4 Delivers More than 50X the Energy Efficiency of CPU-based Inferencing




In addition to the Turing Tensor Cores, the Turing GPU architecture includes several features that improve the performance of data center applications. Some of the key features are:


Enhanced Video Engine
Compared to the prior-generation Pascal and Volta GPU architectures, Turing supports additional video decode formats such as HEVC 4:4:4 (8/10/12-bit) and VP9 (10/12-bit). The enhanced video engine in Turing can decode a significantly higher number of concurrent video streams than equivalent Pascal-based Tesla GPUs.
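
One simple way to exercise the video engine is to offload decoding to NVDEC through FFmpeg's CUDA hardware acceleration. The sketch below wraps that call in Python and assumes an FFmpeg build compiled with NVDEC support and a local HEVC or VP9 file named input.mp4; both are illustrative assumptions.

    import subprocess

    # Decode a stream on the GPU's NVDEC engine and discard the output,
    # a quick way to gauge pure decode throughput. Requires an FFmpeg
    # build with CUDA/NVDEC hardware acceleration enabled (assumption).
    cmd = [
        "ffmpeg",
        "-hwaccel", "cuda",   # offload decoding to the GPU video engine
        "-i", "input.mp4",    # placeholder HEVC/VP9 input file
        "-f", "null", "-",    # decode only; write no output file
    ]
    subprocess.run(cmd, check=True)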

Turing Multi-Process Service
The Turing GPU architecture inherits the enhanced Multi-Process Service (MPS) feature first introduced in the Volta architecture. Compared to Pascal-based Tesla GPUs, MPS on the Tesla T4 improves inference performance for small batch sizes, reduces launch latency, improves quality of service, and enables servicing a higher number of concurrent client requests.
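
MPS itself is enabled outside the application by starting the nvidia-cuda-mps-control daemon; the hedged sketch below only illustrates the kind of multi-process, small-batch inference workload that benefits from it, with the model and client count chosen arbitrarily.

    import torch
    import torch.multiprocessing as mp
    import torchvision.models as models

    def worker(rank):
        # Each process acts as an independent inference client. With the MPS
        # daemon running (started separately via `nvidia-cuda-mps-control -d`),
        # their small-batch kernels share the GPU concurrently instead of
        # being time-sliced across separate contexts.
        model = models.resnet18(weights=None).eval().cuda()
        batch = torch.randn(1, 3, 224, 224, device="cuda")
        with torch.no_grad():
            out = model(batch)
        print(f"client {rank}: output shape {tuple(out.shape)}")

    if __name__ == "__main__":
        mp.set_start_method("spawn")  # required when using CUDA in child processes
        procs = [mp.Process(target=worker, args=(i,)) for i in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()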

Higher Memory Bandwidth and Larger Memory Size
With 16 GB of GPU memory and 320 GB/sec of memory bandwidth, the Tesla T4 delivers almost double the memory bandwidth and twice the memory capacity of its predecessor, the Tesla P4. With the Tesla T4, hyperscale data centers can nearly double their user density for Virtual Desktop Infrastructure (VDI) applications.
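
Both figures are easy to confirm on an installed card by querying the device properties and timing a large device-to-device copy; the 1 GiB buffer below is an arbitrary choice for illustration, and the measured copy bandwidth will land somewhat below the 320 GB/s peak.

    import torch

    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB of GPU memory")

    # Estimate achievable memory bandwidth by timing a device-to-device copy.
    # Each byte is read once and written once, so traffic is 2 * buffer size.
    n_bytes = 1 << 30                          # 1 GiB buffer (illustrative size)
    src = torch.empty(n_bytes, dtype=torch.uint8, device="cuda")
    src.clone()                                # warm-up to exclude one-time costs

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    dst = src.clone()
    end.record()
    torch.cuda.synchronize()

    seconds = start.elapsed_time(end) / 1000   # elapsed_time() returns milliseconds
    print(f"~{2 * n_bytes / seconds / 1e9:.0f} GB/s effective copy bandwidth")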

How Does the Tesla T4 Compare to the Tesla P4?



                    Tesla T4        Tesla P4
CUDA CORES          2560            2560
TENSOR CORES        320             -
FP16                8.1 TFLOPS      5.5 TFLOPS
MIXED PRECISION     65 TOPS         -
INT8                130 TOPS        22 TOPS
INT4                260 TOPS        -
INTERCONNECT        PCIe            PCIe
MEMORY              16GB GDDR6      8GB GDDR5
BANDWIDTH           320+ GB/s       192+ GB/s
POWER               75 watts        75 watts

Not Sure What You Need?


Let us know what kind of project you have
planned and we can help you decide.





Exxact TensorEX TS2-673917-NTT 2U 2x Intel Xeon processor server - NVIDIA® Tesla® Turing Solution
MPN: TS2-673917-NTT
Contact sales for pricing
Exxact TensorEX TS4-1598415-NTT 4U 2x Intel Xeon Scalable family - 8x NVIDIA® Tesla® Turing GPU
MPN: TS4-1598415-NTT
Contact sales for pricing
Exxact TensorEX TS4-1910483-NTT 4U 2x Intel Xeon processor server - NVIDIA® Tesla Turing Solution
MPN: TS4-1910483-NTT
Contact sales for pricing
Exxact TensorEX TWS-1686525-NTT 2x Intel Xeon CPU - NVIDIA® Tesla® Turing Solution
MPN: TWS-1686525-NTT
Contact sales for pricing