NVIDIA TESLA TURING SOLUTIONS
THE LATEST NVIDIA GPUs | FULL TURNKEY SOLUTIONS | 3 YEAR WARRANTY & SUPPORT
NVIDIA Tesla Turing: Optimized for Datacenter Applications
NVIDIA GPUs have become the standard industry solution for deep learning training, and GPU-based inference is rapidly being adopted as well. Many of the world’s leading enterprises now employ NVIDIA GPUs for running inference applications both in the data center and in edge devices. Enterprises that have traditionally run inference on CPUs are switching to NVIDIA GPUs and seeing dramatic performance gains with minimal effort. The NVIDIA Tesla T4, the first Turing-based GPU, delivers breakthrough performance with flexible multi-precision capabilities, from FP32 and FP16 down to INT8 and INT4.
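To give a feel for what reduced-precision inference involves, here is a minimal NumPy sketch of symmetric INT8 quantization. This is an illustrative toy, not NVIDIA's implementation; in practice, tools such as TensorRT handle calibration and quantization automatically.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization of FP32 values to INT8."""
    scale = np.abs(x).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map INT8 values back to approximate FP32 values."""
    return q.astype(np.float32) * scale

# Quantize some synthetic FP32 "weights" and measure the error introduced.
rng = np.random.default_rng(0)
weights = rng.standard_normal(1000).astype(np.float32)

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

rel_error = np.abs(weights - restored).max() / np.abs(weights).max()
print(f"max relative error: {rel_error:.4f}")
```

The key trade-off this illustrates: INT8 storage and arithmetic are far cheaper than FP32, while the worst-case rounding error stays below about 0.4% of the largest weight, which is why many inference workloads tolerate reduced precision with little accuracy loss.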
Tesla T4 Delivers up to 40X Higher Inference Performance over CPU
The NVIDIA Tesla T4 GPU includes 2,560 CUDA Cores and 320 Tensor Cores, delivering up to 130 TOPS (tera operations per second) of INT8 and up to 260 TOPS of INT4 inference performance. Compared to CPU-based inference, the Tesla T4, powered by the new Turing Tensor Cores, delivers up to 40X higher inference performance.
Tesla T4 Delivers More than 50X the Energy Efficiency of CPU-based Inferencing
The Turing GPU architecture, in addition to Turing Tensor Cores, includes several features that improve the performance of data center applications. Key features include:
Enhanced Video Engine
Compared to the prior-generation Pascal and Volta GPU architectures, Turing supports additional video decode formats such as HEVC 4:4:4 (8/10/12-bit) and VP9 (10/12-bit). The enhanced video engine in Turing can decode a significantly higher number of concurrent video streams than equivalent Pascal-based Tesla GPUs.
Turing Multi-Process Service
The Turing GPU architecture inherits the enhanced Multi-Process Service (MPS) feature first introduced in Volta. Compared to Pascal-based Tesla GPUs, MPS on the Tesla T4 improves inference performance at small batch sizes, reduces launch latency, improves quality of service, and services a higher number of concurrent client requests.
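To benefit from MPS, the MPS control daemon must be running on the server before client processes start. A minimal configuration sketch on Linux (the directory paths are conventional defaults and an assumption here; adjust for your deployment, which requires an NVIDIA driver and a Volta-or-later GPU):

```shell
# Choose directories for the MPS pipes and logs (defaults shown; any
# writable paths work).
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log

# Start the MPS control daemon; CUDA processes launched afterward with the
# same CUDA_MPS_PIPE_DIRECTORY will share the GPU through MPS.
nvidia-cuda-mps-control -d

# Shut the daemon down when finished.
echo quit | nvidia-cuda-mps-control
```

This is a config fragment rather than a runnable example; consult NVIDIA's MPS documentation for multi-user and multi-GPU deployments.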
Higher Memory Bandwidth and Larger Memory Size
With 16 GB of GPU memory and 320 GB/sec of memory bandwidth, the Tesla T4 delivers roughly 1.7X the memory bandwidth and twice the memory capacity of its predecessor, the Tesla P4. With the Tesla T4, hyperscale data centers can nearly double their user density for Virtual Desktop Infrastructure (VDI) applications.
How Does the Tesla T4 Compare to the P4?

                  TESLA T4             TESLA P4
Part Number       900-2G183-0000-000   900-2G414-0000-000
CUDA Cores        2,560                2,560
Tensor Cores      320                  -
FP16              8.1 TFLOPS           5.5 TFLOPS
Mixed Precision   65 TOPS              -
INT8              130 TOPS             22 TOPS
INT4              260 TOPS             -
Interconnect      PCIe                 PCIe
Memory            16 GB GDDR6          8 GB GDDR5
Bandwidth         320+ GB/s            192+ GB/s
Power             75 watts             75 watts
Not sure what you need?
Tell us what kind of project you have planned, and we can help you decide.
NVIDIA DGX™ A100
The universal system for all AI workloads, offering unprecedented compute density, performance, and flexibility in the world’s first 5 petaFLOPS AI system. Order yours today.
- Rack Height: 2U
- Processor Supported: 2x Intel Xeon Scalable Family
- Drive Bays: 8x 3.5" Hot-Swap (2x NVMe)
- Supports up to 4x Double-Wide cards

- Rack Height: 4U
- Processor Supported: 2x Intel Xeon Scalable Family
- Drive Bays: 14x 2.5" Hot-Swap
- Supports up to 8x Double-Wide cards

- Rack Height: 4U
- Processor Supported: 2x Intel Xeon Scalable Family
- Drive Bays: 24x 3.5" Hot-Swap
- Supports up to 20x NVIDIA Tesla T4 GPUs