Taking A Deeper Look at the AMD Radeon Instinct GPUs for Deep Learning

December 8, 2017

AMD Radeon Instinct

Introduced earlier this year, the AMD Radeon Instinct line of GPUs was a hot topic at the annual Supercomputing 2017 event. AMD announced the immediate availability of a suite of new high-performance systems powered by AMD EPYC CPUs and AMD Radeon Instinct GPUs to accelerate innovation in supercomputing. AMD plans to pair this hardware portfolio with software, featuring its new ROCm 1.8 open platform with updated development tools and libraries, to enable AMD EPYC-based PetaFLOPS compute systems.

But what do we know about the Radeon Instinct GPUs so far? Put briefly, the Radeon Instinct line is dedicated to large-scale machine intelligence and deep learning applications in the data center. These new graphics cards use some of the latest Radeon technology to boost performance and deliver much higher compute throughput in deep learning tasks. Despite the advanced design and performance, AMD optimized the Radeon Instinct line to be cost-effective for machine learning and deep learning inference, where workloads can take advantage of the accelerator's highly parallel computing capabilities. Government science labs, life sciences, finance, AI, and higher-education institutions are all good candidates for data-centric, HPC-class systems built on AMD Radeon Instinct products.

Three options to choose from to fit your needs:

Accelerator	Compute Units	Peak TFLOPS (FP16/FP32)	Memory Size	Memory Bandwidth
Radeon Instinct MI25	64 nCU (4096 stream processors)	24.6/12.3	16GB	484 GB/s
Radeon Instinct MI8	64 (4096 stream processors)	8.2/8.2	4GB	512 GB/s
Radeon Instinct MI6	36 (2304 stream processors)	5.7/5.7	16GB	224 GB/s
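
One rough way to compare the three cards on a common footing is memory bandwidth available per TFLOP of peak FP32 compute. The sketch below uses only the figures from the table above; the `CARDS` dictionary and the `bandwidth_per_tflop` helper are illustrative names, not part of any AMD tool.

```python
# Peak figures copied from the comparison table above.
# CARDS and bandwidth_per_tflop are illustrative names only.
CARDS = {
    "MI25": {"fp32_tflops": 12.3, "memory_gb": 16, "bandwidth_gbs": 484},
    "MI8": {"fp32_tflops": 8.2, "memory_gb": 4, "bandwidth_gbs": 512},
    "MI6": {"fp32_tflops": 5.7, "memory_gb": 16, "bandwidth_gbs": 224},
}


def bandwidth_per_tflop(card):
    """Return GB/s of memory bandwidth per peak FP32 TFLOP."""
    return card["bandwidth_gbs"] / card["fp32_tflops"]


if __name__ == "__main__":
    for name, card in CARDS.items():
        print(f"{name}: {bandwidth_per_tflop(card):.1f} GB/s per FP32 TFLOP")
```

By this measure the MI8 has the most bandwidth per unit of compute, while the MI25 and MI6 come out nearly identical. These are peak figures only, so real workloads may behave differently.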

Radeon Instinct MI25: World's fastest training accelerator for machine intelligence and deep learning

The Radeon Instinct MI25 accelerator ushers in a new era of compute for the datacenter. Its next-gen “Vega” architecture delivers superior compute performance through a powerful parallel compute engine, while a next-gen programmable geometry pipeline improves processing efficiency and delivers 2x peak throughput-per-clock over previous Radeon architectures. The MI25 provides increased performance density while decreasing energy consumption per operation, making it well suited to today’s demanding datacenter workloads.

Highlights:

Industry Leading Performance for Deep Learning
Next-Gen “Vega” Architecture
Advanced Memory Engine
Large BAR Support for Multi-GPU Peer to Peer
ROCm Open Software Platform for Rack Scale
Optimized MIOpen Libraries for Deep Learning
MxGPU Hardware Virtualization
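
As a sanity check on the MI25's headline numbers, peak throughput is commonly estimated as stream processors × FLOPs per clock × engine clock. The sketch below works backwards from the 12.3 TFLOPS FP32 figure to the engine clock it implies. It assumes each stream processor retires one fused multiply-add (2 FLOPs) per clock, which is the usual convention for peak figures but is not stated in this article; the function name is illustrative.

```python
def implied_clock_ghz(peak_tflops, stream_processors, flops_per_clock=2):
    """Engine clock (GHz) implied by a peak TFLOPS figure.

    flops_per_clock=2 assumes one fused multiply-add per stream
    processor per clock (an assumption, not a figure from the article).
    """
    flops = peak_tflops * 1e12
    return flops / (stream_processors * flops_per_clock) / 1e9


if __name__ == "__main__":
    fp32_clock = implied_clock_ghz(12.3, 4096)
    print(f"Implied MI25 engine clock: {fp32_clock:.2f} GHz")
    # FP16 peak is quoted as 24.6 TFLOPS, twice the FP32 rate.
    print(f"FP16/FP32 ratio: {24.6 / 12.3:.1f}x")
```

Under that assumption the quoted FP32 peak corresponds to an engine clock of roughly 1.5 GHz, and the FP16 figure is exactly double the FP32 one.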

Radeon Instinct MI8: Cost-sensitive, scalable accelerator for machine and deep learning inference applications

The Radeon Instinct MI8 accelerator is based on AMD’s 3rd generation “Fiji” architecture, with improved data-parallel processing and ultra-fast HBM1 memory. It delivers 8.2 TFLOPS of peak performance with up to 512 GB/s of memory bandwidth in a single, passively cooled GPU card. Combined with AMD’s ROCm open software platform, the MI8 is AMD's GPU solution for cost-sensitive system deployments for machine intelligence, deep learning, and HPC workloads, where performance and efficiency are key system requirements.

Highlights:

8.2 TFLOPS FP16 or FP32 Performance
Up To 47 GFLOPS Per Watt FP16 or FP32 Performance
4GB HBM1 on 4096-bit Memory Interface
Passively Cooled Server Accelerator
Large BAR Support for Multi-GPU Peer to Peer
ROCm Open Platform for HPC-Class Rack Scale
Optimized MIOpen Libraries for Deep Learning
MxGPU SR-IOV Hardware Virtualization
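
The throughput and efficiency figures above also imply an approximate power envelope: dividing peak throughput by peak efficiency gives the power at which both would hold at once. This is a back-of-the-envelope sketch, not an official board power rating, and the function name is hypothetical.

```python
def implied_power_watts(peak_tflops, gflops_per_watt):
    """Power (W) at which the quoted peak throughput and efficiency coincide."""
    return peak_tflops * 1000 / gflops_per_watt  # TFLOPS -> GFLOPS, then divide


if __name__ == "__main__":
    # MI8 figures from the highlights above: 8.2 TFLOPS, up to 47 GFLOPS per watt.
    print(f"MI8: about {implied_power_watts(8.2, 47):.0f} W")
```

For the MI8 this works out to about 174 W. The same function applied to the MI6 figures (5.7 TFLOPS, 38 GFLOPS per watt) gives 150 W.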

Radeon Instinct MI6: Versatile training and inference accelerator for machine intelligence and deep learning

The Radeon Instinct MI6 accelerator is based on AMD’s new 4th generation “Polaris” architecture and is built on a 14nm FinFET process. It offers exceptional data-parallel processing capabilities, delivering 5.7 TFLOPS of peak performance with 16GB of ultra-fast GDDR5 memory and up to 224 GB/s of memory bandwidth in a single, passively cooled GPU card. Combined with AMD’s ROCm open software platform, the MI6 is AMD's answer for efficiency- and cost-sensitive inference and edge-training deployments for machine intelligence and deep learning, along with HPC workloads, where performance, large memory, and efficiency are the main system drivers.

Highlights:

5.7 TFLOPS FP16 or FP32 Performance
Up To 38 GFLOPS Per Watt Peak FP16 or FP32 Performance
16GB Ultra-Fast GDDR5 Memory on 256-bit Memory Interface
Passively Cooled Server Accelerator
Large BAR Support for Multi-GPU Peer to Peer
ROCm Open Platform for HPC-Class Scale-Out
Optimized MIOpen Libraries for Deep Learning
MxGPU SR-IOV Hardware Virtualization
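
The 224 GB/s bandwidth and 256-bit interface quoted above together imply the per-pin data rate of the GDDR5 memory, since total bandwidth is the bus width in bytes multiplied by the per-pin transfer rate. A minimal sketch, with an illustrative function name:

```python
def implied_pin_rate_gbps(bandwidth_gb_per_s, bus_width_bits):
    """Per-pin data rate (Gbit/s) implied by total bandwidth and bus width."""
    bytes_per_transfer = bus_width_bits / 8  # bytes moved across the bus per transfer
    return bandwidth_gb_per_s / bytes_per_transfer


if __name__ == "__main__":
    # MI6 figures from the highlights above: 224 GB/s on a 256-bit interface.
    print(f"MI6 GDDR5: {implied_pin_rate_gbps(224, 256):.1f} Gbit/s per pin")
```

For the MI6 this gives 7 Gbit/s per pin.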
