AMD 4th Gen EPYC Genoa Data Center CPU Now Available

November 15, 2022

On November 10th, 2022, AMD launched its long-awaited next-generation EPYC server processor, the EPYC 9004 Series data center CPU, code-named Genoa. EPYC Genoa is the 4th generation in the lineup and packs more new technology than a single sentence can cover. Astronomical core counts, higher power targets, improved instructions per clock, and leading-edge connectivity make EPYC Genoa AMD’s fastest and most efficient data center CPU to date, capturing the essence of what AMD looks to achieve with its technology.

Over the last five years, AMD has fought its way toward the top of the data center market. Gen 1 EPYC Naples and Gen 2 EPYC Rome competed fairly well in the data center CPU market, offering good alternatives for HPC users. Gen 3 EPYC Milan brought out the boxing gloves with excellent competition, and more users migrated to the high-density, high-core-count king, especially once AMD released Milan-X, a stacked L3 cache variant for lower latency and higher throughput. Now we are at Gen 4 EPYC Genoa, a launch that has shown not only promise but performance leadership. AMD did not hold back in developing this next-generation product and is looking to be the future of data center computing.

AMD EPYC Genoa delivers unmatched performance in highly threaded workloads while retaining the agility to handle lightly threaded tasks. It uses a hybrid chiplet design (5nm + 6nm), is AMD’s first data center processor to adopt DDR5, PCIe 5.0, and the CXL interface, and supports AVX-512 extensions for highly optimized HPC and AI workloads.


AMD 4th Gen EPYC Genoa solutions are ready to ship. Consolidate your computing into highly dense HPC nodes for next level performance and efficiency. Contact Sales Today.


EPYC Genoa Zen 4 Hybrid Chiplet Architecture

AMD EPYC Genoa houses its hybrid chiplet architecture in a larger package with up to twelve 5nm Core Complex Dies (CCDs), each packing eight cores, and a central 6nm I/O die that ties all the chiplets together. The four additional CCDs compared to the previous-gen Milan necessitate a larger chip package and integrated heat spreader (IHS), which in turn helps improve cooling.

EPYC Genoa uses AMD’s Zen 4 cores for a 14% increase in IPC (instructions per clock) and features up to 12 CCDs, 96 cores, and 192 threads, which is 50% more cores than the previous generation! Each CCD carries 32MB of shared L3 cache, for up to 384MB per processor, along with 1MB of L2 cache per core. That is a 50% increase in L3 cache and a 100% increase in L2. Genoa-X, set to release sometime in 2023, incorporates 3D V-Cache for added L3 cache and what we expect to be mind-boggling performance.
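The core and cache totals above follow from simple multiplication. Here is a minimal sketch of that arithmetic; the 32MB-per-CCD L3 figure and the eight-CCD, 64-core layout of the previous generation are assumptions brought in for illustration, and all constant and function names are hypothetical.

```python
# Back-of-the-envelope totals for a fully populated Genoa package.
# Assumption: 32 MB of L3 per CCD, so that 12 CCDs give the 384 MB total.
CORES_PER_CCD = 8
THREADS_PER_CORE = 2   # simultaneous multithreading
L3_PER_CCD_MB = 32     # assumed per-CCD figure
L2_PER_CORE_MB = 1


def genoa_totals(ccds=12):
    """Return core, thread, and cache totals for a Genoa part with `ccds` CCDs."""
    cores = ccds * CORES_PER_CCD
    return {
        "cores": cores,
        "threads": cores * THREADS_PER_CORE,
        "l3_mb": ccds * L3_PER_CCD_MB,
        "l2_mb": cores * L2_PER_CORE_MB,
    }


genoa = genoa_totals(12)
milan_cores = 8 * CORES_PER_CCD  # assumed previous-gen layout: eight CCDs
core_uplift = genoa["cores"] / milan_cores - 1

print(genoa)
print(f"{core_uplift:.0%} more cores than the previous generation")
```

Passing a smaller CCD count to `genoa_totals` gives a rough picture of lower-core-count parts, although real SKUs may disable cores within a CCD, so treat those numbers as estimates.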

The Genoa processors are built on the new SP5 socket, so deploying Genoa means a whole new platform. SP5 will also support AMD’s future data center processors like Genoa-X (3D V-Cache variant), Bergamo (Cloud Native), and Siena (Edge Computing), all of which are slated for delivery in 2023.

We know firsthand that switching platforms can be costly and cumbersome, but with the added performance AMD EPYC Genoa offers, you can lower your total cost of ownership while saving space, energy, and time.

The Generational Changes to Zen 4 EPYC Genoa

DDR5 - Early Adopter for Faster Throughput

With the move to Zen 4 cores, AMD EPYC Genoa is compatible only with DDR5. This change brings higher bandwidth, better power and memory efficiency, improved scalability, and higher-capacity DIMMs.

DDR5 is currently a turn-off for some users due to pricing and higher latency. But that is a small price to pay for more capacity per node and double the throughput. EPYC Genoa supports up to 6TB of memory per socket with 460GB/s of bandwidth across the entire memory stack. More memory can improve performance in the many workloads that depend on fast memory access.
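As a rough check on the 460GB/s figure, theoretical peak bandwidth is channels times transfer rate times bus width. The sketch below assumes 12 channels of DDR5-4800 for Genoa and 8 channels of DDR4-3200 for the previous generation, each with a 64-bit (8-byte) data path; those configuration details are not stated in this article and the function name is hypothetical.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s (decimal units).

    channels * MT/s * bytes per transfer gives MB/s; divide by 1000 for GB/s.
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000


# Assumed configurations (not taken from the article):
genoa_bw = peak_bandwidth_gbs(channels=12, mega_transfers_per_s=4800)
milan_bw = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)

print(f"Genoa: {genoa_bw:.1f} GB/s")
print(f"Previous gen: {milan_bw:.1f} GB/s")
print(f"Ratio: {genoa_bw / milan_bw:.2f}x")
```

Under these assumptions the per-socket peak comes out at 460.8 GB/s, and the generational ratio of 2.25x is in the same ballpark as the roughly 2.3x bandwidth figure AMD quotes. Sustained bandwidth in real workloads will be lower than the theoretical peak.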

DDR5 may have higher latency (as of today), but it is actually not that far off from DDR4. AMD’s comparison of last-generation EPYC with EPYC Genoa shows 2.3x higher bandwidth and single-rank efficiency, with a small latency trade-off of about 10%.

Furthermore, DDR5 places the PMIC (power management IC) on the DIMM itself instead of relying on the motherboard to manage power. Power is a hot commodity for data centers, and any savings can be put toward added performance elsewhere.

PCIe 5.0 - The Highest Interconnect

With so many cores and a large I/O die, Genoa supports up to 128 PCIe Gen 5.0 lanes per CPU and up to 160 lanes in dual-CPU configurations. Genoa’s PCIe 5.0 delivers a 2x speedup in I/O bandwidth over last generation’s PCIe Gen 4.0, perfect for high-throughput and responsive workloads such as cloud computing and AI/ML. What is really important, though, is the technology PCIe 5.0 brings with it: CXL.
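The 2x figure comes straight from the per-lane signaling rate. Here is a minimal sketch, assuming the standard rates of 16 GT/s for PCIe 4.0 and 32 GT/s for PCIe 5.0 with 128b/130b line encoding; the helper name is hypothetical, and the results are theoretical per-direction maximums before protocol overhead.

```python
def lane_gbs(gigatransfers_per_s, payload_bits=128, line_bits=130):
    """Approximate usable GB/s per lane, per direction.

    Each transfer carries one bit per lane; 128b/130b encoding means
    128 of every 130 bits on the wire are payload. Divide by 8 for bytes.
    """
    return gigatransfers_per_s * payload_bits / line_bits / 8


GEN4_GT = 16  # GT/s per lane, PCIe 4.0
GEN5_GT = 32  # GT/s per lane, PCIe 5.0

gen4_x16 = 16 * lane_gbs(GEN4_GT)
gen5_x16 = 16 * lane_gbs(GEN5_GT)
gen5_128_lanes = 128 * lane_gbs(GEN5_GT)

print(f"PCIe 4.0 x16: {gen4_x16:.1f} GB/s per direction")
print(f"PCIe 5.0 x16: {gen5_x16:.1f} GB/s per direction")
print(f"128 lanes of PCIe 5.0: {gen5_128_lanes:.0f} GB/s per direction")
```

A single x16 slot lands at roughly 63 GB/s per direction on PCIe 5.0, which is why one Gen 5 slot can feed a device that previously needed two Gen 4 slots' worth of lanes.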

CXL: The Future of Expansion

CXL, or Compute Express Link, is a cache-coherent interconnect built on PCIe 5.0 that allows I/O, cache, or memory expansion over PCIe lanes, linking the CPU and its RAM with devices such as GPUs, accelerators, and network cards. The additional memory does not have to be bound to a single piece of hardware but can instead be leveraged by other devices.

EPYC Genoa supports CXL 1.1+ Type 3 devices, which target memory expansion rather than acceleration. So if you need additional DDR5 RAM for even more virtualization, it's as easy as slapping in an expansion card over PCIe 5.0.

AVX-512: Boosting Deep Learning, AI, and ML Workloads

AMD incorporated AVX-512 extensions, which drastically improve deep learning and AI workloads. Since Exxact is keen on HPC and AI workloads, AVX-512 is a huge deal for accelerating these heavy tasks, and it has been a big reason some users have stuck with Intel. AMD measured an approximate 4.2x increase in NLP throughput, 3x in image classification throughput, and 3.5x in object detection throughput when comparing its top-of-the-line EPYC 9654 against its last-generation flagship, the EPYC 7763.
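If you want to confirm which AVX-512 extensions a Linux host actually exposes, one simple approach is to scan the `flags` line of `/proc/cpuinfo`. The sketch below is illustrative only: the function name and the sample text are hypothetical, and the sample is not captured output from a real Genoa system.

```python
def avx512_flags(cpuinfo_text):
    """Return the sorted AVX-512 feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # The line looks like "flags : fpu vme ... avx512f ..."
            _, _, value = line.partition(":")
            return sorted(flag for flag in value.split() if flag.startswith("avx512"))
    return []


# Hypothetical sample text for illustration (not real Genoa output).
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "flags\t\t: fpu sse2 avx2 avx512f avx512_vnni avx512_bf16\n"
)

found = avx512_flags(SAMPLE_CPUINFO)
print(found)
```

On a real Linux machine you could call `avx512_flags(open("/proc/cpuinfo").read())`. An empty list means the kernel does not report any AVX-512 support for that CPU.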

Determinism: Tailored Performance through Thermals

AMD also offers a power management feature called Power Determinism. Similar to their consumer technology Precision Boost Overdrive (PBO), it lets the CPU extract more performance when cooling and temperatures leave room to do so, up to its configured power limit. It also works inversely, pulling back to protect the CPU from thermal throttling and increasing performance only when headroom is available. This enables higher TDPs but even higher performance-per-watt for better efficiency. Power Determinism rests on the acknowledgment that every piece of silicon is unique: some chips simply have better performance limits than others.

Performance Numbers

Third-party sources have already had a chance to benchmark AMD EPYC Genoa, and the results are astonishing. We are featuring our favorite life science applications for molecular dynamics and 3D molecule reconstruction: GROMACS, NAMD, and RELION. In the tables below, 2P denotes a dual-processor configuration and PD denotes Power Determinism mode.

GROMACS: MPI CPU - Input: water_GMX50_bare

| Processor & Cores | ns per Day (more is better) |
| --- | --- |
| EPYC 9654 2P - PD (192C) | 19.834 |
| EPYC 9654 2P (192C) | 18.686 |
| EPYC 9554 2P - PD (128C) | 18.382 |
| EPYC 9554 2P (128C) | 17.104 |
| EPYC 9654 - PD (96C) | 12.956 |
| EPYC 7773X 2P (128C) | 11.216 |
| EPYC 9654 (96C) | 11.147 |
| EPYC 9554 (64C) | 9.641 |

The AMD EPYC 9654 2P configuration performed at about 1.66x the speed of the last-generation 7773X 2P configuration, and a single 9654 roughly matches the dual 7773X. The uplift in IPC, additional cores, DDR5, and AVX-512 allow a single AMD EPYC Genoa processor to keep pace with a dual-CPU configuration of the last-generation EPYC Milan-X flagship with 3D V-Cache. This test shows how generational improvement coupled with the automatic boost clocks of Power Determinism enables such high levels of performance; a single 9654 in Power Determinism mode pulls ahead of the dual 7773X by about 15%.
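The ratios quoted above can be reproduced directly from the GROMACS table. The following is a small sketch of that arithmetic; the dictionary keys are shortened labels chosen for illustration (with "1P" added to mark the single-processor rows), and the helper name is hypothetical.

```python
# ns/day figures copied from the GROMACS table (higher is better).
GROMACS_NS_PER_DAY = {
    "EPYC 9654 2P - PD": 19.834,
    "EPYC 9654 2P": 18.686,
    "EPYC 9654 1P - PD": 12.956,
    "EPYC 9654 1P": 11.147,
    "EPYC 7773X 2P": 11.216,
}


def speedup_vs(baseline_key, results):
    """For a higher-is-better metric, speedup is value / baseline."""
    base = results[baseline_key]
    return {name: value / base for name, value in results.items()}


gromacs_speedup = speedup_vs("EPYC 7773X 2P", GROMACS_NS_PER_DAY)
for name, ratio in gromacs_speedup.items():
    print(f"{name:20s} {ratio:.2f}x")
```

A ratio just under 1.0 for the single 9654 in its default mode is what "roughly matches" means here, while the Power Determinism row shows the single-socket part edging ahead of the dual Milan-X system.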

NAMD: ATPase Simulation - 327,506 Atoms

| Processor & Cores | Days/ns (fewer is better) |
| --- | --- |
| EPYC 9654 2P - PD (192C) | 0.11211 |
| EPYC 9654 2P (192C) | 0.12797 |
| EPYC 9554 2P - PD (128C) | 0.15564 |
| EPYC 9554 2P (128C) | 0.17214 |
| EPYC 9654 1P - PD (96C) | 0.22157 |
| EPYC 7773X 2P (128C) | 0.22366 |
| EPYC 9654 (96C) | 0.24798 |
| EPYC 9554 (64C) | 0.28101 |

With Power Determinism enabled, the dual EPYC 9654 nearly doubles the performance of the last generation’s dual EPYC 7773X, and once again a single 96-core EPYC Genoa is comparable to the dual 7773X configuration.

RELION for Cryo-EM: Basic - Device: CPU

| Processor & Cores | Seconds (fewer is better) |
| --- | --- |
| EPYC 9654 2P - PD (192C) | 127.04 |
| EPYC 9654 2P (192C) | 127.52 |
| EPYC 9554 2P - PD (128C) | 130.65 |
| EPYC 9554 2P (128C) | 131.60 |
| EPYC 7773X 2P (128C) | 147.36 |
| EPYC 9654 1P - PD (96C) | 245.03 |
| EPYC 9654 (96C) | 247.09 |
| EPYC 9554 (64C) | 256.41 |

In the RELION test, the picture changes and the performance gap between Genoa and Milan-X is less significant. RELION relies on the number of cores more than on per-core performance, which shows that the earlier result of a 96-core unit beating out a 128-core configuration will not always hold.

However, the results still show the dual EPYC 9554 configuration pulling ahead of the dual EPYC 7773X by a healthy 12%, a jump in line with AMD’s 14% IPC uplift.

(Benchmarks found on OpenBenchmarking.org).
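NAMD and RELION report lower-is-better metrics (days per nanosecond and seconds), so the speedup is the baseline divided by the new value rather than the other way around. Here is a brief sketch using figures from the two tables; the dictionary keys and function name are hypothetical labels chosen for illustration.

```python
# Figures copied from the NAMD and RELION tables (lower is better).
NAMD_DAYS_PER_NS = {
    "EPYC 9654 2P - PD": 0.11211,
    "EPYC 9654 1P - PD": 0.22157,
    "EPYC 7773X 2P": 0.22366,
}
RELION_SECONDS = {
    "EPYC 9554 2P": 131.60,
    "EPYC 7773X 2P": 147.36,
}


def speedup_lower_is_better(baseline, candidate):
    """For time-like metrics, a smaller value is faster: speedup = baseline / candidate."""
    return baseline / candidate


namd_2p = speedup_lower_is_better(
    NAMD_DAYS_PER_NS["EPYC 7773X 2P"], NAMD_DAYS_PER_NS["EPYC 9654 2P - PD"]
)
namd_1p = speedup_lower_is_better(
    NAMD_DAYS_PER_NS["EPYC 7773X 2P"], NAMD_DAYS_PER_NS["EPYC 9654 1P - PD"]
)
relion_2p = speedup_lower_is_better(
    RELION_SECONDS["EPYC 7773X 2P"], RELION_SECONDS["EPYC 9554 2P"]
)

print(f"NAMD, dual 9654 (PD) vs dual 7773X:   {namd_2p:.2f}x")
print(f"NAMD, single 9654 (PD) vs dual 7773X: {namd_1p:.2f}x")
print(f"RELION, dual 9554 vs dual 7773X:      {relion_2p:.2f}x")
```

The NAMD dual-socket ratio lands just under 2x and the RELION ratio at about 1.12x, matching the "nearly doubles" and "12%" statements in the text.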

What It Means for Data Centers

Having so much memory and so many PCIe lanes for interconnects makes it easier to deploy multi-tenant instances. In these types of workloads, more cores, more RAM, and more I/O mean more virtual machines. Genoa’s security improvements allow for over 1,000 fully encrypted virtual machines. AMD EPYC is a no-brainer for virtual private servers.

These benchmarks and the rise in TDP might raise questions about the efficiency of AMD’s newest CPUs. We can safely say that the increase in TDP is overshadowed by the immense gains Genoa brings to data centers. Besides, consumers and data centers alike were looking for CPUs that push the power envelope further to gain even more performance, and AMD has delivered with this generation.

With Genoa in mind, we are very excited about what AMD has in store for us in the use-case-specific Zen 4 data center CPUs: Bergamo for cloud computing, Genoa-X for extreme-cache technical computing, and Siena for edge and telecom.

And with all those cores, threads, and high-throughput memory, these data center CPUs are perfect for any HPC workload you could throw at them, from training deep learning models and deploying AI algorithms to solving and simulating complex molecular dynamics. AMD EPYC Genoa is the heart of the system, empowering the GPUs to work to their fullest potential.


Looking to build your HPC infrastructure or upgrade and consolidate your rack space? Exxact is taking orders for AMD EPYC Genoa Server Solutions.
Contact our talented engineering and sales team today!
