CPU Core Count vs Clock Speeds

August 2, 2024

How Core Count and Clock Speed Affect Performance

When configuring a system, the CPU, the brain of the computer, is a top priority. For AMD Ryzen and Intel Core, the product stack is easy to understand: the better the processor, the higher the core count and clock speed. With workstation processors (AMD Threadripper and Intel Xeon W) and server processors (Intel Xeon Scalable and AMD EPYC), however, the product stack is long and confusing, because each combination of core count, clock speed, and capabilities is designed to fit a specific target workload.

Both core count and clock speed are crucial in determining the performance and efficiency of your target applications. Understanding the strengths, weaknesses, and trade-offs in core count and clock speed is essential for determining the Exxact systems you need to achieve the best computational performance for the money spent.

We will explore the details of core count, clock speeds, and what to prioritize for your next workload. These suggestions are generalized, and individual applications can behave differently from one another.

Core Count - Accelerating Individual Tasks

Core count is the number of processing units, or cores, in your CPU. Each core executes instructions independently, which enables parallel processing: multiple instruction streams run simultaneously.

The more cores a CPU has, the more tasks it can handle simultaneously. This is crucial for HPC applications that can be divided into small tasks, such as data analytics and cloud virtualization. However, single-threaded and sequential workloads (where each step relies on the previous computational step) cannot be sped up with additional cores.
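The limit on what extra cores can do is commonly estimated with Amdahl's law. The sketch below is an illustration rather than a benchmark: the parallel fractions are assumed values chosen for the example, and `amdahl_speedup` is a hypothetical helper name. It shows that a fully sequential job gains nothing from more cores, while a mostly parallel job still falls well short of a speedup equal to its core count.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup predicted by Amdahl's law.

    parallel_fraction: share of the runtime that can be split across cores (0..1).
    cores: number of cores working on the parallel share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed parallel fractions, for illustration only.
for fraction in (0.0, 0.5, 0.95):
    for cores in (24, 96):
        speedup = amdahl_speedup(fraction, cores)
        print(f"parallel={fraction:.2f} cores={cores:3d} speedup={speedup:.2f}x")
```

In this model the serial share of the work sets a ceiling on speedup no matter how many cores are added, which is why the rest of this article keeps returning to clock speed.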

Clock Speed - Accelerating Per Task

Clock speed is important for applications heavily reliant on sequential single-threaded tasks, where the workload cannot be effectively divided into parallel processes.

Clock speed, measured in gigahertz (GHz), is the frequency at which a CPU core cycles. Each instruction takes some number of cycles, so on the same architecture a higher clock speed means a core processes more instructions in a given time, resulting in quicker computation.

Certain HPC applications, such as simulations with single-threaded code or mathematical computations, are not easily parallelized. In these cases, the clock speed becomes crucial as it directly affects the time required to complete each task. Higher clock speeds lead to faster execution of individual instructions, resulting in quicker completion of single-threaded workloads.
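As a first-order illustration of why clock speed dominates single-threaded work, runtime can be approximated as instructions divided by (instructions per cycle × clock frequency). The sketch below assumes a fixed instructions-per-cycle value and a made-up instruction count, and it ignores boost clocks, memory stalls, and architectural differences. `single_thread_seconds` is a hypothetical helper, not a vendor formula.

```python
def single_thread_seconds(instructions: float, clock_ghz: float, ipc: float = 1.0) -> float:
    """Rough runtime estimate for a purely sequential job on one core.

    instructions: total instructions to execute (assumed known).
    clock_ghz: sustained core clock in GHz.
    ipc: average instructions completed per cycle (assumed constant).
    """
    if instructions < 0 or clock_ghz <= 0 or ipc <= 0:
        raise ValueError("instructions must be >= 0; clock_ghz and ipc must be > 0")
    cycles = instructions / ipc
    return cycles / (clock_ghz * 1e9)


WORK = 1e12  # hypothetical instruction count for one sequential task

for label, ghz in (("4.2GHz core", 4.2), ("2.5GHz core", 2.5)):
    print(f"{label}: {single_thread_seconds(WORK, ghz):.1f} s")
```

Under these assumptions the runtime ratio equals the inverse ratio of the clocks, so the slower core takes 4.2 / 2.5 = 1.68 times as long on the same sequential job.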

More Cores or More Clock Speed?

Manufacturers are racing to create the densest CPU on the market. In recent years, AMD has released the 96-core EPYC 9654 and the 128-core EPYC 9754, one of the densest x86 processors today. More cores allow for parallel processing, where tasks can be divided among the cores and executed simultaneously.

More isn’t Always Better

However, more cores aren’t better in every use case. The 128-core AMD EPYC 9754 runs at a below-average 2.25GHz base clock. It isn’t made for per-core speed; it is made with cloud-native and virtualization workloads and data center density in mind.

Cloud providers distribute groups of cores for light cloud workloads like fetching data, web apps, hosting, and microservices, which aren’t computationally heavy. In microservice workloads, more groups mean more work is done. If a workload called for more computationally heavy tasks like video rendering, these low-clock speed cores would struggle to keep up, leaving valuable time on the table.

To achieve optimal performance in any workload, balance core count and clock speed and determine the configuration that addresses your needs. Video rendering, for example, calls for a balanced approach: a processor with a moderate to high core count and a relatively high clock speed. We will go over the generally recommended CPUs for each workload.

Some applications also price their licensing on a per-core basis. For Ansys, enabling more cores requires purchasing additional licensing packages. In these cases, choose processors with the highest clock speeds to get the most performance out of each licensed core while keeping costs to a minimum.
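The per-core licensing trade-off can be sketched with simple arithmetic. The example below assumes a hypothetical license that covers 24 cores and uses cores × base clock as a crude throughput proxy; real application scaling will differ. The two CPUs and their figures come from the recommendation table later in this article, and `licensed_core_ghz` is an illustrative helper name.

```python
def licensed_core_ghz(cpu_cores: int, base_ghz: float, licensed_cores: int) -> float:
    """Crude throughput proxy: usable cores x base clock.

    Cores beyond the license cap cannot be used by the application,
    so only min(cpu_cores, licensed_cores) cores contribute.
    """
    if cpu_cores < 1 or licensed_cores < 0 or base_ghz <= 0:
        raise ValueError("invalid core count, license cap, or clock")
    usable_cores = min(cpu_cores, licensed_cores)
    return usable_cores * base_ghz


LICENSED_CORES = 24  # hypothetical license cap, for illustration only

cpus = {
    "AMD EPYC 9274F": (24, 4.1),  # cores, base GHz (from the table in this article)
    "AMD EPYC 9654": (96, 2.4),
}

for name, (cores, ghz) in cpus.items():
    score = licensed_core_ghz(cores, ghz, LICENSED_CORES)
    print(f"{name}: {score:.1f} core-GHz usable under a {LICENSED_CORES}-core license")
```

With the cap in place, the 96-core part cannot use its extra cores, so under this proxy the higher-clocked 24-core part delivers more per licensed core.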

GPU Native Workloads

Exxact builds data center and enterprise workstations and solutions for any workload, but our bread and butter is GPU-accelerated computing. NVIDIA pioneered the use of GPUs’ parallel computing prowess in games and applied it to high-performance computing workloads like simulation and deep learning. Advancements in AI, drug discovery, and engineering simulation would not be possible without GPU parallel computing.

CPUs have a limited number of cores designed for computing complex tasks, whereas GPUs are strictly tasked with math calculations. When an application is GPU-native, all calculations are offloaded to the GPU, and the CPU cores are left idle until they receive and export data points. The retrieval of these data points is only sped up by high clock speeds. Applications like AMBER for molecular dynamics, Ansys Fluent for CFD simulation, and AI training all rely predominantly on GPU computing, with varying but low reliance on CPU power. A faster CPU with a lower core count would best fit these workloads.

However, some workloads are GPU-accelerated (as opposed to GPU-native): they use GPUs for certain processes while still relying on CPUs for most of the computation. This includes workloads like Finite Element Analysis and Data Analytics, both of which need to process data, run calculations, and sequentially analyze all data.

Exxact has worked with thousands of customers over our lifespan, encountering numerous workloads. That’s why we offer custom configurable solutions to increase productivity, inspire creativity, and fuel innovation for any type of computing. Our sales engineers are here to help configure the right system for your workload.

Fueling Innovation with an Exxact Multi-GPU Server

Accelerate your workload with the right system optimized for your use case. Exxact 4U Servers are not just high-performance computers; they are tools for propelling your innovation to new heights.

Choosing the CPU for Certain HPC Workloads

Certain applications and workloads benefit from both a high clock speed and an ample number of cores. Assuming your system is GPU-equipped, here are our recommendations on what to prioritize for your CPU.

| CPU Recommendations | High Clock Speed | Balanced | High Core Count |
| --- | --- | --- | --- |
| AMD Threadripper | 7965WX (24 Cores, 4.2GHz) | 7985WX (64 Cores, 3.2GHz) | 7995WX (96 Cores, 2.5GHz) |
| AMD EPYC | 9274F (24 Cores, 4.1GHz) | 9474F (48 Cores, 3.6GHz) | 9654 (96 Cores, 2.4GHz) |
| Intel Xeon W | W5-3425X (12 Cores, 3.2GHz) | W9-3475X (36 Cores, 2.2GHz) | W9-3495X (56 Cores, 1.9GHz) |
| Intel Xeon Scalable | Gold 6444Y (16 Cores, 3.6GHz) | Platinum 8558P (48 Cores, 2.7GHz) | Platinum 8592V (64 Cores, 2.0GHz) |

CPU for Molecular Dynamics & Cryo-EM

If your workload runs on GPUs, prioritize CPU clock speed over core count. If your application is fully GPU-native like AMBER or GROMACS, 2 to 4 CPU cores per GPU are more than sufficient.

If your workload is GPU-accelerated, as in Cryo-EM, opt for a processor with a balance of cores and clock speed. From the CPU recommendations above, opt for Balanced or High Clock Speed.
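The cores-per-GPU rule of thumb reduces to simple arithmetic. The sketch below assumes a single-socket system and 4 CPU cores per GPU by default. It computes the minimum core count for a given number of GPUs, then picks the highest-clocked CPU from the High Clock Speed column of the table above that meets it. The function names are hypothetical, and the selection logic is an illustration, not a sizing tool.

```python
# High Clock Speed column from the recommendation table: name -> (cores, base GHz)
HIGH_CLOCK_CPUS = {
    "AMD Threadripper 7965WX": (24, 4.2),
    "AMD EPYC 9274F": (24, 4.1),
    "Intel Xeon W5-3425X": (12, 3.2),
    "Intel Xeon Gold 6444Y": (16, 3.6),
}


def min_cpu_cores(gpus: int, cores_per_gpu: int = 4) -> int:
    """Minimum CPU cores for a GPU-native workload (2 to 4 cores per GPU)."""
    if gpus < 0 or cores_per_gpu < 1:
        raise ValueError("gpus must be >= 0 and cores_per_gpu >= 1")
    return gpus * cores_per_gpu


def fastest_cpu_with_cores(min_cores: int):
    """Return the highest-clocked listed CPU with at least min_cores, or None."""
    eligible = {name: spec for name, spec in HIGH_CLOCK_CPUS.items() if spec[0] >= min_cores}
    if not eligible:
        return None
    return max(eligible, key=lambda name: eligible[name][1])


needed = min_cpu_cores(gpus=4)
print(f"4 GPUs need at least {needed} cores: {fastest_cpu_with_cores(needed)}")
```

When no CPU in the High Clock Speed column has enough cores, the function returns `None`, which is the cue to look at the Balanced column instead.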

CPU for FEA Engineering Simulation

In finite element analysis simulations of mechanical deformation, GPUs are not used as the primary accelerator because of the simulation's sequential nature. The CPU takes on most of the workload. Therefore, a balanced core count and high clock speed would best fit the task. While more cores can accelerate the workload, fast cores should be prioritized, since the number of CPUs can be scaled up as the workload increases. Opt for Balanced or High Core.

CPU for CFD Engineering Simulation

In computational fluid dynamics, there are CPU solvers and GPU solvers, the latter of which are drastically more performant. GPUs can accelerate simulations by over 10x, and a single GPU can be as powerful as 100 CPU cores. When running CFD simulations with GPUs, a higher clock speed CPU should be prioritized. Keep in mind the other workloads performed on this system, since you may need a balanced CPU for other simulation workloads that don't use GPUs as heavily.

CPU for AI Training and Inferencing

Prioritize core count, even though more cores are not paramount for accelerating AI training itself. For example, a server with 8 GPUs could have 32 cores, or more if data is being pulled from elsewhere, to cover data processing overhead. Decent clock speeds still help speed up data processing.

AI training can be distributed and extremely parallelized. Therefore, more CPU cores mean the server can handle more tasks simultaneously. Executing more tasks simultaneously allows for more scalability and the training of larger models. Opt for High Core.

CPU for Video & 3D Rendering

We prioritize clock speeds while still having ample cores. If you have too many cores and lower clocks, your real-time viewing will stutter and be slower than preferred. Higher clocks will improve responsiveness in editing software and speed up real-time previews. The additional cores will help with exporting, encoding, and rendering. Opt for Balanced or High Core.

CPU for HPC Cloud Services and Virtualization

The more cores available in your cluster, the more instances can be run as independent services. If the virtualization clients are deployed for dense workloads, clock speeds should be considered. However, maximizing cores enables more cloud instances and virtualized web applications to be launched. Opt for a server processor like AMD EPYC or Intel Xeon Scalable for 24/7 operation, and opt for High Core.
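The relationship between core count and instance density can be sketched as integer division. This example assumes one physical core per vCPU with no oversubscription, plus a hypothetical number of cores reserved for the host; real hypervisors often oversubscribe and count threads rather than cores. `max_instances` is an illustrative helper.

```python
def max_instances(total_cores: int, vcpus_per_instance: int, reserved_host_cores: int = 0) -> int:
    """Instances that fit when each vCPU maps to one physical core."""
    if total_cores < 0 or reserved_host_cores < 0:
        raise ValueError("core counts must be non-negative")
    if vcpus_per_instance < 1:
        raise ValueError("vcpus_per_instance must be at least 1")
    usable_cores = max(total_cores - reserved_host_cores, 0)
    return usable_cores // vcpus_per_instance


# Core counts from the AMD EPYC row of the table; instance size and reserve are assumed.
for name, cores in (("EPYC 9274F", 24), ("EPYC 9654", 96)):
    count = max_instances(cores, vcpus_per_instance=4, reserved_host_cores=4)
    print(f"{name}: {count} four-vCPU instances")
```

Under these assumptions the 96-core part hosts 23 four-vCPU instances against 5 for the 24-core part, which is the density argument for High Core processors in this workload.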

Conclusions

It's important to consider that the ideal balance between clock speeds and core count can vary depending on the specific workload and software optimization. Different applications have different requirements, and it's crucial to assess the workload characteristics to determine the optimal configuration. These processors are suggestions to help guide you in the right direction and educate you in choosing the right processor for your workload.

It is good practice to check benchmarks, read the documentation, talk to application experts, and, of course, ask a professional like our team at Exxact. Our team has not only encountered all kinds of workloads but has also configured systems to run them optimally and efficiently.

We're Here to Deliver the Tools to Power Your Research

With access to the highest-performing hardware, Exxact can offer a platform optimized for your deployment, budget, and desired performance so you can make an impact with your research!
