Simcenter STAR-CCM+ CPU vs GPU Runtime Benchmarks with Maya HTT

December 13, 2024

Harnessing GPU Acceleration for Simcenter STAR-CCM+

In the Computational Fluid Dynamics (CFD) field, speed and accuracy are the most important metrics, and engineers often struggle to balance the two. Reducing the time to completion often means sacrificing model fidelity and resolution, and with them the overall accuracy of the model.

Siemens Simcenter™ STAR-CCM+™ has implemented and improved upon its GPU-accelerated computing support in the past couple of years. Engineers are adopting GPU acceleration to speed up their simulation times while maintaining model complexity.

Exxact and Maya HTT, a partner and Independent Software Vendor for Simcenter STAR-CCM+, collaborated on a set of GPU runtime benchmarks to showcase the performance GPUs can offer over traditional CPU-only deployments. Maya HTT’s engineers ran their models on Exxact hardware to deliver these results.

Why GPU Acceleration Matters

The evolution of GPU computing has reached a point where performance matches CPU capabilities and, in many cases, surpasses them. For engineers and researchers, this means they are able to iterate faster and model larger simulations.

Industries ranging from aerospace to automotive to civil engineering can benefit from incorporating GPU-accelerated computing into their computing infrastructure, reducing time-to-market while improving product quality. The ability to simulate larger, more detailed models also opens doors for innovation.

  • Time: Less time to completion means more simulation runs to fine-tune a design before production. Shorter waits deliver results sooner and accelerate the whole workflow.
  • Complexity: Speedup that holds even on highly complex models lets engineers simulate with higher fidelity. While GPU VRAM is limited compared to CPU RAM, GPU vendors continue to release GPUs with larger VRAM capacities specifically for visualization and HPC workloads. The NVIDIA RTX 6000 Ada GPUs tested here have 48GB of VRAM each.
  • Cost: A system with four NVIDIA RTX 6000 Ada GPUs delivers performance equivalent to hundreds of CPU cores. Hundreds of CPU cores require a multi-node setup, which increases the total cost to run and maintain. A four-GPU system, by contrast, can be configured as a workstation that sits on an engineer’s desk or as a single 2U server mounted in a rack.

NVIDIA RTX 6000 Ada STAR-CCM+ Benchmarks

Let’s get into the benchmarks that the team at Maya HTT ran using Simcenter STAR-CCM+ 2406. The first benchmark is an external aerodynamics simulation of a motorsport race car, known as the Le Mans model. This model is used in various Siemens benchmarks and runs with the steady segregated solver and the K-Omega SST turbulence model. We tested one large and one small case.

The GPUs are likely not fully saturated in the smaller 1.27 million-cell simulation, given the minimal performance uplift. At small cell counts, the GPUs cannot ramp up and show their full performance when the whole run completes in less than 5 minutes. This case ran for 2000 iterations at a runtime of 0.08 seconds per iteration. On the larger 16 million-cell model, 4x NVIDIA RTX 6000 Ada GPUs deliver over 7.4x speedup.
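To make the small-case numbers concrete, here is a minimal sketch of the arithmetic behind them. It uses only the figures quoted above (2000 iterations, 0.08 seconds per iteration, 1.27 million and 16 million cells, four GPUs). The assumption that the mesh is split evenly across the GPUs is ours, for illustration only, and the function names are hypothetical.

```python
def total_runtime_seconds(iterations: int, seconds_per_iteration: float) -> float:
    """Wall-clock time of a run with a constant cost per iteration."""
    return iterations * seconds_per_iteration


def cells_per_gpu(total_cells: int, gpu_count: int) -> float:
    """Cells handled by each GPU, ASSUMING an even split of the mesh."""
    return total_cells / gpu_count


# Small Le Mans case: figures quoted in the article.
small_runtime = total_runtime_seconds(iterations=2000, seconds_per_iteration=0.08)
print(f"Small case runtime: {small_runtime:.0f} s ({small_runtime / 60:.1f} min)")

# Work available to each of the four GPUs in the small and large cases.
GPU_COUNT = 4
for label, cells in (("1.27M-cell case", 1_270_000), ("16M-cell case", 16_000_000)):
    print(f"{label}: {cells_per_gpu(cells, GPU_COUNT):,.0f} cells per GPU")
```

The total comes to well under 5 minutes, and under the even-split assumption each GPU in the small case holds roughly a twelfth of the cells it holds in the large case, which is consistent with the GPUs not being saturated.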

To increase the memory requirement, we ran a more computationally expensive simulation with increased physics complexity: a Flamelet-based combustion case. This simulation analyzes a lab flame known as Sandia Flame D, commonly used to validate combustion CFD. The model uses a Flamelet Generated Manifold with Kinetic Rate closure and runs as an implicit unsteady segregated flow with reaction modeling and LES turbulence.

This case gives a more representative picture of GPU acceleration versus CPU runtime. GPU-accelerated computing delivers over 6.2x speedup versus CPU-only computing. Solve times for the 18 million-cell Flamelet combustion simulation were cut from a week to about a single day with 4x NVIDIA RTX 6000 Ada GPUs!
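As a quick sanity check on the "week to a day" figure, the sketch below converts a speedup factor into wall-clock time. The 168-hour baseline is an assumption that reads "a week" literally as 7 x 24 hours; the article does not give exact runtimes, and the function name is hypothetical.

```python
def accelerated_runtime_hours(baseline_hours: float, speedup: float) -> float:
    """Runtime after acceleration, given the baseline runtime and a speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


BASELINE_HOURS = 7 * 24  # ASSUMPTION: "a week" taken as exactly 168 hours
SPEEDUP = 6.2            # speedup quoted for the combustion case

gpu_hours = accelerated_runtime_hours(BASELINE_HOURS, SPEEDUP)
print(f"CPU-only: {BASELINE_HOURS} h, GPU-accelerated: {gpu_hours:.1f} h "
      f"({gpu_hours / 24:.2f} days)")
```

Under that assumption, a 6.2x speedup brings a one-week run down to a little over one day, which matches the rounded claim.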

Key Takeaways

Maya HTT kept the hardware configuration fixed throughout the benchmarks. For smaller-scale simulation models (1.27M cells, 4M cells, and other similar-sized use cases), 4x GPUs are not the optimal configuration for price-to-performance. For larger and more complex simulations, however, we can expect a substantial speedup. Compared to 32 modern CPU cores, we can expect over 6x the performance in these use cases with 4x NVIDIA RTX 6000 Ada GPUs.
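One way to read the "hundreds of CPU cores" comparison is to scale the 32-core baseline by the measured speedups. The sketch below does that under a strong assumption of our own: that CPU performance scales perfectly linearly with core count. Real clusters rarely scale that well, so the result is best treated as a lower bound on the cores needed. The function name is hypothetical.

```python
def cpu_core_equivalents(baseline_cores: int, speedup: float) -> float:
    """Cores needed to match the GPU system, ASSUMING ideal linear CPU scaling."""
    return baseline_cores * speedup


BASELINE_CORES = 32  # CPU baseline quoted in the article

# Speedups quoted in the article for the two larger cases.
for label, speedup in (("Flamelet combustion", 6.2), ("Le Mans, 16M cells", 7.4)):
    cores = cpu_core_equivalents(BASELINE_CORES, speedup)
    print(f"{label}: {speedup}x over {BASELINE_CORES} cores = about {cores:.0f} cores")
```

Both cases land at roughly 200 cores or more, which is where a single-node CPU system typically gives way to a multi-node cluster.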

Another key consideration for your deployment is whether your simulation requires a high degree of numerical precision. The NVIDIA RTX 6000 Ada GPU is optimized for single-precision FP32 calculations and delivers only a small fraction of that throughput in double-precision FP64. Other GPUs to consider that offer strong native double-precision FP64 performance include:

  • NVIDIA H200 NVL 141GB
  • NVIDIA H100 NVL 94GB
  • NVIDIA A800 40GB Active

The NVIDIA H200 and H100 are data center GPUs that can only be installed in a server and deployed in a rack. In return, these GPUs can deliver performance equivalent to several hundred CPU cores. The NVIDIA A800 40GB Active is based on the NVIDIA A100 data center GPU but serves as a workstation-class GPU with an active cooler.
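To illustrate why precision can matter, the sketch below accumulates the same small increment many times in emulated single precision and in double precision. It is a generic floating-point demonstration using Python's standard `struct` module to round values to FP32; it is not a model of how STAR-CCM+ or any GPU solver handles precision, and the step size and count are arbitrary choices for illustration.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def accumulate(step: float, count: int, single_precision: bool) -> float:
    """Add `step` to a running total `count` times, optionally rounding to FP32."""
    if single_precision:
        step = to_float32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single_precision:
            total = to_float32(total)  # emulate storing the sum in FP32
    return total


COUNT = 100_000   # arbitrary illustration values
STEP = 0.1
EXACT = 10_000.0  # mathematically exact result of COUNT * STEP

error_fp32 = abs(accumulate(STEP, COUNT, single_precision=True) - EXACT)
error_fp64 = abs(accumulate(STEP, COUNT, single_precision=False) - EXACT)
print(f"FP32 accumulated error: {error_fp32:.6g}")
print(f"FP64 accumulated error: {error_fp64:.6g}")
```

The FP32 error grows many orders of magnitude larger than the FP64 error. Whether that matters depends on the simulation, which is why precision requirements are worth checking before choosing hardware.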

GPUs are not cheap, but the time they save makes them well worth the investment. Matching GPU-level performance with CPU-only hardware costs more than just the processors; it also includes the maintenance, networking, system memory, and additional systems needed to manage a multi-CPU cluster. Time is money, and the ability to run more simulations, and run them faster, improves both business workflow and quality. A multi-node CPU cluster is costly to run and maintain compared with a single GPU-accelerated system.

If you're considering upgrading your hardware for CFD workflows, GPUs are no longer just an option; they are a necessity. If your workflow can take advantage of GPU acceleration, invest in GPU computing now, as more and more solvers continue to adopt GPUs. Contact us to learn how Exxact can tailor GPU-powered systems to meet your CFD needs. Thanks to Maya HTT for running these benchmarks on Exxact hardware.

About Maya HTT

Maya HTT is an industry-leading software developer and engineering solutions provider focused on CAE, CAD, CAM, and PLM. A long-time partner of Siemens Digital Industries Software, Maya HTT collaborates in providing software, AI, and engineering services to help clients and partners worldwide boost performance, improve quality, drive down costs, reduce inefficiencies, and harness the value of their data.

Accelerate Simulations in STAR-CCM+ with GPUs

With the latest CPUs and most powerful GPUs available, accelerate your STAR-CCM+ simulations and CFD projects with a system optimized for your deployment, budget, and desired performance!
