
SXM vs PCIe: GPUs Best for Training LLMs like GPT-4
What is NLP and What is an LLM?
Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that enables machines to understand and interpret human language. Recent advancements in deep learning have led to the emergence of Large Language Models (LLMs) with uncanny natural language understanding, models that have already reshaped entire industries and will shape what comes next. Startups and established companies alike have opted to train these LLMs on dedicated hardware from NVIDIA: the DGX.
Large Language Models (LLMs) are a type of language model consisting of a neural network with a huge number of parameters, trained on massive amounts of unlabeled text. The most famous LLM is OpenAI's GPT (Generative Pre-trained Transformer) series, trained on billions of words and serving as the foundation for ChatGPT. Various applications build on GPT to create extremely convincing chatbots, summarizers, and more. LLMs have shown remarkable performance across a wide range of NLP tasks such as language translation, question answering, and text generation. ChatGPT (originally built on GPT-3) and ChatGPT Plus (built on GPT-4) have made enormous waves in bringing AI into the public and consumer spotlight.
Enabling our computers to interact with our physical world is becoming a reality. LLMs have numerous applications across industries, including personalized chatbots, customer service automation, sentiment analysis, content creation, and even code generation. So why do these large-scale organizations opt for an NVIDIA DGX? What is the difference between DGX and traditional PCIe GPUs?
NVIDIA DGX/HGX and the SXM GPU Form Factor
The SXM architecture is a high-bandwidth socketed solution for connecting NVIDIA Tensor Core GPUs to NVIDIA's proprietary DGX and HGX systems. For each generation of NVIDIA Tensor Core GPUs (P100, V100, A100, and now H100), DGX systems and HGX boards come with an SXM socket that provides high bandwidth, robust power delivery, and more for the matching GPU daughter cards.

The specialized HGX system board interconnects all 8 GPUs via NVLink, enabling high GPU-to-GPU bandwidth. NVLink moves data between GPUs extremely fast, allowing them to operate as one enormous GPU, exchanging data without traversing PCIe or involving the CPU. The NVIDIA DGX H100 connects 8 SXM5 H100s, each with 900 GB/s of NVLink bandwidth, via 4 NVLink Switch chips for a total bidirectional bandwidth of 7.2 TB/s. Each H100 SXM GPU is also connected to the CPU via PCI Express, so results computed on any of the 8 GPUs can be relayed back to the CPU. We will go over the architectural schematics later.
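As a quick sanity check, the aggregate figure is just the per-GPU NVLink bandwidth multiplied across the eight GPUs. A minimal sketch in Python, assuming the per-GPU figure quoted above:

```python
# Back-of-envelope check of the aggregate NVLink bandwidth quoted above.
# Assumption: 8 GPUs at 900 GB/s of bidirectional NVLink bandwidth each,
# as stated for the DGX H100.
num_gpus = 8
nvlink_bw_per_gpu_gbs = 900  # GB/s, bidirectional, per H100 SXM5

aggregate_bw_tbs = num_gpus * nvlink_bw_per_gpu_gbs / 1000
print(f"Aggregate bidirectional NVLink bandwidth: {aggregate_bw_tbs:.1f} TB/s")
# -> 7.2 TB/s, matching the total quoted for the DGX H100
```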
NVIDIA H100 PCIe Form Factor
You just can't achieve the same bandwidth with the PCIe variant, even using the NVLink Bridges that H100 PCIe cards support. These bridges connect GPUs only in pairs, providing 600 GB/s of bidirectional bandwidth between the two bridged cards instead of the full 900 GB/s between any of the 8 GPUs in the system.
Now don't get it wrong: the NVIDIA H100 PCIe is an extremely capable GPU that can be deployed with ease. It can be slotted into an existing data center with minimal architectural changes, making it a low-friction upgrade. The H100 NVL extends the powerful PCIe card by pairing two together for a total of 188GB of HBM3, with stated performance comparable to the H100 SXM5.

Our systems go through rigorous testing and validation. Explore Exxact Hopper H100 solutions for both SXM and PCIe options!
The Difference Between H100 SXM and PCIe
It is well known in the data center and AI industry that the NVIDIA DGX is gold: the best of the best and the most powerful AI machine. The most prominent example is OpenAI training ChatGPT on NVIDIA DGX systems; in fact, NVIDIA hand-delivered the first DGX-1 to OpenAI back in 2016.
Large corporations flock to the NVIDIA DGX not because it is shiny but because of its ability to scale. SXM GPUs are better suited for scale-up deployments, with eight H100 GPUs fully interconnected via NVLink and NVSwitch technology. In DGX and HGX, the way the 8 SXM GPUs connect differs from PCIe: each GPU is connected to all 4 NVLink Switch chips, essentially enabling all the GPUs to operate as one big GPU. This scalability can expand further with the NVIDIA NVLink Switch System, which can interconnect up to 256 H100 GPUs across 32 DGX H100 systems to create a GPU-accelerated AI factory.
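This "one big GPU" behavior is what distributed training frameworks exploit. Below is a minimal sketch of data-parallel training with PyTorch DistributedDataParallel; the model and data are stand-in placeholders, not a real LLM, but when run across NVLink-connected GPUs, the NCCL backend routes the gradient all-reduce over NVLink/NVSwitch rather than PCIe:

```python
# Minimal data-parallel training sketch (placeholder model, not a real LLM).
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE; NCCL is the GPU backend
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(4096, 4096).cuda(local_rank)  # stand-in model
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):
        x = torch.randn(32, 4096, device=local_rank)
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()  # DDP all-reduces gradients across all GPUs here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with torchrun --nproc_per_node=8, each backward pass synchronizes gradients across all eight GPUs, which is exactly the traffic pattern the NVLink Switch topology accelerates.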
On the other hand, with H100 PCIe cards, as found in the H100 NVL, only pairs of GPUs are connected via NVLink Bridge: GPU 1 is directly connected only to GPU 2, GPU 3 only to GPU 4, and so on. GPU 1 and GPU 8 are not directly connected and can only exchange data over PCIe lanes, consuming CPU resources along the way. The NVIDIA DGX and HGX system boards have all the SXM GPUs interconnected through the NVLink Switch chips, so GPU-to-GPU exchanges are not slowed down by the limitations of the PCIe bus. Sending data back to the CPU still happens over PCIe lanes.
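One way to inspect this topology from software is to probe peer-to-peer access between devices. A minimal PyTorch sketch (note this only reports whether direct P2P is possible between two GPUs, not which link carries it; nvidia-smi topo -m shows the actual link types):

```python
# Print a matrix of which GPU pairs support direct peer-to-peer access.
# On an NVLink-switched SXM system every pair should report True; on
# PCIe cards the results depend on the platform and any NVLink bridging.
import torch

n = torch.cuda.device_count()
for i in range(n):
    row = [torch.cuda.can_device_access_peer(i, j) if i != j else "-"
           for j in range(n)]
    print(f"GPU {i}: {row}")
```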

By bypassing PCI Express when exchanging data between GPUs, the blisteringly fast SXM H100s achieve maximum throughput with fewer slowdowns than their PCIe counterparts, perfect for training extremely large AI models on huge amounts of training data. Higher power consumption and a proprietary form factor are the tradeoffs for peak performance that can dramatically shorten training and inference times. And when it comes to developing large language models and serving inference to millions of users, the highest tier of compute is desired to ensure stability, fluidity, and reliability.
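To make the interconnect difference concrete, here is a back-of-envelope estimate of the time to all-reduce a large model's gradients, using the standard ring all-reduce cost model in which each GPU transfers 2(n-1)/n times the gradient size. The model size and bandwidth figures are illustrative assumptions (and real frameworks overlap communication with compute), so treat this only as a feel for why link bandwidth matters:

```python
# Illustrative estimate only; the 175B-parameter model and the bandwidth
# figures are assumptions, not measurements.
params = 175e9               # hypothetical 175B-parameter model
grad_bytes = params * 2      # fp16 gradients -> ~350 GB

n = 8                        # GPUs in one node
traffic_per_gpu = 2 * (n - 1) / n * grad_bytes  # ring all-reduce traffic

for name, bw_gbs in [("NVLink SXM (~900 GB/s, bidirectional figure)", 900),
                     ("PCIe Gen5 x16 (~64 GB/s per direction)", 64)]:
    t = traffic_per_gpu / (bw_gbs * 1e9)
    print(f"{name}: ~{t:.1f} s per full-gradient all-reduce")
```

Even in this crude model the gap is roughly an order of magnitude per synchronization step, and that gap compounds over the many thousands of steps in a large training run.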
What Should You Choose? H100 SXM or H100 PCIe?
It comes down to your use case. Large language models and generative AI require ungodly amounts of performance, but the number of users, the workload, and the training magnitude all play a large part in picking the right system.
NVIDIA DGX and HGX H100 systems are best for organizations that can take full advantage of the raw computing performance and let nothing go to waste. Constant training, inferencing, and operation can quickly reduce the total cost of ownership when the hardware is used to its fullest potential.
NVIDIA DGX has the best scalability and delivers performance that cannot be matched by any other server in its form factor. Linking multiple NVIDIA DGX H100s with NVSwitch systems can scale them into SuperPODs for extremely large models. The NVIDIA DGX H100 comes in an 8U form factor with dual Intel Xeon 8480C processors for 112 total CPU cores. The DGX is not customizable; it is the building block for full-scale AI compute infrastructure. With NVIDIA DGX, LLM training can be scaled with ease: more DGX systems equate to faster training and more robust deployment.
NVIDIA HGX offers great GPU performance in a single system while opening the option for users to customize. HGX platforms are customizable systems offered by select partners (like Exxact) to deliver the performance the customer wants - CPU, memory, storage, networking - while still taking advantage of the same 8x NVIDIA H100 SXM5 system board (with all the NVLink goodies included). These systems can address a data center's specific needs: select your own NICs, your desired CPU core counts, and sometimes additional storage. An NVIDIA HGX is similar to a DGX in compute capability while accommodating your needs for large-scale LLM training.
The NVIDIA H100 PCIe variant is for those working with smaller workloads who want the ultimate flexibility in deciding the number of GPUs in a system. These GPUs still pack a punch: the raw performance numbers are slightly lower, but the ease of slotting them into almost any computing infrastructure makes them compelling. H100 PCIe cards are also offered in smaller form factors such as 1U and 2U, letting data centers run 2x or 4x GPUs in single- or dual-CPU configurations for smaller-scale LLM development. A closer 1:1 CPU-to-GPU ratio is also good for virtualization-heavy inference deployments and a myriad of other applications like analytics.
Choosing the right hardware is predicated on your specific requirements.
Talk to our experienced engineers today!