Nvidia’s HGX-2 brings flexibility to GPU computing


GPU market leader Nvidia holds several GPU Technology Conferences (GTC) annually around the globe. Nearly every show brings a major announcement that pushes the limits of GPU computing and creates more options for customers. For example, at GTC San Jose, the company announced its NVSwitch architecture, which connects up to 16 GPUs over a single fabric, creating one massive, virtual GPU. This week at GTC Taiwan, it announced its HGX-2 server platform, a reference architecture that enables other server manufacturers to build their own systems. The DGX-2 server announced at GTC San Jose is built on the HGX-2 architecture.

Network World’s Marc Ferranti did a great job of covering the specifics of the announcement in this post, including the server partners that will build their own products using the reference architecture. I wanted to drill down a little deeper on the importance of the HGX-2 and the benefits it brings.

HGX-2 gets its horsepower from NVSwitch

In his post, Ferranti mentioned that the HGX-2 leverages the NVSwitch interconnect fabric. NVSwitch is a significant leap forward for GPU computing, and without it, the speeds Nvidia is getting could not be achieved. As fast as PCIe bus speeds have gotten, they are far too slow to feed multiple GPUs. By binding 16 GPUs into a single, virtual GPU, HGX-2 delivers 2 petaflops in a single server.
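The 2-petaflop figure is easy to sanity-check with back-of-the-envelope math: divide the server total by the 16 GPUs joined over the NVSwitch fabric. A quick sketch (the assumption here, not stated in the article, is that the GPUs are Tesla V100s, whose tensor-core peak is rated at 125 teraflops each):

```python
# Sanity-check the 2-petaflop HGX-2 claim, assuming 16 Tesla V100 GPUs
# (125 TFLOPS tensor-core peak each -- background spec, not from the article).
total_flops = 2e15           # 2 petaflops claimed for one HGX-2 server
num_gpus = 16                # GPUs connected over the NVSwitch fabric
per_gpu_tflops = total_flops / num_gpus / 1e12
print(per_gpu_tflops)        # 125.0 -- consistent with the V100 rating
```

The numbers line up, which underscores that the 2-petaflop claim is an aggregate of per-GPU tensor throughput, achievable only because NVSwitch keeps all 16 GPUs fed.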

Server partners have flexibility in platform design using the HGX-2 base

With AI and HPC, architectures vary from data center to data center. HGX-2 is a base building block that lets server ecosystem partners build a full server platform to meet the unique requirements of their customers. As an example, some hyper-scale customers prefer to have PCIe and networking cables in the back of the server, while others prefer them in the front. Servers can be powered via a rack-level power bus bar or an individual power supply in each server. The approach Nvidia is taking lets it do what it does best, which is deliver market-leading performance from GPU subsystems, while allowing the server manufacturers to focus on system-level design, power, cooling and mechanicals. This should lead to faster innovation and new systems being developed to meet the constantly changing needs of the AI and machine-learning industries.
