GTC Nvidia says it has designed an Arm-based general-purpose processor named Grace for training massive neural networks and powering supercomputers, and plans to ship it in 2023.
The news shouldn’t come as too much of a surprise. Nvidia already has, for instance, its Tegra family of Arm-based system-on-chips, which are aimed at embedded electronics and the Internet of Things. It also hinted it wanted to build its own more powerful processors when it put in a $40bn bid to snap up British chip designer Arm last year.
Speaking from his kitchen to kick off Nvidia’s annual GPU Technology Conference, CEO Jensen Huang said on Monday the forthcoming processor was named after Rear Admiral Grace Hopper, the American computer scientist known for, among other things, assisting in the creation of the COBOL programming language.
Though graphics processors can be, and are being, used to accelerate the training of neural networks and their inference operations, these GPUs rely on general-purpose processors to orchestrate the whole effort and direct data between software applications and the hardware accelerators. Thus, these host processors can form a bottleneck for the specialized GPU cores, which is a problem Nvidia wants to solve with Grace. This chip family is designed to work in tandem with Nvidia’s GPUs, and is said to sport a CPU-GPU interconnect speed of 900GB/s using the company’s NVLink technology.
For comparison, an Nvidia spokesperson told us that half an Nvidia DGX A100 system with one CPU connected to four GPUs has a total bandwidth of 64GB/s. “With Grace, the similar four GPU configuration will have four Grace CPUs each connected to a GPU via next generation NVLink at over 900GB/s of bidirectional bandwidth for an aggregate bi-directional bandwidth across four GPUs of nearly 4000 GB/s or nearly 2000 GB/s in one direction. So, four GPUs can access system memory at nearly 30x higher bandwidth,” we’re told.
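The spokesperson's figures can be sanity-checked with a few lines of arithmetic. The sketch below is ours, not Nvidia's: it takes the quoted 900GB/s per link as a floor (the quote says "over 900GB/s"), and it assumes the "nearly 30x" claim compares one-direction bandwidth against the 64GB/s baseline. The variable names are made up for illustration.

```python
# Back-of-the-envelope check of the quoted bandwidth figures (all in GB/s).
# Assumption: 900 is a lower bound, since Nvidia says "over 900GB/s".
BASELINE_HALF_DGX_A100 = 64   # one CPU feeding four GPUs today
NVLINK_BIDIR_PER_GPU = 900    # quoted Grace CPU-to-GPU link, both directions
GPU_COUNT = 4

# Four links, each at the quoted bidirectional rate.
aggregate_bidir = NVLINK_BIDIR_PER_GPU * GPU_COUNT

# Assumption: traffic splits evenly between the two directions.
aggregate_one_way = aggregate_bidir / 2

# Assumption: the "nearly 30x" claim uses the one-direction number.
speedup = aggregate_one_way / BASELINE_HALF_DGX_A100

print(f"aggregate bidirectional: {aggregate_bidir} GB/s")
print(f"aggregate one direction: {aggregate_one_way:.0f} GB/s")
print(f"versus 64GB/s baseline:  {speedup:.1f}x")
```

At exactly 900GB/s per link that works out to 3,600GB/s in aggregate, 1,800GB/s one way, and roughly 28x the baseline, so the "nearly 4000", "nearly 2000", and "nearly 30x" figures only hold if the real per-link rate is somewhat above 900GB/s.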
Grace also has a CPU-CPU interconnect speed of 600GB/s and an LPDDR5x memory bandwidth of 500GB/s, it is said. It will use Arm’s to-be-disclosed next-generation 64-bit Neoverse processor cores, and is aimed at running the “largest AI and high-performance computing workloads,” according to Nv. There really isn’t a lot of technical detail available yet.
“Nvidia’s introduction of the Grace data center CPU illustrates clearly how Arm’s licensing model enables an important invention, one that will further support the incredible work of AI researchers and scientists everywhere,” Arm CEO Simon Segars said in a statement.
Nvidia said it will begin shipping its Grace chips in 2023, and reckons the silicon will be able to train models containing a trillion parameters in mere days. The Swiss National Supercomputing Centre and America’s Los Alamos National Laboratory have put in orders to build Grace-powered AI supercomputers, it was announced.
More to come
Grace isn’t the only chip Nvidia has promised to deliver over the next few years. Here’s what else is said to be coming: the BlueField-3 DPU, its next-generation data processing unit; and the Nvidia Drive Atlan, a system-on-a-chip for autonomous cars.
Nvidia’s BlueField DPUs are designed to accelerate SmartNIC-style workloads, such as software-defined networking, storage operations, and security defenses. Essentially, you offload tasks from host processors to the DPU. BlueField can be connected to Nvidia’s GPUs, too. Huang said his mega-corp’s servers powering its cloud gaming platform GeForce Now use BlueField chippery.
“Modern hyperscale clouds are driving a fundamental new architecture for data centers,” said the chief exec. “A new type of processor, designed to process data center infrastructure software, is needed to offload and accelerate the tremendous compute load of virtualization, networking, storage, security and other cloud-native AI services. The time for BlueField DPU has come.”
The BlueField-3 is essentially a beefy system-on-chip sporting 16 Arm Cortex-A78 CPU cores, totaling 22 billion transistors. It can handle up to 400 gigabits per second of Ethernet and InfiniBand connectivity, we’re told, and talk PCIe 5.0. The chip is expected to be ten times more powerful than its predecessor, and is due to sample in 2022 and ship thereafter. Huang also hinted that a BlueField-4 is coming in 2024 and that it will handle up to 800Gbps of network traffic.
Next up is Atlan. Not much is known about it other than it will deliver about four times the performance of Nvidia’s Orin self-driving car chip that’s due to ship in 2022, apparently. Expected to arrive three years later in 2025, Atlan will combine Nvidia’s next-generation GPU architecture, Arm CPU cores, and BlueField technology onto a single system-on-chip.
Nvidia also announced that its DGX SuperPODs with 20 or more Nvidia DGX A100 systems – that’s at least 160 80GB A100 GPUs – will be available to order this quarter; this equipment has a $60m price tag for a full configuration.
A DGX Station 320G, which is supposed to be more affordable, will be available, too, via a $9,000-a-month subscription or a one-off $149,000 payment.
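For anyone weighing the two payment options, here is a rough break-even sketch using only the two prices quoted above. It is a simplification of ours: it ignores tax, support contracts, financing, and resale value, none of which Nvidia has detailed here.

```python
import math

# Prices quoted for the DGX Station 320G, in US dollars.
SUBSCRIPTION_PER_MONTH = 9_000
ONE_OFF_PRICE = 149_000


def subscription_cost(months: int) -> int:
    """Total paid after renting for the given number of whole months."""
    return SUBSCRIPTION_PER_MONTH * months


# First whole month at which renting has cost at least as much as buying.
break_even_months = math.ceil(ONE_OFF_PRICE / SUBSCRIPTION_PER_MONTH)

print(f"break-even after {break_even_months} months")
print(f"rental paid by then: ${subscription_cost(break_even_months):,}")
```

On those numbers, renting stays cheaper for up to 16 months ($144,000) and overtakes the purchase price in month 17 ($153,000).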
On to software. Nvidia’s Jarvis, a conversational artificial intelligence for voice-automated chat bots and the like, is now available. And Nv teased its cuQuantum SDK that’s supposed to help academics better simulate quantum computers using GPUs. ®