Nvidia unveils Blackwell GPU offering five times the performance for AI inference

Blackwell GPU contains 208 billion transistors and is paired with 192GB of HBM3e memory

Image: Jensen Huang
Nvidia has unveiled its latest B200 Blackwell GPU that, it claims, is twice as powerful as current GPUs for training AI models, and offers five times the performance at inference, the speed with which AI models respond to queries.

The Blackwell architecture contains 208 billion transistors and will be manufactured on TSMC's four nanometre 4NP process node, a refined version of the 4N process used to produce existing Nvidia H100 and Ada Lovelace-architecture GPUs. Each Blackwell GPU integrates two independently manufactured dies, bound together by a 10TB/second chip-to-chip interconnect so that they operate as a single GPU.

The B200 will be paired with 192GB of high-performance HBM3e memory, produced by SK Hynix.

The new technologies were revealed this week at Nvidia's annual GTC conference in San Jose, California.

According to Nvidia, the Blackwell architecture features six "transformative" technologies for what its founder and CEO Jensen Huang describes as "accelerated computing".

"For three decades we've pursued accelerated computing, with the goal of enabling transformative breakthroughs like deep learning and AI," said Huang. "Generative AI is the defining technology of our time."

The company also lifted the lid on its Grace Blackwell GB200 superchip, which combines two Blackwell GPUs with the company's Grace CPU. It will come as part of the Nvidia GB200 NVL72, a multi-node, liquid-cooled rack-scale system intended for data centre AI training and inference. The NVL72 will combine 36 GB200 superchips, interconnected via NVLink to boost performance across the rack.
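The "NVL72" name reflects the GPU count implied by those figures. A back-of-envelope sketch in Python, using only the numbers quoted in this article (the aggregate memory figure is illustrative arithmetic, not a quoted Nvidia specification):

```python
# Back-of-envelope figures for the GB200 NVL72, from numbers quoted
# in this article: 36 GB200 superchips per system, each pairing two
# Blackwell GPUs with a Grace CPU, and 192GB of HBM3e per GPU.
# Illustrative arithmetic only, not an official specification sheet.

superchips = 36          # GB200 superchips per NVL72
gpus_per_superchip = 2   # two Blackwell GPUs per GB200
hbm3e_per_gpu_gb = 192   # HBM3e per Blackwell GPU

total_gpus = superchips * gpus_per_superchip
total_hbm_gb = total_gpus * hbm3e_per_gpu_gb

print(f"GPUs per NVL72: {total_gpus}")        # 72, hence the name
print(f"Aggregate HBM3e: {total_hbm_gb} GB")  # 13824 GB, roughly 13.8TB
```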

According to Nvidia: "The GB200 NVL72 provides up to a 30x performance increase compared to the same number of Nvidia H100 Tensor Core GPUs for LLM [large language model] inference workloads and reduces cost and energy consumption by up to 25 times."

The Grace CPU, meanwhile, comprises 72 Arm Neoverse V2 cores.

Nvidia enjoys an 80 per cent share of the market for AI processors and, as a result of the boom in demand for AI wrought by the success of ChatGPT, has ballooned in value. The company now has a market capitalisation of $2.2 trillion, making it the third most valuable company in the world after Microsoft and Apple.

It achieved revenues of $60.9 billion and a net income of $29.8 billion in its most recent annual results, published just last month – up by 126 per cent and 581 per cent, respectively, on the previous year. These results reflect the surge in demand for data centre processors capable of handling demanding AI applications.

Naturally, Nvidia lined up the biggest names in computing to back the launch of the Blackwell architecture and Grace Blackwell superchip, including Microsoft CEO Satya Nadella, Amazon CEO Andy Jassy, Oracle's Larry Ellison, and Google CEO Sundar Pichai.

However, the most interesting comments were provided by Mark Zuckerberg, founder and CEO of Meta, and Demis Hassabis, co-founder and CEO of Google DeepMind.

"AI already powers everything from our large language models to our content recommendations, ads, and safety systems, and it's only going to get more important in the future. We're looking forward to using Blackwell to help train our open-source Llama models and build the next generation of Meta AI and consumer products," said Zuckerberg.

Hassabis, meanwhile, described the "transformative potential" of AI as "incredible", adding: "Blackwell's breakthrough technological capabilities will provide the critical compute needed to help the world's brightest minds chart new scientific discoveries."

The Blackwell architecture was named in honour of David Harold Blackwell, a mathematician who specialized in game theory and statistics, and who was the first Black scholar inducted into the National Academy of Sciences.