On Monday, March 18, 2024, Nvidia unveiled a new generation of artificial intelligence (AI) chips and accompanying software for running AI models. The announcement, made at Nvidia’s developer conference in San Jose, underscores the company’s ongoing effort to establish itself as the go-to supplier for AI enterprises.
Since the emergence of the AI boom in late 2022, catalyzed in part by OpenAI’s ChatGPT, Nvidia has experienced remarkable growth. The company’s share price has surged five-fold, while total sales have more than tripled. Nvidia’s high-end server GPUs play a vital role in both training and deploying large-scale AI models, making them indispensable for companies operating in the AI space. Notably, tech giants like Microsoft and Meta have invested billions of dollars in acquiring Nvidia’s chips.
Nvidia has unveiled its latest generation of AI graphics processors, dubbed “Blackwell.” Leading the new lineup is the GB200 chip, set to ship later this year. With Blackwell, Nvidia is betting that more powerful chips will spark a fresh wave of orders, even as demand for the current-generation “Hopper” H100 and similar chips remains high among companies and software developers racing to secure them.
Nvidia updates its GPU architecture roughly every two years, delivering a significant performance leap with each generation. Many of the AI models introduced in the past year were trained on the company’s Hopper architecture, most notably on the H100, which was announced in 2022.
According to Nvidia, processors based on the Blackwell architecture, such as the GB200, offer a substantial performance boost for AI enterprises: 20 petaflops of AI performance, versus 4 petaflops for the H100. The added processing power lets AI companies train larger and more complex models.
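To put those figures in rough perspective, here is a minimal back-of-envelope sketch. It uses the common ~6 × parameters × tokens approximation for transformer training compute; the model size, token count, cluster size, utilization rate, and the assumption that the two quoted peak figures are directly comparable are all illustrative assumptions, not Nvidia numbers.

```python
# Back-of-envelope: how long might a hypothetical training run take at each
# chip's quoted peak AI throughput? Every input below is an illustrative
# assumption, not a figure from Nvidia.

params = 70e9        # hypothetical model size: 70 billion parameters
tokens = 2e12        # hypothetical training set: 2 trillion tokens
flops_needed = 6 * params * tokens   # common ~6*N*D rule of thumb for transformers

h100_peak = 4e15     # 4 petaflops, the H100 figure quoted above
b200_peak = 20e15    # 20 petaflops, the Blackwell figure quoted above
utilization = 0.35   # assumed fraction of peak actually sustained
num_gpus = 1024      # assumed cluster size

def training_days(peak_flops_per_gpu):
    sustained = peak_flops_per_gpu * utilization * num_gpus
    return flops_needed / sustained / 86_400

print(f"H100-class cluster: ~{training_days(h100_peak):.0f} days")
print(f"Blackwell-class cluster: ~{training_days(b200_peak):.0f} days")
# Note: the two peak figures may be quoted at different number formats, so the
# real-world speedup will not match this simple ratio exactly.
```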
A standout feature of the Blackwell chip is its dedicated “transformer engine,” built specifically to accelerate transformer-based AI, the core architecture behind systems such as ChatGPT.
Physically, the Blackwell GPU is notably large: it combines two separately manufactured dies into a single chip produced by TSMC. Nvidia will offer the chip as part of a complete server solution called the GB200 NVL72, which links 72 Blackwell GPUs with other Nvidia components designed for training AI models.
Tech giants including Amazon, Google, Microsoft, and Oracle are set to offer access to the GB200 through their cloud services. The chip pairs two B200 Blackwell GPUs with an Arm-based Grace CPU. Nvidia also disclosed that Amazon Web Services plans to build a server cluster with 20,000 GB200 chips.
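For a rough sense of that cluster’s scale, the sketch below converts the reported chip count into GPU totals and a nominal aggregate peak. Whether the quoted 20-petaflop figure applies per Blackwell GPU is an assumption here, as is treating peak numbers as meaningful in aggregate.

```python
# Rough scale of the reported AWS cluster. How the quoted per-chip peak maps
# onto a GB200 (two B200 GPUs plus one Grace CPU) is an assumption.

gb200_chips = 20_000          # reported cluster size
gpus_per_gb200 = 2            # each GB200 pairs two B200 GPUs with a Grace CPU
peak_pflops_per_gpu = 20      # quoted Blackwell AI peak, assumed to be per GPU

total_gpus = gb200_chips * gpus_per_gb200
aggregate_exaflops = total_gpus * peak_pflops_per_gpu / 1_000

print(f"{total_gpus:,} Blackwell GPUs")                           # 40,000 GPUs
print(f"~{aggregate_exaflops:,.0f} exaflops of peak AI compute")  # nominal, not sustained
```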
According to Nvidia, this system can deploy a model with 27 trillion parameters, far larger than even the biggest existing models such as GPT-4, which reportedly has around 1.7 trillion parameters. Many AI researchers believe that models with more parameters, trained on more data, could unlock new capabilities.
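A simple memory estimate helps explain why a parameter count that large calls for systems of this size. The sketch below is a back-of-envelope calculation; the bytes-per-parameter figure and the per-GPU memory capacity are assumptions chosen for illustration, not Nvidia specifications.

```python
# Back-of-envelope: memory needed just to hold the weights of a
# 27-trillion-parameter model. Precision and per-GPU memory are assumptions.

params = 27e12          # 27 trillion parameters, the figure quoted above
bytes_per_param = 2     # assume 16-bit weights
weight_bytes = params * bytes_per_param

gpu_memory_gb = 192     # assumed high-bandwidth memory per GPU
gpus_for_weights = weight_bytes / (gpu_memory_gb * 1e9)

print(f"Weights alone: ~{weight_bytes / 1e12:.0f} TB")            # ~54 TB
print(f"GPUs needed just to hold them: ~{gpus_for_weights:.0f}")  # ~281, before counting
# activations, optimizer state, or KV caches, which push the total far higher
```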
Nvidia has not disclosed pricing for the new GB200 or the systems it will be integrated into. For comparison, analysts estimate that Nvidia’s Hopper-based H100 typically sells for between $25,000 and $40,000 per chip, and that complete systems built around it cost as much as $200,000.
Nvidia also introduced NIM, short for Nvidia Inference Microservice, a new offering included in its enterprise software subscription.
NIM makes it easier to use older Nvidia GPUs for inference, that is, for running AI models rather than training them. This lets companies put the hundreds of millions of Nvidia GPUs they already own to work on ongoing AI operations, since inference requires far less computational power than training a new model. It also allows companies to run their own AI models instead of buying AI as a service from providers like OpenAI.
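As a loose illustration of what running your own model behind such a service might look like, here is a minimal sketch that sends a chat request to a self-hosted inference endpoint over an OpenAI-style HTTP API. The URL, port, and model name are hypothetical placeholders, not documented NIM values.

```python
# Minimal sketch: querying a self-hosted inference microservice through an
# OpenAI-style chat-completions API. The endpoint URL, port, and model name
# are hypothetical placeholders, not documented NIM values.
import requests

ENDPOINT = "http://localhost:8000/v1/chat/completions"  # assumed local deployment

payload = {
    "model": "example-llm",  # placeholder model identifier
    "messages": [
        {"role": "user", "content": "Summarize this quarter's sales figures in three bullets."}
    ],
    "max_tokens": 256,
}

response = requests.post(ENDPOINT, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```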
The overarching strategy is to encourage customers who invest in Nvidia-based servers to also subscribe to Nvidia AI Enterprise, licensed at $4,500 per GPU per year.
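To make the subscription math concrete, here is a quick sketch of annual license costs at that rate; the fleet sizes are arbitrary examples, not figures from Nvidia.

```python
# Annual license cost at the quoted $4,500 per GPU per year, for a few
# arbitrary example fleet sizes.
PRICE_PER_GPU_PER_YEAR = 4_500

for gpu_count in (8, 72, 1_024):   # one server, one 72-GPU system, a larger cluster
    annual_cost = gpu_count * PRICE_PER_GPU_PER_YEAR
    print(f"{gpu_count:>5} GPUs -> ${annual_cost:,} per year")
```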
Nvidia will work with leading AI companies such as Microsoft and Hugging Face to ensure their AI models run on all compatible Nvidia chips. Developers can then use NIM to deploy those models on their own servers or on Nvidia’s cloud servers without a lengthy setup process and without sacrificing performance.