NVIDIA announced another record-breaking quarter on Wednesday. Revenue surged to $39.3 billion, a 12% increase from the previous quarter and a 78% rise from the previous year.
“Demand for Blackwell is amazing as reasoning AI adds another scaling law—increasing compute for training makes models smarter and increasing compute for long thinking makes the answer smarter,” said Jensen Huang, founder and CEO of NVIDIA.
He said the world is at a nascent stage of reasoning AI and inference-time scaling, with multimodal AI, enterprise AI, sovereign AI, and physical AI right around the corner. “We will grow strongly in 2025,” said Huang.
He further noted that AI has accomplished in two years what earlier technologies took decades to achieve, highlighting the greater potential of the AI ecosystem.
“No technology has ever had the opportunity to address a larger part of the world’s GDP than AI. No software tool ever has. And so, this is now a software tool that can address a much larger part of the world’s GDP more than any time in history,” Huang said.
The company’s CFO, Colette M. Kress, stated that it generated $11 billion in Blackwell revenue during the quarter to meet increasing demand, marking the fastest product ramp in its history.
She added that at the upcoming GTC event, which is to be held between March 17 and March 21, the company will discuss Blackwell Ultra, Vera Rubin, and new computing and networking products.
Kress explained that with Blackwell, clusters of 100,000 GPUs or more will become common. “Shipments have already started for multiple infrastructures of this size.”
Last year, Microsoft became the first company to launch the Azure ND GB200 V6 VM series based on the NVIDIA GB200 Grace Blackwell Superchip, which features NVIDIA Grace CPUs and NVIDIA Blackwell GPUs.
Most recently, Google Cloud announced a preview of A4 VMs powered by NVIDIA HGX B200, bringing Blackwell GPUs to its platform. Oracle also hosts Blackwell GPUs on its Zettascale cloud computing clusters.
DeepSeek Couldn’t Shake NVIDIA Yet
The launch of DeepSeek’s latest model, R1, which the company claims was trained on a $6 million budget, triggered a sharp market reaction. NVIDIA’s stock tumbled 17%, wiping out nearly $600 billion in value, driven by concerns that such training efficiency could dampen demand for its GPUs.
The model was launched in January, and its impact on compute demand may not be evident in the current Q4 results.
However, Huang said the company’s inference demand is accelerating, fuelled by test-time scaling and new reasoning models. “Models like OpenAI’s, Grok 3, and DeepSeek R1 are reasoning models that apply inference-time scaling. Reasoning models can consume 100 times more compute,” he said.
“DeepSeek-R1 has ignited global enthusiasm. It’s an excellent innovation. But even more importantly, it has open-sourced a world-class reasoning AI model.”
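Huang’s “100 times more compute” remark follows from a simple observation: inference cost scales roughly with the number of tokens generated, and reasoning models emit long chains of thought before answering. A minimal back-of-the-envelope sketch, using purely hypothetical token counts and pricing (not any vendor’s actual figures):

```python
def inference_cost(tokens: int, price_per_million_tokens: float) -> float:
    """Dollar cost of generating `tokens` output tokens at a flat rate."""
    return tokens / 1_000_000 * price_per_million_tokens

# Illustrative assumptions only: $10 per million output tokens,
# a short direct answer vs. a long chain-of-thought answer.
PRICE = 10.0
direct = inference_cost(500, PRICE)        # standard model: brief answer
reasoning = inference_cost(50_000, PRICE)  # reasoning model: 100x the tokens

# Same per-token price, 100x the tokens -> roughly 100x the compute cost.
assert abs(reasoning / direct - 100.0) < 1e-6
```

This is why inference-time scaling is bullish for compute vendors: each query from a reasoning model does far more work than a conventional one.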
Experts speculate that the Chinese company may not be revealing the whole truth. During an interview, the CEO of Scale AI, Alexandr Wang, said that he believed DeepSeek possessed around 50,000 NVIDIA H100s, which the company wasn’t permitted to talk about.
Notably, Elon Musk’s xAI used 200,000 GPUs to train Grok 3. According to reports, tech giant Meta Platforms is discussing constructing a new data centre campus for its AI projects, with potential costs exceeding $200 billion. Also, the US government recently announced Project Stargate, a $500 billion AI infrastructure initiative backed by tech titans like Oracle, SoftBank, and OpenAI.
These developments indicate a growing need for more compute in the future. Apple, too, recently announced a $500 billion investment in the United States over the next four years to build AI infrastructure. Although Apple does not use NVIDIA GPUs, its investment still reflects the broader direction of the industry.
Inference is Tough
NVIDIA will face increased competition from inference players like Groq, Cerebras, and SambaNova. Perplexity AI recently announced that its in-house LLM, Sonar, built on Llama 3.3 70B, now runs on Cerebras’ inference infrastructure.
French AI startup Mistral recently launched the Le Chat app for iOS and Android. According to the company, Le Chat is 10 times faster than GPT-4o, Claude 3.5 Sonnet, and DeepSeek R1, thanks to Cerebras’ inference technology.
Similarly, in a recent interview, Groq founder Jonathan Ross said that NVIDIA dominates AI model training, and Groq sees no reason to compete in that space. Instead, the company focuses on faster and cheaper inference.
“They don’t offer fast tokens, and they don’t offer low-cost tokens. It’s a very different product. But what they do very, very well is training. They do it better than anyone else,” said Ross. He added that Groq’s chips cost less than one-fifth as much as NVIDIA’s.
Ross argued that raw specs like teraflops are meaningless—what truly matters is tokens per dollar (cost efficiency) and tokens per watt (energy efficiency). Microsoft CEO Satya Nadella recently echoed a similar sentiment.
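The metrics Ross describes are straightforward to compute. A hedged sketch, with two entirely hypothetical accelerators (the throughput, pricing, and power figures are placeholders, not real benchmarks) showing how a chip with lower raw throughput can still win on both efficiency measures:

```python
def tokens_per_dollar(tokens_per_second: float, cost_per_hour: float) -> float:
    """Tokens generated per dollar of accelerator rental time."""
    return tokens_per_second * 3600 / cost_per_hour

def tokens_per_watt(tokens_per_second: float, power_watts: float) -> float:
    """Tokens generated per second, per watt of power drawn."""
    return tokens_per_second / power_watts

# Hypothetical accelerators: A has higher raw throughput, but B is
# cheaper and draws less power -- the trade-off Ross highlights.
chip_a = {"tps": 1000.0, "usd_per_hr": 8.0, "watts": 700.0}
chip_b = {"tps": 600.0, "usd_per_hr": 2.0, "watts": 300.0}

# B beats A on both efficiency metrics despite lower raw speed.
assert tokens_per_dollar(chip_b["tps"], chip_b["usd_per_hr"]) > \
       tokens_per_dollar(chip_a["tps"], chip_a["usd_per_hr"])
assert tokens_per_watt(chip_b["tps"], chip_b["watts"]) > \
       tokens_per_watt(chip_a["tps"], chip_a["watts"])
```

Under these assumed numbers, chip B delivers 1,080,000 tokens per dollar against chip A’s 450,000, which is the kind of gap that matters to inference buyers regardless of peak FLOPS.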
Yet, Microsoft has been sending mixed signals. A recent report revealed that the tech giant cancelled leases for significant data centre capacity in the US, raising concerns about the long-term sustainability of AI infrastructure investments.
What About China?
Following DeepSeek’s success, demand for NVIDIA GPUs has surged across China. A recent report states that Chinese companies are ramping up orders for NVIDIA’s H20 chip to support the growing demand for DeepSeek’s low-cost models.
During the earnings call, Huang said that China’s contribution to NVIDIA’s revenue has remained stable as a percentage of total revenue, consistent with recent quarters.
However, he acknowledged that China’s share has dropped to half of what it was before US export controls limited NVIDIA’s ability to sell high-end AI chips to Chinese companies. After the US imposed new export restrictions in October 2023, NVIDIA introduced the H20 as its main legally permitted chip for the Chinese market.