Nvidia CEO Jensen Huang arrives to attend the opening ceremony of Siliconware Precision Industries Co. (SPIL)’s Tan Ke Plant site in Taichung, Taiwan Jan. 16, 2025.
Ann Wang | Reuters
Nvidia announced new chips for building and deploying artificial intelligence models at its annual GTC conference on Tuesday.
CEO Jensen Huang revealed Blackwell Ultra, a family of chips shipping in the second half of this year, as well as Vera Rubin, the company’s next-generation graphics processing unit, or GPU, that is expected to ship in 2026.
Nvidia’s sales are up more than sixfold since its business was transformed by the release of OpenAI’s ChatGPT in late 2022. That’s because its “big GPUs” hold most of the market for developing advanced AI, a process called training.
Software developers and investors are closely watching the company’s new chips to see if they offer enough additional performance and efficiency to convince the company’s biggest end customers — cloud companies including Microsoft, Google and Amazon — to continue spending billions of dollars to build data centers based around Nvidia chips.
“This last year is where almost the entire world got involved. The computational requirement, the scaling law of AI, is more resilient, and in fact, is hyper-accelerated,” Huang said.
Tuesday’s announcements are also a test of Nvidia’s new annual release cadence. The company is striving to announce new chip families every year; before the AI boom, it released new chip architectures every other year.
The GTC conference in San Jose, California, is also a show of strength for Nvidia.
The event, Nvidia’s second in-person conference since the pandemic, is expected to have 25,000 attendees and hundreds of companies discussing the ways they use the company’s hardware for AI. That includes Waymo, Microsoft and Ford, among others. General Motors also announced that it will use Nvidia’s service for its next-generation vehicles.
The chip architecture after Rubin will be named after physicist Richard Feynman, Nvidia said on Tuesday, continuing its tradition of naming chip families after scientists. Nvidia’s Feynman chips are expected to be available in 2028, according to a slide displayed by Huang.
Nvidia will also showcase its other products and services at the event.
For example, Nvidia announced new laptops and desktops using its chips, including two AI-focused PCs called DGX Spark and DGX Station that will be able to run large AI models such as Llama or DeepSeek. The company also announced updates to its networking parts for tying hundreds or thousands of GPUs together so they work as one, as well as a software package called Dynamo that helps users get the most out of their chips.
Jensen Huang, co-founder and chief executive officer of Nvidia Corp., speaks during the Nvidia GPU Technology Conference (GTC) in San Jose, California, US, on Tuesday, March 18, 2025.
David Paul Morris | Bloomberg | Getty Images
Vera Rubin
Nvidia expects to start shipping systems on its next-generation GPU family in the second half of 2026.
The system has two main components: a CPU, called Vera, and a new GPU design, called Rubin. It’s named after astronomer Vera Rubin.
Vera is Nvidia’s first custom CPU design, the company said, and it’s based on a core design it has named Olympus.
Previously when it needed CPUs, Nvidia used an off-the-shelf design from Arm. Companies that have developed custom Arm core designs, such as Qualcomm and Apple, say that they can be more tailored and unlock better performance.
The custom Vera design will be twice as fast as the CPU used in last year’s Grace Blackwell chips, the company said.
When paired with Vera, Rubin can manage 50 petaflops while doing inference, more than double the 20 petaflops for the company’s current Blackwell chips. Rubin can also support as much as 288 gigabytes of fast memory, which is one of the core specs that AI developers watch.
Nvidia is also making a change to what it calls a GPU. Rubin is actually two GPUs, Nvidia said.
The Blackwell GPU, which is currently on the market, is actually two separate dies assembled and made to work as one chip.
Starting with Rubin, Nvidia will say that when it combines two or more dies to make a single chip, it will refer to them as separate GPUs. In the second half of 2027, Nvidia plans to release a “Rubin Next” chip that combines four dies to make a single chip, doubling the speed of Rubin, and it will refer to that as four GPUs.
Nvidia said that will come in a rack called Vera Rubin NVL144. Previous versions of Nvidia’s rack were called NVL72.
Blackwell Ultra
Nvidia also announced new versions of its Blackwell family of chips that it calls Blackwell Ultra.
The chip will be able to produce more tokens per second, meaning it can generate more content in the same amount of time as its predecessor, the company said in a briefing.
Nvidia says that means that cloud providers can use Blackwell Ultra to offer a premium AI service for time-sensitive applications, allowing them to make as much as 50 times the revenue from the new chips as the Hopper generation, which shipped in 2023.
Blackwell Ultra will come in a version called GB300, which pairs two GPUs with an Nvidia Arm CPU, and a GPU-only version called B300. It will also come in versions with eight GPUs in a single server blade and in a rack version with 72 Blackwell chips.
The top four cloud companies have deployed three times the number of Blackwell chips as Hopper chips, Nvidia said.
DeepSeek
China’s DeepSeek R1 model may have scared Nvidia investors when it was released in January, but Nvidia has embraced the software. The chipmaker will use the model to benchmark several of its new products.
Many AI observers said that DeepSeek’s model, which reportedly required fewer chips than models made in the U.S., threatened Nvidia’s business.
But Huang said earlier this year that DeepSeek was actually a good sign for Nvidia. That’s because DeepSeek uses a process called “reasoning,” which requires more computing power to provide users better answers.
The new Blackwell Ultra chips are better for reasoning models, Nvidia said.
Nvidia has designed its chips to do inference more efficiently, so when new reasoning models require more computing power at the time of deployment, its chips will be able to handle it.
“In the last 2 to 3 years, a major breakthrough happened, a fundamental advance in artificial intelligence happened. We call it agentic AI,” Huang said. “It can reason about how to answer or how to solve a problem.”
Elon Musk on Monday said he does not support a merger between xAI and Tesla, as questions swirl over the future relationship of the electric automaker and artificial intelligence company.
X account @BullStreetBets_ posted an open question to Tesla investors on the social media site asking if they support a merger between Tesla and xAI. Musk responded with “No.”
The statement comes as the tech billionaire contemplates the future relationship between his multiple businesses.
Overnight, Musk suggested that Tesla will hold a shareholder vote at an unspecified time on whether the automaker should invest in xAI, the billionaire’s company that develops the controversial Grok AI chatbot.
Last year, Musk asked his followers in a poll on social media platform X whether Tesla should invest $5 billion into xAI. The majority voted “yes” at the time.
Musk has looked to bring his various businesses closer together. In March, Musk merged xAI and X together in a deal that valued the artificial intelligence company at $80 billion and the social media company at $33 billion.
Musk also said last week that xAI’s chatbot Grok will be available in Tesla vehicles. The chatbot has come under criticism recently, after praising Adolf Hitler and posting a barrage of antisemitic comments.
— CNBC’s Samantha Subin contributed to this report.
Coincidentally, OpenAI CEO Sam Altman announced early Saturday that the company would yet again indefinitely delay its first open-source model, citing safety concerns. OpenAI did not immediately respond to a CNBC request for comment on Kimi K2.
In its release announcement on social media platforms X and GitHub, Moonshot claimed Kimi K2 surpassed Claude Opus 4 on two benchmarks, and had better overall performance than OpenAI’s coding-focused GPT-4.1 model, based on several industry metrics.
“No doubt [Kimi K2 is] a globally competitive model, and it’s open sourced,” Wei Sun, principal analyst in artificial intelligence at Counterpoint, said in an email Monday.
Cheaper option
“On top of that, it has lower token costs, making it attractive for large-scale or budget-sensitive deployments,” she said.
The new K2 model is available for free via Kimi’s app and browser interface, unlike ChatGPT or Claude, which charge monthly subscriptions for their latest AI models.
Kimi is also only charging 15 cents for every 1 million input tokens, and $2.50 per 1 million output tokens, according to its website. Tokens are a way of measuring data for AI model processing.
In contrast, Claude Opus 4 charges 100 times more for input — $15 per million tokens — and 30 times more for output — $75 per million tokens. Meanwhile, for every one million tokens, GPT-4.1 charges $2 for input and $8 for output.
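To make the per-token rates above concrete, here is a rough cost sketch using the published per-million-token prices from the figures cited in this article; the workload size (2 million input tokens, 500,000 output tokens) is a hypothetical example, not a figure from any vendor:

```python
# USD per 1 million tokens (input rate, output rate), from the rates quoted above.
PRICES = {
    "Kimi K2": (0.15, 2.50),
    "Claude Opus 4": (15.00, 75.00),
    "GPT-4.1": (2.00, 8.00),
}

def request_cost(model, input_tokens, output_tokens):
    """Return the cost in USD of processing the given token counts on a model."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

# Hypothetical workload: 2M input tokens and 500k output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 2_000_000, 500_000):.2f}")
# Kimi K2: $1.55, Claude Opus 4: $67.50, GPT-4.1: $8.00
```

On that example workload, the gap the article describes is stark: the same job that costs $1.55 on Kimi K2 costs roughly 43 times more on Claude Opus 4.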
Moonshot AI said on GitHub that developers can use K2 however they wish, with the only requirement that they display “Kimi K2” on the user interface if the commercial product or service has more than 100 million monthly active users, or makes the equivalent of $20 million in monthly revenue.
Hot AI market
Initial reviews of K2 on both English and Chinese social media have largely been positive, although there are some reports of hallucinations, a prevalent issue in generative AI, in which the models make up information.
Still, K2 is “the first model I feel comfortable using in production since Claude 3.5 Sonnet,” Pietro Schirano, founder of startup MagicPath that offers AI tools for design, said in a post on X.
Moonshot has open-sourced some of its prior AI models. The company’s chatbot surged in popularity early last year as China’s alternative to ChatGPT, which isn’t officially available in the country. But similar chatbots from ByteDance and Tencent have since crowded the market, while tech giant Baidu has revamped its core search engine with AI tools.
Kimi’s latest AI release comes as investors eye Chinese alternatives to U.S. tech in the global AI competition.
Still, despite the excitement about DeepSeek, the privately held company has yet to announce a major upgrade to its R1 and V3 models. Meanwhile, Manus AI, a Chinese startup that emerged earlier this year as another DeepSeek-style upstart, has relocated its headquarters to Singapore.
Over in the U.S., OpenAI also has yet to reveal GPT-5.
Work on GPT-5 may be taking up engineering resources, preventing OpenAI from progressing on its open-source model, Counterpoint’s Sun said, adding that it’s challenging to release a powerful open-source model without undermining the competitive advantage of a proprietary model.
“Kimi-Researcher represents a paradigm shift in agentic AI,” said Winston Ma, adjunct professor at NYU School of Law. He was referring to AI’s capability of simultaneously making several decisions on its own to complete a complex task.
“Instead of merely generating fluent responses, it demonstrates autonomous reasoning at an expert level — the kind of complex cognitive work previously missing from LLMs,” Ma said. He is also author of “The Digital War: How China’s Tech Power Shapes the Future of AI, Blockchain and Cyberspace.”
Co-founder and chief executive officer of Nvidia Corp., Jensen Huang attends the 9th edition of the VivaTech trade show in Paris on June 11, 2025.
Chesnot | Getty Images Entertainment | Getty Images
Nvidia CEO Jensen Huang has downplayed U.S. fears that his firm’s chips will aid the Chinese military, days ahead of another trip to the country as he attempts to walk a tightrope between Washington and Beijing.
In an interview with CNN aired Sunday, Huang said “we don’t have to worry about” China’s military using U.S.-made technology because “they simply can’t rely on it.”
“It could be limited at any time; not to mention, there’s plenty of computing capacity in China already,” Huang said. “They don’t need Nvidia’s chips, certainly, or American tech stacks in order to build their military,” he added.
The comments were made in reference to years of bipartisan U.S. policy that placed restrictions on semiconductor companies, prohibiting them from selling their most advanced artificial intelligence chips to clients in China.
Huang also repeated past criticisms of the policies, arguing that the tactic of export controls has been counterproductive to the ultimate goal of U.S. tech leadership.
“We want the American tech stack to be the global standard … in order for us to do that, we have to be in search of all the AI developers in the world,” Huang said, adding that half of the world’s AI developers are in China.
That means for America to be an AI leader, U.S. technology has to be available to all markets, including China, he added.
Washington’s latest restrictions on Nvidia’s sales to China were implemented in April and are expected to result in billions in losses for the company. In May, Huang said chip restrictions had already cut Nvidia’s China market share nearly in half.
Last week, the Nvidia CEO met with U.S. President Donald Trump, and was warned by U.S. lawmakers not to meet with companies connected to China’s military or intelligence bodies, or entities named on America’s restricted export list.
According to Daniel Newman, CEO of tech advisory firm The Futurum Group, Huang’s CNN interview exemplifies how he has been threading a needle between Washington and Beijing as he tries to maintain maximum market access.
“He needs to walk a proverbial tightrope to make sure that he doesn’t rattle the Trump administration,” Newman said, adding that he also wants to be in a position for China to invest in Nvidia technology if and when the policy provides a better climate to do so.
But that’s not to say that his downplaying of Washington’s concerns is valid, according to Newman. “I think it’s hard to completely accept the idea that China couldn’t use Nvidia’s most advanced technologies for military use.”
He added that he would expect Nvidia’s technology to be at the core of any country’s AI training, including for use in the development of advanced weaponry.
A U.S. official told Reuters last month that China’s large language model startup DeepSeek — which says it used Nvidia chips to train its models — was supporting China’s military and intelligence operations.
On Sunday, Huang acknowledged there were concerns about DeepSeek’s open-source R1 reasoning model being trained in China but said that there was no evidence that it presents dangers for that reason alone.
Huang complimented the R1 reasoning model, calling it “revolutionary,” and said its open-source nature has empowered startup companies, new industries, and countries to be able to engage in AI.
“The fact of the matter is, [China and the U.S.] are competitors, but we are highly interdependent, and to the extent that we can compete and both aspire to win, it is fine to respect our competitors,” he concluded.