Software that can write passages of text or draw pictures that look like a human created them has kicked off a gold rush in the technology industry.
Companies like Microsoft and Google are fighting to integrate cutting-edge AI into their search engines, as billion-dollar competitors such as OpenAI and Stability AI race ahead and release their software to the public.
Powering many of these applications is a roughly $10,000 chip that’s become one of the most critical tools in the artificial intelligence industry: the Nvidia A100.
The A100 has become the “workhorse” for artificial intelligence professionals at the moment, said Nathan Benaich, an investor who publishes a newsletter and report covering the AI industry, including a partial list of supercomputers using A100s. Nvidia takes 95% of the market for graphics processors that can be used for machine learning, according to New Street Research.
The A100 is ideally suited for the kind of machine learning models that power tools like ChatGPT, Bing AI, or Stable Diffusion. It’s able to perform many simple calculations simultaneously, which is important for training and using neural network models.
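As a rough illustration of what “many simple calculations simultaneously” means in practice, consider a minimal sketch in Python using PyTorch (the framework is an assumed choice for illustration; none of these companies’ actual code is being quoted). A single large matrix multiply, the basic operation inside neural networks, decomposes into billions of independent multiply-adds that a GPU can spread across thousands of cores at once:

```python
# Minimal sketch: one big matrix multiply, the core operation of neural
# network training and inference. PyTorch is an assumed choice here.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Two 4096x4096 matrices: multiplying them takes roughly 69 billion
# multiply-add operations, all independent of one another.
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

c = a @ b  # a GPU executes these simple calculations in parallel
```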
The technology behind the A100 was initially used to render sophisticated 3D graphics in games. It’s often called a graphics processor, or GPU, but these days Nvidia’s A100 is configured and targeted at machine learning tasks and runs in data centers, not inside glowing gaming PCs.
Big companies or startups working on software like chatbots and image generators require hundreds or thousands of Nvidia’s chips, and either purchase them on their own or secure access to the computers from a cloud provider.
Hundreds of GPUs are required to train artificial intelligence models like large language models. The chips need to be powerful enough to crunch terabytes of data quickly to recognize patterns. After that, GPUs like the A100 are also needed for “inference,” or using the model to generate text, make predictions, or identify objects inside photos.
This means that AI companies need access to a lot of A100s. Some entrepreneurs in the space even see the number of A100s they have access to as a sign of progress.
“A year ago we had 32 A100s,” Stability AI CEO Emad Mostaque wrote on Twitter in January. “Dream big and stack moar GPUs kids. Brrr.” Stability AI is the company that helped develop Stable Diffusion, an image generator that drew attention last fall, and reportedly has a valuation of over $1 billion.
Now, Stability AI has access to over 5,400 A100 GPUs, according to one estimate from the State of AI report, which tracks which companies and universities have the largest collections of A100 GPUs, although it doesn’t include cloud providers, which don’t publish their numbers publicly.
Nvidia’s riding the AI train
Nvidia stands to benefit from the AI hype cycle. In the fiscal fourth-quarter results it reported Wednesday, overall sales declined 21%, yet investors pushed the stock up about 14% on Thursday, mainly because the company’s AI chip business, reported as data centers, continued to grow, rising 11% to more than $3.6 billion in sales for the quarter.
Nvidia shares are up 65% so far in 2023, outpacing the S&P 500 and other semiconductor stocks alike.
Nvidia CEO Jensen Huang couldn’t stop talking about AI on a call with analysts on Wednesday, suggesting that the recent boom in artificial intelligence is at the center of the company’s strategy.
“The activity around the AI infrastructure that we built, and the activity around inferencing using Hopper and Ampere to inference large language models, has just gone through the roof in the last 60 days,” Huang said. “There’s no question that whatever our views are of this year as we enter the year has been fairly dramatically changed as a result of the last 60, 90 days.”
Ampere is Nvidia’s code name for the A100 generation of chips. Hopper is the code name for the new generation, including the H100, which recently started shipping.
More computers needed
Unlike other kinds of software, such as serving a webpage, which uses processing power in occasional bursts lasting microseconds, machine learning tasks can take up a computer’s entire processing power, sometimes for hours or days.
This means companies that find themselves with a hit AI product often need to acquire more GPUs to handle peak periods or improve their models.
These GPUs aren’t cheap. In addition to a single A100 on a card that can be slotted into an existing server, many data centers use a system that includes eight A100 GPUs working together.
This system, Nvidia’s DGX A100, has a suggested price of nearly $200,000, although that price includes the chips needed. On Wednesday, Nvidia said it would sell cloud access to DGX systems directly, which will likely reduce the entry cost for tinkerers and researchers.
It’s easy to see how the cost of A100s can add up.
For example, an estimate from New Street Research found that the OpenAI-based ChatGPT model inside Bing’s search could require eight GPUs to deliver a response to a question in less than one second.
At that rate, Microsoft would need over 20,000 8-GPU servers just to deploy the model in Bing to everyone, suggesting Microsoft’s feature could cost $4 billion in infrastructure spending.
“If you’re from Microsoft, and you want to scale that, at the scale of Bing, that’s maybe $4 billion. If you want to scale at the scale of Google, which serves 8 or 9 billion queries every day, you actually need to spend $80 billion on DGXs,” said Antoine Chkaiban, a technology analyst at New Street Research. “The numbers we came up with are huge. But they’re simply the reflection of the fact that every single user talking to such a large language model requires a massive supercomputer while they’re using it.”
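The arithmetic behind those headline numbers is easy to reproduce. Here is a back-of-the-envelope sketch using the DGX A100’s roughly $200,000 price and the server count above; the 20x Google-to-Bing query ratio is an assumption implied by the quote, not a figure from the report:

```python
# Back-of-the-envelope reconstruction of New Street Research's estimate.
# Inputs come from the figures quoted above; this is illustrative only.

DGX_PRICE = 200_000        # approximate price of one 8-GPU DGX A100, in dollars
SERVERS_FOR_BING = 20_000  # 8-GPU servers needed to serve Bing's users

bing_capex = SERVERS_FOR_BING * DGX_PRICE
print(f"Bing-scale deployment: ${bing_capex / 1e9:.0f} billion")      # ~$4 billion

# Google serves roughly 20x Bing's query volume per the quote, hence $80 billion.
print(f"Google-scale deployment: ${20 * bing_capex / 1e9:.0f} billion")
```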
The latest version of Stable Diffusion, an image generator, was trained on 256 A100 GPUs, or 32 machines with eight A100s each, for a total of 200,000 compute hours, according to information posted online by Stability AI.
At the market price, training the model alone cost $600,000, Stability AI CEO Mostaque said on Twitter, suggesting in a tweet exchange that the price was unusually low compared with rivals’. That doesn’t count the cost of “inference,” or deploying the model.
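Those two published figures are enough to reconstruct the quoted training cost. In the sketch below, the hourly rate is an assumption: roughly $3 per A100-hour is what the disclosed numbers imply, not a price Stability AI has confirmed:

```python
# Reconstructing Stability AI's quoted training cost from its published numbers.
GPU_HOURS = 200_000       # total A100 compute hours reported by Stability AI
DOLLARS_PER_GPU_HOUR = 3  # assumed market rate implied by the $600,000 total

print(f"Estimated training cost: ${GPU_HOURS * DOLLARS_PER_GPU_HOUR:,}")  # $600,000
```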
Huang, Nvidia’s CEO, said in an interview with CNBC’s Katie Tarasov that the company’s products are actually inexpensive for the amount of computation that these kinds of models need.
“We took what otherwise would be a $1 billion data center running CPUs, and we shrunk it down into a data center of $100 million,” Huang said. “Now, $100 million, when you put that in the cloud and shared by 100 companies, is almost nothing.”
Huang said that Nvidia’s GPUs allow startups to train models for a much lower cost than if they used a traditional computer processor.
“Now you could build something like a large language model, like a GPT, for something like $10, $20 million,” Huang said. “That’s really, really affordable.”
New competition
Nvidia isn’t the only company making GPUs for artificial intelligence uses. AMD and Intel have competing graphics processors, and big cloud companies like Google and Amazon are developing and deploying their own chips specially designed for AI workloads.
Still, “AI hardware remains strongly consolidated to NVIDIA,” according to the State of AI compute report. As of December, more than 21,000 open-source AI papers said they used Nvidia chips.
Most researchers included in the State of AI Compute Index used the V100, Nvidia’s chip that came out in 2017, but the A100 grew quickly in 2022 to become the third-most-used Nvidia chip, just behind a consumer graphics chip originally intended for gaming that costs $1,500 or less.
The A100 also has the distinction of being one of only a few chips to have export controls placed on it for national defense reasons. Last fall, Nvidia said in an SEC filing that the U.S. government imposed a license requirement barring the export of the A100 and the H100 to China, Hong Kong, and Russia.
“The USG indicated that the new license requirement will address the risk that the covered products may be used in, or diverted to, a ‘military end use’ or ‘military end user’ in China and Russia,” Nvidia said in its filing. Nvidia previously said it adapted some of its chips for the Chinese market to comply with U.S. export restrictions.
The fiercest competition for the A100 may be its successor. The A100 was first introduced in 2020, an eternity ago in chip cycles. The H100, introduced in 2022, is starting to be produced in volume — in fact, Nvidia recorded more revenue from H100 chips in the quarter ending in January than the A100, it said on Wednesday, although the H100 is more expensive per unit.
The H100, Nvidia says, is the first of its data center GPUs to be optimized for transformers, an increasingly important technique that many of the latest and top AI applications use. Nvidia said on Wednesday that it wants to make AI training over 1 million percent faster. That could mean that, eventually, AI companies wouldn’t need so many Nvidia chips.
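For readers unfamiliar with the term, transformers are built around an operation called scaled dot-product attention. The minimal sketch below is written in PyTorch as an illustrative assumption; the H100’s optimization (its Transformer Engine) happens at the hardware level, not in user code like this:

```python
# Minimal sketch of scaled dot-product attention, the core operation of
# transformer models. Shapes are illustrative, not tied to any real model.
import torch

def attention(q, k, v):
    # Score every query against every key, scaled for numerical stability.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    # Each output is a weighted average of the value vectors.
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 128, 64)  # (batch, sequence length, head dimension)
out = attention(q, k, v)             # shape: (1, 128, 64)
```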
Microsoft owns lots of Nvidia graphics processing units, but it isn’t using them to develop state-of-the-art artificial intelligence models.
There are good reasons for that position, Mustafa Suleyman, the company’s CEO of AI, told CNBC’s Steve Kovach in an interview on Friday. Waiting to build models that are “three or six months behind” offers several advantages, including lower costs and the ability to concentrate on specific use cases, Suleyman said.
It’s “cheaper to give a specific answer once you’ve waited for the first three or six months for the frontier to go first. We call that off-frontier,” he said. “That’s actually our strategy, is to really play a very tight second, given the capital-intensiveness of these models.”
Suleyman made a name for himself as a co-founder of DeepMind, the AI lab that Google bought in 2014, reportedly for $400 million to $650 million. Suleyman arrived at Microsoft last year alongside other employees of the startup Inflection, where he had been CEO.
More than ever, Microsoft counts on relationships with other companies to grow.
It gets AI models from San Francisco startup OpenAI and supplemental computing power from newly public CoreWeave in New Jersey. Microsoft has repeatedly enriched Bing, Windows and other products with OpenAI’s latest systems for writing human-like language and generating images.
Microsoft’s Copilot will gain “memory” to retain key facts about people who repeatedly use the assistant, Suleyman said Friday at an event in Microsoft’s Redmond, Washington, headquarters to commemorate the company’s 50th birthday. That feature came first to OpenAI’s ChatGPT, which has 500 million weekly users.
Through ChatGPT, people can access top-flight large language models such as the o1 reasoning model, which takes time to reason before spitting out an answer. OpenAI introduced that capability in September; only weeks later did Microsoft bring a similar capability, called Think Deeper, to Copilot.
Microsoft occasionally releases open-source small language models that can run on PCs. They don’t require powerful server GPUs, making them different from OpenAI’s o1.
OpenAI and Microsoft have had a tight relationship since shortly after the startup launched its ChatGPT chatbot in late 2022, effectively kicking off the generative AI race. In total, Microsoft has invested $13.75 billion in the startup, but more recently, fissures in the relationship between the two companies have begun to show.
Microsoft added OpenAI to its list of competitors in July 2024, and OpenAI in January announced that it was working with rival cloud provider Oracle on the $500 billion Stargate project. That came after years of OpenAI exclusively relying on Microsoft’s Azure cloud. Despite OpenAI partnering with Oracle, Microsoft in a blog post announced that the startup had “recently made a new, large Azure commitment.”
“Look, it’s absolutely mission-critical that long-term, we are able to do AI self-sufficiently at Microsoft,” Suleyman said. “At the same time, I think about these things over five and 10 year periods. You know, until 2030 at least, we are deeply partnered with OpenAI, who have [had an] enormously successful relationship for us.”
Microsoft is focused on building its own AI internally, but the company is not pushing itself to build the most cutting-edge models, Suleyman said.
“We have an incredibly strong AI team, huge amounts of compute, and it’s very important to us that, you know, maybe we don’t develop the absolute frontier, the best model in the world first,” he said. “That’s very, very expensive to do and unnecessary to cause that duplication.”
President Trump’s new tariffs on goods that the U.S. imports from over 100 countries will have an effect on consumers, former Microsoft CEO Steve Ballmer told CNBC on Friday. Investors will feel the pain, too.
Microsoft’s stock dropped almost 6% in the past two days, as the Nasdaq wrapped up its worst week in five years.
“As a Microsoft shareholder, this kind of thing is not good,” Ballmer said in an interview with Andrew Ross Sorkin tied to Microsoft’s 50th anniversary celebration. “It creates opportunity to be a serious, long-term player.”
Ballmer was sandwiched in between Microsoft co-founder Bill Gates and current CEO Satya Nadella for the interview.
“I took just enough economics in college — that tariffs are actually going to bring some turmoil,” said Ballmer, who was succeeded by Nadella in 2014. Gates, Microsoft’s first CEO, convinced Ballmer to join the company in 1980.
Gates, Ballmer and Nadella attended proceedings at Microsoft’s Redmond, Washington, campus on Friday to celebrate its first half-century.
Between the tariffs and weak quarterly revenue guidance announced in January, Microsoft’s stock is on track for its fifth straight month of declines, which would be the worst stretch since 2009. But the company remains a leader in the PC operating system and productivity software markets, and its partnership with startup OpenAI has led to gains in cloud computing.
“I think that disruption is very hard on people, and so the decision to do something for which disruption was inevitable, that needs a lot of popular support, and nobody could game theorize exactly who is going to do what in response,” Ballmer said, regarding the tariffs. “So, I think citizens really like stability a lot. And I hope people — individuals who will feel this, because people are feeling it, not just the stock market, people are going to feel it.”
Ballmer, who owns the Los Angeles Clippers, is among Microsoft’s biggest fans. He said he’s the company’s largest investor. In 2014, shortly after he bought the basketball team for $2 billion, he held over 333 million shares of the stock, according to a regulatory filing.
“I’m not going to probably have 50 more years on the planet,” he said. “But whatever minutes I have, I’m gonna be a large Microsoft shareholder.” He said there’s a bright future for computing, storage and intelligence. Microsoft launched the first Azure services while Ballmer was CEO.
Earlier this week, Bloomberg reported that Microsoft, which pledged to spend $80 billion on AI-enabled data center infrastructure in the current fiscal year, has stopped discussions or pushed back the opening of facilities in the U.S. and abroad.
JPMorgan Chase’s chief economist, Bruce Kasman, said in a Thursday note that the chance of a global recession will be 60% if Trump’s tariffs kick in as described. His previous estimate was 40%.
“Fifty years from now, or 25 years from now, what is the one thing you can be guaranteed of, is the world needs more compute,” Nadella said. “So I want to keep those two thoughts and then take one step at a time, and then whatever are the geopolitical or economic shifts, we’ll adjust to it.”
Gates, who along with co-founder Paul Allen sought to build a software company rather than sell both software and hardware, said he wasn’t sure what the economic effects of the tariffs will be. Today, most of Microsoft’s revenue comes from software. It also sells Surface PCs and Xbox consoles.
“So far, it’s just on goods, but you know, will it eventually be on services? Who knows?” said Gates, who reportedly donated around $50 million to a nonprofit that supported Democratic nominee Kamala Harris’ losing campaign.
AppLovin CEO Adam Foroughi provided more clarity on the ad-tech company’s late-stage effort to acquire TikTok, calling his offer a “much stronger bid than others” on CNBC’s “The Exchange” on Friday afternoon.
Foroughi said the company is proposing a merger between AppLovin and the entire global business of TikTok, characterizing the deal as a “partnership” where the Chinese could participate in the upside while AppLovin would run the app.
“If you pair our algorithm with the TikTok audience, the expansion on that platform for dollars spent will be through the roof,” Foroughi said.
The news comes as President Trump announced he would extend, for a second time, the deadline for TikTok’s Chinese-owned parent company, ByteDance, to sell the U.S. subsidiary of TikTok to an American buyer or face an effective ban from U.S. app stores. The new deadline is in June, which, as Foroughi described it, “buys more time to put the pieces together” on AppLovin’s bid.
“The president’s a great dealmaker — we’re proposing, essentially, an enhancement to the deal that they’ve been working on, but a bigger version of all the deals contemplated,” he added.
AppLovin faces a crowded field of other interested U.S. backers, including Amazon, Oracle, billionaire Frank McCourt and his Project Liberty consortium, and numerous private equity firms. Some proposals reportedly structure the deal to give a U.S. buyer 50% ownership of the company, rather than a complete acquisition. The Chinese government will still need to approve the deal, and AppLovin’s interest in purchasing TikTok in “all markets outside of China” is “preliminary,” according to an April 3 SEC filing.
Correction: A prior version of this story incorrectly characterized China’s ongoing role in TikTok should AppLovin acquire the app.