OpenAI CEO Sam Altman speaks during a keynote address announcing ChatGPT integration for Bing at Microsoft in Redmond, Washington, on February 7, 2023.
Jason Redmond | AFP | Getty Images
Before OpenAI’s ChatGPT emerged and captured the world’s attention for its ability to create compelling sentences, a small startup called Latitude was wowing consumers with its AI Dungeon game, which let them use artificial intelligence to create fantastical tales based on their prompts.
But as AI Dungeon became more popular, Latitude CEO Nick Walton recalled that the cost to maintain the text-based role-playing game began to skyrocket. Powering AI Dungeon’s text-generation software was the GPT language technology offered by the Microsoft-backed artificial intelligence research lab OpenAI. The more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.
Compounding the predicament was that Walton also discovered content marketers were using AI Dungeon to generate promotional copy, a use for AI Dungeon that his team never foresaw, but that ended up adding to the company’s AI bill.
At its peak in 2021, Walton estimates Latitude was spending nearly $200,000 a month on OpenAI’s so-called generative AI software and Amazon Web Services in order to keep up with the millions of user queries it needed to process each day.
“We joked that we had human employees and we had AI employees, and we spent about as much on each of them,” Walton said. “We spent hundreds of thousands of dollars a month on AI and we are not a big startup, so it was a very massive cost.”
By the end of 2021, Latitude switched from using OpenAI’s GPT software to a cheaper but still capable language software offered by startup AI21 Labs, Walton said, adding that the startup also incorporated open source and free language models into its service to lower the cost. Latitude’s generative AI bills have dropped to under $100,000 a month, Walton said, and the startup charges players a monthly subscription for more advanced AI features to help reduce the cost.
Latitude’s pricey AI bills underscore an unpleasant truth behind the recent boom in generative AI technologies: The cost to develop and maintain the software can be extraordinarily high, both for the firms that develop the underlying technologies, generally referred to as large language models or foundation models, and those that use the AI to power their own software.
The high cost of machine learning is an uncomfortable reality in the industry as venture capitalists eye companies that could potentially be worth trillions, and big companies such as Microsoft, Meta, and Google use their considerable capital to develop a lead in the technology that smaller challengers can’t catch up to.
But if the margin for AI applications is permanently smaller than previous software-as-a-service margins, because of the high cost of computing, it could put a damper on the current boom.
The high cost of training and “inference” — actually running — large language models is a structural cost that differs from previous computing booms. Even when the software is built, or trained, it still requires a huge amount of computing power to run large language models because they do billions of calculations every time they return a response to a prompt. By comparison, serving web apps or pages requires much less calculation.
These calculations also require specialized hardware. While traditional computer processors can run machine learning models, they’re slow. Most training and inference now takes place on graphics processors, or GPUs, which were initially intended for 3D gaming, but have become the standard for AI applications because they can do many simple calculations simultaneously.
Nvidia makes most of the GPUs for the AI industry, and its primary data center workhorse chip costs $10,000. Scientists who build these models often joke that they “melt GPUs.”
Training models
Nvidia A100 processor
Nvidia
Analysts and technologists estimate that the critical process of training a large language model such as GPT-3 could cost more than $4 million. More advanced language models could cost over “the high-single digit-millions” to train, said Rowan Curran, a Forrester analyst who focuses on AI and machine learning.
Meta’s largest LLaMA model, released last month, used 2,048 Nvidia A100 GPUs to train on 1.4 trillion tokens (750 words is about 1,000 tokens), taking about 21 days, the company said.
It took about 1 million GPU hours to train. With dedicated prices from AWS, that would cost over $2.4 million. And at 65 billion parameters, it’s smaller than the current GPT models at OpenAI, like GPT-3, which has 175 billion parameters.
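The GPU-hour figure follows directly from the numbers Meta reported. A minimal sketch of the arithmetic, assuming a $2.40-per-GPU-hour rate for dedicated A100 capacity (the hourly rate is an illustrative assumption; actual AWS pricing varies by instance type and commitment):

```python
# Back-of-the-envelope check of the LLaMA training figures above.
gpus = 2048        # Nvidia A100 GPUs used for training
days = 21          # reported training duration
gpu_hours = gpus * days * 24
print(f"GPU-hours: {gpu_hours:,}")      # ~1.03 million, matching the article

hourly_rate = 2.40  # assumed $/GPU-hour for dedicated A100 capacity
cost = gpu_hours * hourly_rate
print(f"Estimated training cost: ${cost:,.0f}")
```

At that assumed rate the total lands just under $2.5 million, consistent with the “over $2.4 million” estimate in the article.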
Clement Delangue, the CEO of AI startup Hugging Face, said the process of training the company’s Bloom large language model took more than two-and-a-half months and required access to a supercomputer that was “something like the equivalent of 500 GPUs.”
Organizations that build large language models must be cautious when they retrain the software, which helps the software improve its abilities, because it costs so much, he said.
“It’s important to realize that these models are not trained all the time, like every day,” Delangue said, noting that’s why some models, like ChatGPT, don’t have knowledge of recent events. ChatGPT’s knowledge stops in 2021, he said.
“We are actually doing a training right now for the version two of Bloom and it’s gonna cost no more than $10 million to retrain,” Delangue said. “So that’s the kind of thing that we don’t want to do every week.”
Inference and who pays for it
Bing with Chat
Jordan Novet | CNBC
To use a trained machine learning model to make predictions or generate text, engineers use the model in a process called “inference,” which can be much more expensive than training because it might need to run millions of times for a popular product.
For a product as popular as ChatGPT — which investment firm UBS estimated reached 100 million monthly active users in January — Curran believes it could have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.
Costs skyrocket when these tools are used billions of times a day. Financial analysts estimate Microsoft’s Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.
In the case of Latitude, for instance, while the startup didn’t have to pay to train the underlying OpenAI language model it was accessing, it had to account for the inferencing costs that were something akin to “half-a-cent per call” on “a couple million requests per day,” a Latitude spokesperson said.
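The spokesperson’s figures imply a bill in the range Walton described. A minimal sketch, assuming 2 million requests a day (the “couple million” in the quote is approximate) and a 30-day month:

```python
# Rough monthly inference bill from the per-call figures Latitude cited.
cost_per_call = 0.005        # "half-a-cent per call"
requests_per_day = 2_000_000  # assumed value for "a couple million requests per day"
daily = cost_per_call * requests_per_day
monthly = daily * 30
print(f"~${daily:,.0f} per day, ~${monthly:,.0f} per month")
```

That works out to roughly $10,000 a day, or a few hundred thousand dollars a month, in line with the “hundreds of thousands of dollars a month” Walton described at the company’s peak.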
“And I was being relatively conservative,” Curran said of his calculations.
In order to sow the seeds of the current AI boom, venture capitalists and tech giants have been investing billions of dollars in startups that specialize in generative AI technologies. Microsoft, for instance, invested as much as $10 billion in OpenAI, the maker of GPT, according to media reports in January. Salesforce’s venture capital arm, Salesforce Ventures, recently debuted a $250 million fund that caters to generative AI startups.
As investor Semil Shah of the VC firms Haystack and Lightspeed Venture Partners described on Twitter, “VC dollars shifted from subsidizing your taxi ride and burrito delivery to LLMs and generative AI compute.”
Many entrepreneurs see risks in relying on potentially subsidized AI models that they don’t control and merely pay for on a per-use basis.
“When I talk to my AI friends at the startup conferences, this is what I tell them: Do not solely depend on OpenAI, ChatGPT or any other large language models,” said Suman Kanuganti, founder of personal.ai, a chatbot currently in beta mode. “Because businesses shift, they are all owned by big tech companies, right? If they cut access, you’re gone.”
Companies such as enterprise tech firm Conversica are exploring how they can use the tech through Microsoft’s Azure cloud service at its currently discounted price.
While Conversica CEO Jim Kaskade declined to comment about how much the startup is paying, he conceded that the subsidized cost is welcome as it explores how language models can be used effectively.
“If they were truly trying to break even, they’d be charging a hell of a lot more,” Kaskade said.
How it could change
It’s unclear if AI computation will stay expensive as the industry develops. Companies making the foundation models, semiconductor makers and startups all see business opportunities in reducing the price of running AI software.
Nvidia, which has about 95% of the market for AI chips, continues to develop more powerful versions designed specifically for machine learning, but improvements in total chip power across the industry have slowed in recent years.
Still, Nvidia CEO Jensen Huang believes that in 10 years, AI will be “a million times” more efficient because of improvements not only in chips, but also in software and other computer parts.
“Moore’s Law, in its best days, would have delivered 100x in a decade,” Huang said last month on an earnings call. “By coming up with new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists, AI researchers on new models, across that entire span, we’ve made large language model processing a million times faster.”
Some startups have focused on the high cost of AI as a business opportunity.
“Nobody was saying ‘You should build something that was purpose-built for inference.’ What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system to save money on inference by doing more processing in the computer’s memory, as opposed to on a GPU.
“People are using GPUs today, NVIDIA GPUs, to do most of their inference. They buy the DGX systems that NVIDIA sells that cost a ton of money. The problem with inference is if the workload spikes very rapidly, which is what happened to ChatGPT, it went to like a million users in five days. There is no way your GPU capacity can keep up with that because it was not built for that. It was built for training, for graphics acceleration,” he said.
Delangue, the Hugging Face CEO, believes more companies would be better served focusing on smaller, specific models that are cheaper to train and run, instead of the large language models that are garnering most of the attention.
Meanwhile, OpenAI announced last month that it’s lowering the cost for companies to access its GPT models. It now charges one-fifth of one cent for about 750 words of output.
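To put the new pricing in concrete terms, here is a rough conversion, using the 750-words-per-1,000-tokens rule of thumb from earlier in the article and a hypothetical 100-word average response (the response length and request volume are illustrative assumptions, not figures from the article):

```python
# Converting OpenAI's quoted price into a cost-per-word estimate.
price_per_1k_tokens = 0.002    # "one-fifth of one cent" for ~750 words
words_per_1k_tokens = 750      # rule of thumb cited earlier in the article
cost_per_word = price_per_1k_tokens / words_per_1k_tokens

# What a hypothetical 2-million-request day would cost at ~100 words per response:
daily_cost = cost_per_word * 100 * 2_000_000
print(f"${cost_per_word:.2e} per word; ~${daily_cost:,.0f} per day at that volume")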
OpenAI’s lower prices have caught the attention of AI Dungeon-maker Latitude.
“I think it’s fair to say that it’s definitely a huge change we’re excited to see happen in the industry and we’re constantly evaluating how we can deliver the best experience to users,” a Latitude spokesperson said. “Latitude is going to continue to evaluate all AI models to be sure we have the best game out there.”
Shares of advertising technology company AppLovin and stock trading app Robinhood Markets each jumped about 7% in extended trading on Friday after S&P Global said the two will join the S&P 500 index.
The changes will go into effect before the beginning of trading on Sept. 22, S&P Global announced in a statement. AppLovin will replace MarketAxess Holdings, while Robinhood will take the place of Caesars Entertainment.
In March, short-seller Fuzzy Panda Research advised the committee for the large-cap U.S. index to keep AppLovin from becoming a constituent. AppLovin shares dropped 15% in December, when the committee picked Workday to join the S&P 500. Robinhood, for its part, saw shares slip 2% in June when it was excluded from a quarterly rebalancing of the index.
It’s normal for stocks to go up on news of their inclusion in a major index such as the S&P 500. Fund managers need to buy shares to reflect the updates.
AppLovin and Robinhood both went public on Nasdaq in 2021.
Robinhood has been a favorite among retail investors who have bid up shares of meme stocks such as AMC Entertainment and GameStop.
AppLovin itself became a stock to watch, with shares gaining 278% in 2023 and over 700% in 2024. As of Friday’s close, the stock had gained only 51% so far in 2025. AppLovin’s software brings targeted ads to mobile apps and games.
Earlier this year, AppLovin offered to buy the U.S. TikTok business from China’s ByteDance. U.S. President Donald Trump has repeatedly extended the deadline for a sale, most recently in June.
At Robinhood’s annual general meeting in June, a shareholder asked Vlad Tenev, the company’s co-founder and CEO, if there were plans for getting into the S&P 500.
“It’s a difficult thing to plan for,” Tenev said. “I think it’s one of those things that hopefully happens.”
He said he believed the company was eligible.
Shares of MarketAxess, which specializes in fixed-income trading, have fallen 17% year to date, while shares of Caesars, which runs hotels and casinos, are down 21%.
U.S. Federal Trade Commission Commissioner Rebecca Slaughter raised questions on Friday about the status of an artificial intelligence chatbot complaint against Snap that the agency referred to the Department of Justice earlier this year.
In January, the FTC announced it would refer a non-public complaint to the DOJ “in the public interest,” alleging that Snap’s My AI chatbot posed potential “risks and harms” to young users.
“We don’t know what has happened to that complaint,” Slaughter said on CNBC’s “The Exchange.” “The public does not know what has happened to that complaint, and that’s the kind of thing that I think people deserve answers on.”
Snap’s My AI chatbot, which debuted in 2023, is powered by large language models from OpenAI and Google and has drawn scrutiny for problematic responses.
The DOJ did not immediately respond to a request for comment. Snap declined to comment.
Slaughter’s comments came a day after President Donald Trump held a White House dinner with several tech executives, including Google CEO Sundar Pichai, Meta CEO Mark Zuckerberg and Apple CEO Tim Cook.
“The president is hosting Big Tech CEOs in the White House even as we’re reading about truly horrifying reports of chatbots engaging with small children,” she said.
Trump has been attempting to remove Slaughter from her FTC position, but earlier this week, a U.S. appeals court allowed her to maintain her role.
On Thursday, the president asked the Supreme Court to allow him to fire her from the post.
FTC Chair Andrew Ferguson, who was selected by Trump to lead the commission, publicly opposed the complaint against Snap in January, prior to succeeding Lina Khan at the helm.
At the time, he said he would “release a more detailed statement about this affront to the Constitution and the rule of law” if the DOJ were to eventually file a complaint.
Alphabet and Google CEO Sundar Pichai meets with Polish Prime Minister Donald Tusk at Google for Startups in Warsaw, Poland, on February 13, 2025.
Klaudia Radecka | Nurphoto | Getty Images
From the courtroom to the boardroom, it was a big week for tech investors.
The resolution of Google’s antitrust case led to sharp rallies for Alphabet and Apple. Broadcom shareholders cheered a new $10 billion customer. And Tesla’s stock was buoyed by a freshly proposed pay package for CEO Elon Musk.
Add it up, and the U.S. tech industry’s eight trillion-dollar companies gained a combined $420 billion in market cap this week, lifting their total value to $21 trillion, despite a slide in Nvidia shares.
Those companies now account for roughly 36% of the S&P 500, a proportion so great by historical standards that Howard Silverblatt, senior index analyst at S&P Dow Jones Indices, told CNBC by email, “there are no comparisons.”
There was a certain irony to this week’s gains.
Alphabet’s 9% jump on Wednesday was directly tied to the U.S. government effort to diminish the search giant’s market control, which was part of a years-long campaign to break up Big Tech. Since 2020, Google, Apple, Amazon and Meta have all been hit with antitrust allegations by the Department of Justice or Federal Trade Commission.
A year ago, Google lost to the DOJ, a result viewed by many as the most significant antitrust decision for the tech industry since the case against Microsoft more than two decades earlier. But in the remedies ruling this week, U.S. District Judge Amit Mehta said Google won’t be forced to sell its Chrome browser despite its loss in court and instead handed down a more limited punishment, including a requirement to share search data with competitors.
The decision lifted Apple along with Alphabet, because the companies can stick with an arrangement that involves Google paying Apple billions of dollars per year to be the default search engine on iPhones. Alphabet rose more than 10% for the week and Apple added 3.2%, helping boost the Nasdaq 1.1%.
Analysts at Wedbush Securities wrote in a note after the decision that the ruling “removed a huge overhang” on Google’s stock and a “black cloud worry” that hung over Apple. Further, they said it clears the path for the companies to pursue a bigger artificial intelligence deal involving Gemini, Google’s AI models.
“This now lays the groundwork for Apple to continue its deal and ultimately likely double down on more AI related partnerships with Google Gemini down the road,” the analysts wrote.
Mehta explained that a major factor in his decision was the emergence of generative AI, which has become a much more competitive market than traditional search and has dramatically changed the market dynamics.
New players like OpenAI, Anthropic and Perplexity have chipped away at Google’s dominance, Mehta said, noting that generative AI technologies “may yet prove to be game changers.”
On Friday, Alphabet investors shrugged off a separate antitrust matter out of Europe. The company was hit with a 2.95-billion-euro ($3.45 billion) fine from European Union regulators for anti-competitive practices in its advertising technology business.
Broadcom pops
While OpenAI was an indirect catalyst for Google and Apple this week, it was more directly tied to the huge rally in Broadcom’s stock.
Following Broadcom’s better-than-expected earnings report on Thursday, CEO Hock Tan told analysts that his chipmaker had secured a $10 billion contract with a new customer, which would be the company’s fourth large AI client.
Several analysts said the new customer is OpenAI, and the Financial Times reported on a partnership between the two companies.
Broadcom is the newest entrant into the trillion-dollar club, thanks to the company’s custom chips for AI, already used by Google, Meta and TikTok parent ByteDance. With its 13% jump this week, the stock is now up 120% in the past year, lifting Broadcom’s market cap to around $1.6 trillion.
“The company is firing on all cylinders with clear line of sight for growth supported by significant backlog,” analysts at Barclays wrote in a note, maintaining their buy recommendation and lifting their price target on the stock.
For the other giant AI chipmaker, the past week wasn’t so good.
Nvidia shares fell more than 4% in the holiday-shortened week, the worst performance among the megacaps. There was no apparent negative news for Nvidia, but the stock has now dropped for four consecutive weeks.
Still, Nvidia remains the largest company by market cap, valued at over $4 trillion, with its stock up 56% in the past 12 months.
Microsoft also fell this week and is on an extended slide, dropping for five straight weeks. Shares are still up 21% over the last 12 months.
On the flip side, Tesla has been the laggard in the group. Shares of the electric vehicle maker are down 13% this year due to a multi-quarter sales slump that reflects rising competition from lower-cost Chinese manufacturers and an aging lineup of EVs.
But Tesla shares climbed 5% this week, sparked mostly by gains on Friday after the company said it wants investors to approve a pay plan for Musk that could be worth up to almost $1 trillion.
The payouts, split into 12 tranches, would require Tesla to see significant value appreciation, starting with the first award that won’t kick in until the company almost doubles its market cap to $2 trillion.
Tesla Chairwoman Robyn Denholm told CNBC’s Andrew Ross Sorkin the plan was designed to keep Musk, the world’s richest person, “motivated and focused on delivering for the company.”