OpenAI CEO Sam Altman speaks during a keynote address announcing ChatGPT integration for Bing at Microsoft in Redmond, Washington, on February 7, 2023.
Jason Redmond | AFP | Getty Images
Before OpenAI’s ChatGPT emerged and captured the world’s attention for its ability to create compelling sentences, a small startup called Latitude was wowing consumers with its AI Dungeon game that let them use artificial intelligence to create fantastical tales based on their prompts.
But as AI Dungeon became more popular, Latitude CEO Nick Walton recalled that the cost to maintain the text-based role-playing game began to skyrocket. AI Dungeon’s text-generation software was powered by the GPT language technology offered by the Microsoft-backed AI research lab OpenAI. The more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.
Compounding the predicament, Walton also discovered that content marketers were using AI Dungeon to generate promotional copy, a use for the game that his team never foresaw but that added to the company’s AI bill.
At its peak in 2021, Walton estimates Latitude was spending nearly $200,000 a month on OpenAI’s so-called generative AI software and Amazon Web Services in order to keep up with the millions of user queries it needed to process each day.
“We joked that we had human employees and we had AI employees, and we spent about as much on each of them,” Walton said. “We spent hundreds of thousands of dollars a month on AI and we are not a big startup, so it was a very massive cost.”
By the end of 2021, Latitude switched from using OpenAI’s GPT software to a cheaper but still capable language software offered by startup AI21 Labs, Walton said, adding that the startup also incorporated open source and free language models into its service to lower the cost. Latitude’s generative AI bills have dropped to under $100,000 a month, Walton said, and the startup charges players a monthly subscription for more advanced AI features to help reduce the cost.
Latitude’s pricey AI bills underscore an unpleasant truth behind the recent boom in generative AI technologies: The cost to develop and maintain the software can be extraordinarily high, both for the firms that develop the underlying technologies, generally referred to as large language models or foundation models, and for those that use the AI to power their own software.
The high cost of machine learning is an uncomfortable reality in the industry as venture capitalists eye companies that could potentially be worth trillions, and big companies such as Microsoft, Meta, and Google use their considerable capital to develop a lead in the technology that smaller challengers can’t catch up to.
But if the margin for AI applications is permanently smaller than previous software-as-a-service margins, because of the high cost of computing, it could put a damper on the current boom.
The high cost of training and “inference” — actually running — large language models is a structural cost that differs from previous computing booms. Even when the software is built, or trained, it still requires a huge amount of computing power to run large language models because they do billions of calculations every time they return a response to a prompt. By comparison, serving web apps or pages requires much less calculation.
These calculations also require specialized hardware. While traditional computer processors can run machine learning models, they’re slow. Most training and inference now takes place on graphics processors, or GPUs, which were initially intended for 3D gaming, but have become the standard for AI applications because they can do many simple calculations simultaneously.
Nvidia makes most of the GPUs for the AI industry, and its primary data center workhorse chip costs $10,000. Scientists who build these models often joke that they “melt GPUs.”
Training models
Nvidia A100 processor
Nvidia
Analysts and technologists estimate that the critical process of training a large language model such as OpenAI’s GPT-3 could cost more than $4 million. More advanced language models could cost over “the high-single-digit millions” to train, said Rowan Curran, a Forrester analyst who focuses on AI and machine learning.
Meta’s largest LLaMA model, released last month, used 2,048 Nvidia A100 GPUs to train on 1.4 trillion tokens (750 words is about 1,000 tokens), taking about 21 days, the company said.
It took about 1 million GPU hours to train. With dedicated prices from AWS, that would cost over $2.4 million. And at 65 billion parameters, it’s smaller than the current GPT models at OpenAI, such as GPT-3, which has 175 billion parameters.
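The arithmetic behind those figures can be sketched in a few lines. The GPU count and training duration come from Meta’s disclosure; the per-GPU-hour rate below is an assumption chosen to land near the article’s estimate, not a published AWS price.

```python
# Back-of-the-envelope check of the LLaMA training figures.
gpus = 2048                 # Nvidia A100s Meta said it used
days = 21                   # approximate training time
gpu_hours = gpus * days * 24
print(f"GPU-hours: {gpu_hours:,}")       # roughly 1 million, as stated

price_per_gpu_hour = 2.40   # assumption: dedicated AWS A100 rate, USD
cost = gpu_hours * price_per_gpu_hour
print(f"Estimated training cost: ${cost:,.0f}")   # a little under $2.5 million
```

At that assumed rate, a single training run lands just above the $2.4 million floor analysts cite, before accounting for failed runs, experimentation, or storage and networking costs.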
Clement Delangue, the CEO of AI startup Hugging Face, said the process of training the company’s Bloom large language model took more than two-and-a-half months and required access to a supercomputer that was “something like the equivalent of 500 GPUs.”
Organizations that build large language models must be cautious when they retrain the software, which helps improve its abilities, because it costs so much, he said.
“It’s important to realize that these models are not trained all the time, like every day,” Delangue said, noting that’s why some models, such as ChatGPT, don’t have knowledge of recent events. ChatGPT’s knowledge stops in 2021, he said.
“We are actually doing a training right now for the version two of Bloom and it’s gonna cost no more than $10 million to retrain,” Delangue said. “So that’s the kind of thing that we don’t want to do every week.”
Inference and who pays for it
Bing with Chat
Jordan Novet | CNBC
To use a trained machine learning model to make predictions or generate text, engineers use the model in a process called “inference,” which can be much more expensive than training because it might need to run millions of times for a popular product.
For a product as popular as ChatGPT — which investment firm UBS estimates to have reached 100 million monthly active users in January — Curran believes that it could have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.
Costs skyrocket when these tools are used billions of times a day. Financial analysts estimate Microsoft’s Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.
In the case of Latitude, for instance, while the startup didn’t have to pay to train the underlying OpenAI language model it was accessing, it had to account for the inferencing costs that were something akin to “half-a-cent per call” on “a couple million requests per day,” a Latitude spokesperson said.
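Taken at face value, the spokesperson’s figures imply a monthly inference bill in the same ballpark as the peak spend Walton described. Both inputs are approximate quotes, so this is a rough sketch rather than an exact accounting.

```python
# Rough monthly inference bill implied by Latitude's figures.
# Both inputs are approximations quoted in the article.
cost_per_call = 0.005        # USD: "half-a-cent per call"
calls_per_day = 2_000_000    # "a couple million requests per day"
monthly_cost = cost_per_call * calls_per_day * 30
print(f"Implied inference bill: ~${monthly_cost:,.0f} per month")
```

That works out to roughly $300,000 a month, somewhat above the nearly $200,000 peak Walton cited, which suggests the per-call rate and daily volume were ballpark figures rather than exact averages.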
“And I was being relatively conservative,” Curran said of his calculations.
In order to sow the seeds of the current AI boom, venture capitalists and tech giants have been investing billions of dollars into startups that specialize in generative AI technologies. Microsoft, for instance, invested as much as $10 billion into OpenAI, the creator of GPT, according to media reports in January. Salesforce‘s venture capital arm, Salesforce Ventures, recently debuted a $250 million fund that caters to generative AI startups.
As investor Semil Shah of the VC firms Haystack and Lightspeed Venture Partners described on Twitter, “VC dollars shifted from subsidizing your taxi ride and burrito delivery to LLMs and generative AI compute.”
Many entrepreneurs see risks in relying on potentially subsidized AI models that they don’t control and merely pay for on a per-use basis.
“When I talk to my AI friends at the startup conferences, this is what I tell them: Do not solely depend on OpenAI, ChatGPT or any other large language models,” said Suman Kanuganti, founder of personal.ai, a chatbot currently in beta mode. “Because businesses shift, they are all owned by big tech companies, right? If they cut access, you’re gone.”
Companies such as enterprise tech firm Conversica are exploring how they can use the tech through Microsoft’s Azure cloud service at its currently discounted price.
While Conversica CEO Jim Kaskade declined to comment about how much the startup is paying, he conceded that the subsidized cost is welcome as it explores how language models can be used effectively.
“If they were truly trying to break even, they’d be charging a hell of a lot more,” Kaskade said.
How it could change
It’s unclear if AI computation will stay expensive as the industry develops. Companies making the foundation models, semiconductor makers and startups all see business opportunities in reducing the price of running AI software.
Nvidia, which has about 95% of the market for AI chips, continues to develop more powerful versions designed specifically for machine learning, but improvements in total chip power across the industry have slowed in recent years.
Still, Nvidia CEO Jensen Huang believes that in 10 years, AI will be “a million times” more efficient because of improvements not only in chips, but also in software and other computer parts.
“Moore’s Law, in its best days, would have delivered 100x in a decade,” Huang said last month on an earnings call. “By coming up with new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists, AI researchers on new models, across that entire span, we’ve made large language model processing a million times faster.”
Some startups have focused on the high cost of AI as a business opportunity.
“Nobody was saying ‘You should build something that was purpose-built for inference.’ What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system to save money on inference by doing more processing in the computer’s memory, as opposed to on a GPU.
“People are using GPUs today, NVIDIA GPUs, to do most of their inference. They buy the DGX systems that NVIDIA sells that cost a ton of money. The problem with inference is if the workload spikes very rapidly, which is what happened to ChatGPT, it went to like a million users in five days. There is no way your GPU capacity can keep up with that because it was not built for that. It was built for training, for graphics acceleration,” he said.
Delangue, the Hugging Face CEO, believes more companies would be better served focusing on smaller, specific models that are cheaper to train and run, instead of the large language models that are garnering most of the attention.
Meanwhile, OpenAI announced last month that it’s lowering the cost for companies to access its GPT models. It now charges one-fifth of one cent for about 750 words of output.
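Using the article’s rule of thumb that 750 words is about 1,000 tokens, the new price can be translated into a per-response cost. The response length and request volume below are hypothetical numbers for illustration, not figures from the article.

```python
# What OpenAI's lower price implies at scale. The price comes from the
# article; response length and request volume are hypothetical.
price_per_750_words = 0.002      # USD: one-fifth of one cent
words_per_response = 150         # assumed average response length
requests_per_day = 2_000_000     # hypothetical, Latitude-scale volume

cost_per_response = price_per_750_words * words_per_response / 750
monthly_cost = cost_per_response * requests_per_day * 30
print(f"~${monthly_cost:,.0f} per month at the new rate")
```

Under those assumptions, a couple-million-requests-a-day workload that once cost around half a cent per call would run in the low tens of thousands of dollars a month, an order-of-magnitude drop.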
OpenAI’s lower prices have caught the attention of AI Dungeon-maker Latitude.
“I think it’s fair to say that it’s definitely a huge change we’re excited to see happen in the industry and we’re constantly evaluating how we can deliver the best experience to users,” a Latitude spokesperson said. “Latitude is going to continue to evaluate all AI models to be sure we have the best game out there.”
Almost 600 people have signed an open letter to leaders at venture firm Sequoia Capital after one of its partners, Shaun Maguire, posted what the group described as a “deliberate, inflammatory attack” against the Muslim Democratic mayoral candidate in New York City.
Maguire, a vocal supporter of President Donald Trump, posted on X over the weekend that Zohran Mamdani, who won the Democratic primary last month, “comes from a culture that lies about everything” and is out to advance “his Islamist agenda.”
The post had 5.3 million views as of Monday afternoon. Maguire, whose investments include Elon Musk’s SpaceX and X as well as artificial intelligence startup Safe Superintelligence, also published a video on X explaining the remark.
Those signing the letter are asking Sequoia to condemn Maguire’s comments and apologize to Mamdani and Muslim founders. They also want the firm to authorize an independent investigation of Maguire’s behavior in the past two years and post “a zero-tolerance policy on hate speech and religious bigotry.”
They are asking the firm for a public response by July 14, or “we will proceed with broader public disclosure, media outreach and mobilizing our networks to ensure accountability,” the letter says.
Sequoia declined to comment. Maguire didn’t respond to a request for comment, but wrote in a post about the letter on Wednesday that, “You can try everything you want to silence me, but it will just embolden me.”
Among the signees are Mudassir Sheikha, CEO of ride-hailing service Careem, and Amr Awadallah, CEO of AI startup Vectara. Also on the list is Abubakar Abid, who works in machine learning at Hugging Face, which is backed by Sequoia, and Ahmed Sabbah, CEO of Telda, a financial technology startup that Sequoia first invested in four years ago.
At least three founders of startups that have gone through startup accelerator program Y Combinator added their names to the letter.
Sequoia as a firm is no stranger to politics. Doug Leone, who led the firm until 2022 and remains a partner, is a longtime Republican donor, who supported Trump in the 2024 election. Following Trump’s victory in November, Leone posted on X, “To all Trump voters: you no longer have to hide in the shadows…..you’re the majority!!”
By contrast, Leone’s predecessor, Mike Moritz, is a Democratic megadonor, who criticized Trump and, in August, slammed his colleagues in the tech industry for lining up behind the Republican nominee. In a Financial Times opinion piece, Moritz wrote Trump’s tech supporters were “making a big mistake.”
“I doubt whether any of them would want him as part of an investment syndicate that they organised,” wrote Moritz, who stepped down from Sequoia in 2023, over a decade after giving up a management role at the firm. “Why then do they dismiss his recent criminal conviction as nothing more than a politically inspired witch-hunt over a simple book-keeping error?”
Neither Leone nor Moritz returned messages seeking comment.
Roelof Botha, Sequoia’s current lead partner, has taken a more neutral stance. Botha said at an event last July that Sequoia as a partnership doesn’t “take a political point of view,” adding that he’s “not a registered member of either party.” Botha said he’s “proud of the fact that we’ve enabled many of our partners to express their respected individual views along the way, and given them that freedom.”
Maguire has long been open with his political views. He said on X last year that he had “just donated $300k to President Trump.”
Mamdani, a self-described democratic socialist, has gained the ire of many people in tech and in the business community more broadly since defeating former New York Gov. Andrew Cuomo in the June primary.
Samsung signage during the Nvidia GPU Technology Conference (GTC) in San Jose, California, US, on Thursday, March 20, 2025.
David Paul Morris | Bloomberg | Getty Images
South Korea’s Samsung Electronics on Tuesday forecast a 56% fall in profits for the second quarter as the company struggles to capture demand from artificial intelligence chip leader Nvidia.
The memory chip and smartphone maker said in its guidance that operating profit for the quarter ending in June was projected to be around 4.6 trillion won, down from 10.44 trillion won a year earlier.
The figure is a steeper drop than the LSEG smart estimates anticipated, which are weighted toward forecasts from analysts who are more consistently accurate.
According to the smart estimates, Samsung was expected to post an operating profit of 6.26 trillion won ($4.57 billion) for the quarter. Meanwhile, Samsung projected revenue of 74 trillion won, falling short of the LSEG smart estimate of 75.55 trillion won.
Samsung is a leading player in the global smartphone market and is also one of the world’s largest makers of memory chips, which are utilized in devices such as laptops and servers.
However, the company has been falling behind competitors like SK Hynix and Micron in high-bandwidth memory chips — an advanced type of memory that is being deployed in AI chips.
“The disappointing earnings are due to ongoing operating losses in the foundry business, while the upside in high-margin HBM business remains muted this quarter,” MS Hwang, Research Director at Counterpoint Research, said about the earnings guidance.
SK Hynix, the leader in HBM, has secured a position as Nvidia’s key supplier. While Samsung has reportedly been working to get the latest version of its HBM chips certified by Nvidia, a report from a local outlet suggests these plans have been pushed back to at least September.
The company did not respond to a request for comment on the status of its deals with Nvidia.
Ray Wang, Research Director of Semiconductors, Supply Chain and Emerging Technology at Futurum Group, told CNBC that it is clear Samsung has yet to pass Nvidia’s qualification for its most advanced HBM.
“Given that Nvidia accounts for roughly 70% of global HBM demand, the delay meaningfully caps near-term upside,” Wang said. He noted that while Samsung has secured some HBM supply for AI processors from AMD, this win is unlikely to contribute to second-quarter results due to the timing of production ramps.
Reuters reported in September that Samsung had instructed its subsidiaries worldwide to cut 30% of staff in some divisions, citing sources familiar with the matter.
A Waymo autonomous self-driving Jaguar electric vehicle sits parked at an EVgo charging station in Los Angeles, California, on May 15, 2024.
Patrick T. Fallon | AFP | Getty Images
Waymo said it will begin testing in Philadelphia, with a limited fleet of vehicles and human safety drivers behind the wheel.
“This city is a National Treasure,” Waymo wrote in a post on X on Monday. “It’s a city of love, where eagles fly with a gritty spirit and cheese that spreads and cheese that steaks. Our road trip continues to Philly next.”
The Alphabet-owned company confirmed to CNBC that it will be testing in Pennsylvania’s largest city through the fall, adding that the initial fleet of cars will be manually driven through the more complex parts of Philadelphia, including downtown and on freeways.
“Folks will see our vehicles driving at all hours throughout various neighborhoods, from North Central to Eastwick, and from University City to as far east as the Delaware River,” a Waymo spokesperson said.
With its so-called road trips, Waymo seeks to collect mapping data and evaluate how its autonomous technology, Waymo Driver, performs in new environments, handling traffic patterns and local infrastructure. Road trips often serve as a way for the company to gauge whether it can potentially offer a paid ride-share service in a particular location.
The expanded testing, which will go through the fall, comes as Waymo aims for a broader rollout. Last month, the company announced plans to drive vehicles manually in New York for testing, marking the first step toward potentially cracking the largest U.S. city. Waymo applied for a permit with the New York City Department of Transportation to operate autonomously with a trained specialist behind the wheel in Manhattan. State law currently doesn’t allow for such driverless operations.
Waymo One provides more than 250,000 paid trips each week across Phoenix, San Francisco, Los Angeles, and Austin, Texas, and is preparing to bring fully autonomous rides to Atlanta, Miami, and Washington, D.C., in 2026.
Alphabet has been under pressure to monetize artificial intelligence products as it bolsters spending on infrastructure. Alphabet’s “Other Bets” segment, which includes Waymo, brought in revenue of $1.65 billion in 2024, up from $1.53 billion in 2023. However, the segment lost $4.44 billion last year, compared to a loss of $4.09 billion the previous year.