OpenAI CEO Sam Altman speaks during a keynote address announcing ChatGPT integration for Bing at Microsoft in Redmond, Washington, on February 7, 2023.
Jason Redmond | AFP | Getty Images
Before OpenAI’s ChatGPT emerged and captured the world’s attention for its ability to create compelling sentences, a small startup called Latitude was wowing consumers with its AI Dungeon game that let them use artificial intelligence to create fantastical tales based on their prompts.
But as AI Dungeon became more popular, Latitude CEO Nick Walton recalled that the cost to maintain the text-based role-playing game began to skyrocket. AI Dungeon’s text-generation software was powered by the GPT language technology offered by the Microsoft-backed AI research lab OpenAI. The more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.
Compounding the predicament, Walton discovered that content marketers were using AI Dungeon to generate promotional copy, a use for AI Dungeon that his team never foresaw, but one that ended up adding to the company’s AI bill.
At its peak in 2021, Walton estimates Latitude was spending nearly $200,000 a month on OpenAI’s so-called generative AI software and Amazon Web Services in order to keep up with the millions of user queries it needed to process each day.
“We joked that we had human employees and we had AI employees, and we spent about as much on each of them,” Walton said. “We spent hundreds of thousands of dollars a month on AI and we are not a big startup, so it was a very massive cost.”
By the end of 2021, Latitude switched from using OpenAI’s GPT software to a cheaper but still capable language software offered by startup AI21 Labs, Walton said, adding that the startup also incorporated open source and free language models into its service to lower the cost. Latitude’s generative AI bills have dropped to under $100,000 a month, Walton said, and the startup charges players a monthly subscription for more advanced AI features to help reduce the cost.
Latitude’s pricey AI bills underscore an unpleasant truth behind the recent boom in generative AI technologies: The cost to develop and maintain the software can be extraordinarily high, both for the firms that develop the underlying technologies, generally referred to as large language models or foundation models, and those that use the AI to power their own software.
The high cost of machine learning is an uncomfortable reality in the industry as venture capitalists eye companies that could potentially be worth trillions, and big companies such as Microsoft, Meta, and Google use their considerable capital to develop a lead in the technology that smaller challengers can’t catch up to.
But if the margin for AI applications is permanently smaller than previous software-as-a-service margins, because of the high cost of computing, it could put a damper on the current boom.
The high cost of training and “inference” — actually running — large language models is a structural cost that differs from previous computing booms. Even when the software is built, or trained, it still requires a huge amount of computing power to run large language models because they do billions of calculations every time they return a response to a prompt. By comparison, serving web apps or pages requires much less calculation.
These calculations also require specialized hardware. While traditional computer processors can run machine learning models, they’re slow. Most training and inference now takes place on graphics processors, or GPUs, which were initially intended for 3D gaming, but have become the standard for AI applications because they can do many simple calculations simultaneously.
Nvidia makes most of the GPUs for the AI industry, and its primary data center workhorse chip, the A100, costs $10,000. Scientists who build these models often joke that they “melt GPUs.”
Training models
Nvidia A100 processor
Nvidia
Analysts and technologists estimate that the critical process of training a large language model such as OpenAI’s GPT-3 could cost more than $4 million. More advanced language models could cost over “the high-single-digit millions” to train, said Rowan Curran, a Forrester analyst who focuses on AI and machine learning.
Meta’s largest LLaMA model, for example, used 2,048 Nvidia A100 GPUs to train on 1.4 trillion tokens (750 words is about 1,000 tokens), taking about 21 days, the company said when it released the model last month.
It took about 1 million GPU hours to train. With dedicated prices from AWS, that would cost over $2.4 million. And at 65 billion parameters, it’s smaller than the current GPT models at OpenAI, such as GPT-3, which has 175 billion parameters.
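Those figures line up in a quick back-of-envelope sketch. The per-GPU-hour rate below is an assumption chosen to approximate AWS dedicated A100 pricing, not a published figure:

```python
# Rough training-cost estimate for Meta's largest LLaMA model,
# using the figures reported above.
gpus = 2_048   # Nvidia A100s used for training
days = 21      # reported training duration
rate = 2.35    # assumed $/GPU-hour, approximating AWS dedicated pricing

gpu_hours = gpus * days * 24   # roughly 1 million GPU-hours
cost = gpu_hours * rate        # roughly $2.4 million

print(f"{gpu_hours:,} GPU-hours, ~${cost:,.0f}")
```

At any plausible cloud rate in that ballpark, the total comes out in the low single-digit millions of dollars, consistent with the analyst estimates above.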
Clement Delangue, the CEO of AI startup Hugging Face, said the process of training the company’s Bloom large language model took more than two-and-a-half months and required access to a supercomputer that was “something like the equivalent of 500 GPUs.”
Organizations that build large language models must be cautious when they retrain the software, which helps improve its abilities, because it costs so much, he said.
“It’s important to realize that these models are not trained all the time, like every day,” Delangue said, noting that’s why some models, such as ChatGPT, don’t have knowledge of recent events. ChatGPT’s knowledge stops in 2021, he said.
“We are actually doing a training right now for the version two of Bloom and it’s gonna cost no more than $10 million to retrain,” Delangue said. “So that’s the kind of thing that we don’t want to do every week.”
Inference and who pays for it
Bing with Chat
Jordan Novet | CNBC
To use a trained machine learning model to make predictions or generate text, engineers use the model in a process called “inference,” which can be much more expensive than training because it might need to run millions of times for a popular product.
For a product as popular as ChatGPT — which investment firm UBS estimates to have reached 100 million monthly active users in January — Curran believes that it could have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.
Costs skyrocket when these tools are used billions of times a day. Financial analysts estimate Microsoft’s Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.
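Dividing one estimate by the other gives a sense of the per-user economics. Both inputs are the third-party estimates cited above, so the result is only as good as they are:

```python
# Per-user inference cost implied by Curran's ChatGPT estimate.
monthly_cost = 40_000_000    # Curran's estimated OpenAI bill for January
monthly_users = 100_000_000  # UBS's monthly-active-user estimate

cost_per_user = monthly_cost / monthly_users  # $0.40 per user per month
print(f"${cost_per_user:.2f} per user per month")
```

Even at 40 cents per user per month, a free product with nine-figure usage numbers implies a nine-figure annual compute bill.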
In the case of Latitude, for instance, while the startup didn’t have to pay to train the underlying OpenAI language model it was accessing, it had to account for the inferencing costs that were something akin to “half-a-cent per call” on “a couple million requests per day,” a Latitude spokesperson said.
“And I was being relatively conservative,” Curran said of his calculations.
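At the figures Latitude’s spokesperson quoted, the arithmetic is straightforward. This is a sketch using the rough numbers given above (“half-a-cent per call,” “a couple million requests per day”), not exact billing data:

```python
# Rough inference-cost estimate from the figures Latitude's spokesperson cited.
cost_per_call = 0.005      # "half-a-cent per call"
calls_per_day = 2_000_000  # "a couple million requests per day"

daily = cost_per_call * calls_per_day  # $10,000 a day
monthly = daily * 30                   # roughly $300,000 a month at that pace

print(f"${daily:,.0f}/day, ~${monthly:,.0f}/month")
```

That pace lands in the same six-figure-per-month territory as the peak bills Walton described.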
In order to sow the seeds of the current AI boom, venture capitalists and tech giants have been investing billions of dollars into startups that specialize in generative AI technologies. Microsoft, for instance, invested as much as $10 billion into OpenAI, the creator of GPT, according to media reports in January. Salesforce’s venture capital arm, Salesforce Ventures, recently debuted a $250 million fund that caters to generative AI startups.
As investor Semil Shah of the VC firms Haystack and Lightspeed Venture Partners described on Twitter, “VC dollars shifted from subsidizing your taxi ride and burrito delivery to LLMs and generative AI compute.”
Many entrepreneurs see risks in relying on potentially subsidized AI models that they don’t control and merely pay for on a per-use basis.
“When I talk to my AI friends at the startup conferences, this is what I tell them: Do not solely depend on OpenAI, ChatGPT or any other large language models,” said Suman Kanuganti, founder of personal.ai, a chatbot currently in beta mode. “Because businesses shift, they are all owned by big tech companies, right? If they cut access, you’re gone.”
Companies such as enterprise tech firm Conversica are exploring how they can use the tech through Microsoft’s Azure cloud service at its currently discounted price.
While Conversica CEO Jim Kaskade declined to comment about how much the startup is paying, he conceded that the subsidized cost is welcome as it explores how language models can be used effectively.
“If they were truly trying to break even, they’d be charging a hell of a lot more,” Kaskade said.
How it could change
It’s unclear if AI computation will stay expensive as the industry develops. Companies making the foundation models, semiconductor makers and startups all see business opportunities in reducing the price of running AI software.
Nvidia, which has about 95% of the market for AI chips, continues to develop more powerful versions designed specifically for machine learning, but improvements in total chip power across the industry have slowed in recent years.
Still, Nvidia CEO Jensen Huang believes that in 10 years, AI will be “a million times” more efficient because of improvements not only in chips, but also in software and other computer parts.
“Moore’s Law, in its best days, would have delivered 100x in a decade,” Huang said last month on an earnings call. “By coming up with new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists, AI researchers on new models, across that entire span, we’ve made large language model processing a million times faster.”
Some startups have focused on the high cost of AI as a business opportunity.
“Nobody was saying ‘You should build something that was purpose-built for inference.’ What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system to save money on inference by doing more processing in the computer’s memory, as opposed to on a GPU.
“People are using GPUs today, NVIDIA GPUs, to do most of their inference. They buy the DGX systems that NVIDIA sells that cost a ton of money. The problem with inference is if the workload spikes very rapidly, which is what happened to ChatGPT, it went to like a million users in five days. There is no way your GPU capacity can keep up with that because it was not built for that. It was built for training, for graphics acceleration,” he said.
Delangue, the Hugging Face CEO, believes more companies would be better served focusing on smaller, specific models that are cheaper to train and run, instead of the large language models that are garnering most of the attention.
Meanwhile, OpenAI announced last month that it’s lowering the cost for companies to access its GPT models. It now charges one-fifth of one cent for about 750 words of output.
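At that rate, the unit economics shift noticeably. A sketch of what the new pricing implies, using the 750-words-to-1,000-tokens conversion cited earlier; the one-million-word volume is purely illustrative:

```python
# What OpenAI's lowered pricing implies per unit of output.
price_per_750_words = 0.002  # one-fifth of one cent

# Illustrative: cost to generate one million words of output at that rate.
words = 1_000_000
cost = (words / 750) * price_per_750_words  # under $3

print(f"~${cost:.2f} for {words:,} words of output")
```

By that math, a million words of generated text costs less than a cup of coffee, a steep drop from the per-call prices Latitude was paying at its peak.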
OpenAI’s lower prices have caught the attention of AI Dungeon-maker Latitude.
“I think it’s fair to say that it’s definitely a huge change we’re excited to see happen in the industry and we’re constantly evaluating how we can deliver the best experience to users,” a Latitude spokesperson said. “Latitude is going to continue to evaluate all AI models to be sure we have the best game out there.”
Jeff Williams, chief operating officer of Apple Inc., during the Apple Worldwide Developers Conference (WWDC) at Apple Park campus in Cupertino, California, US, on Monday, June 9, 2025.
David Paul Morris | Bloomberg | Getty Images
Apple said on Tuesday that Chief Operating Officer Jeff Williams, a 27-year company veteran, will be retiring later this year.
Current operations leader Sabih Khan will take over much of the COO role later this month, Apple said in a press release. For his remaining time with the company, Williams will continue to head up Apple’s design team, Apple Watch, and health initiatives, reporting to CEO Tim Cook.
Williams becomes the latest longtime Apple executive to step down as key employees, who were active in the company’s hyper-growth years, reach retirement age. Williams, 62, previously headed Apple’s formidable operations division, which is in charge of manufacturing millions of complicated devices like iPhones, while keeping costs down.
He also led important teams inside Apple, including the company’s fabled industrial design team, after longtime leader Jony Ive retired in 2019. When Williams retires, Apple’s design team will report to CEO Tim Cook, Apple said.
“He’s helped to create one of the most respected global supply chains in the world; launched Apple Watch and overseen its development; architected Apple’s health strategy; and led our world class team of designers with great wisdom, heart, and dedication,” Cook said in the statement.
Williams said he plans to spend more time with friends and family.
“June marked my 27th anniversary with Apple, and my 40th in the industry,” Williams said in the release.
Williams is leaving Apple at a time when its famous supply chain is under significant pressure, as the U.S. imposes tariffs on many of the countries where Apple sources its devices, and White House officials publicly pressure Apple to move more production to the U.S.
Khan was added to Apple’s executive team in 2019, taking an executive vice president title. Apple said on Tuesday that he will lead supply chain, product quality, planning, procurement, and fulfillment at Apple.
The operations leader joined Apple’s procurement group in 1995, and before that worked as an engineer and technical leader at GE Plastics. He has a bachelor’s degree from Tufts University and a master’s degree in mechanical engineering from Rensselaer Polytechnic Institute in upstate New York.
Khan has worked closely with Cook. Once, during a meeting in which Cook said that a manufacturing problem was “really bad,” Khan stood up, drove to the airport and immediately booked a flight to China to fix it, according to an anecdote published in Fortune.
Elon Musk, chief executive officer of SpaceX and Tesla, attends the Viva Technology conference at the Porte de Versailles exhibition center in Paris, June 16, 2023.
Gonzalo Fuentes | Reuters
Tesla CEO Elon Musk told Wedbush Securities’ Dan Ives to “Shut up” on Tuesday after the analyst offered three recommendations to the electric vehicle company’s board in a post on X.
Ives has been one of the most bullish Tesla observers on Wall Street. With a $500 price target on the stock, he has the highest projection of any analyst tracked by FactSet.
But on Tuesday, Ives took to X with critical remarks about Musk’s political activity after the world’s richest person said over the weekend that he was creating a new political party called the America Party to challenge Republican candidates who voted for the spending bill that was backed by President Donald Trump.
Ives’ post followed a nearly 7% slide in Tesla’s stock Monday, which wiped out $68 billion in market cap. Ives called for Tesla’s board to create a new pay package for Musk that would get him 25% voting control and clear a path to merge with xAI, establish “guardrails” for how much time Musk has to spend at Tesla, and provide “oversight on political endeavors.”
Ives published a lengthier note with other analysts from his firm headlined, “The Tesla board MUST Act and Create Ground Rules For Musk; Soap Opera Must End.” The analysts said that Musk’s launching of a new political party created a “tipping point in the Tesla story,” necessitating action by the company’s board to rein in the CEO.
Still, Wedbush maintained its price target and its buy recommendation on the stock.
“Shut up, Dan,” Musk wrote in response on X, even though the first suggestion would hand the CEO the voting control he has long sought at Tesla.
In an email to CNBC, Ives wrote, “Elon has his opinion and I get it, but we stand by what the right course of action is for the Board.”
Musk’s historic 2018 CEO pay package, which had been worth around $56 billion and has since gone up in value, was voided last year by the Delaware Court of Chancery. Judge Kathaleen McCormick ruled that Tesla’s board members had lacked independence from Musk and failed to properly negotiate at arm’s length with the CEO.
Tesla has appealed that case to the Delaware state Supreme Court and is trying to determine what Musk’s next pay package should entail.
Ives isn’t the only Tesla bull to criticize Musk’s continued political activism.
Analysts at William Blair downgraded the stock to the equivalent of a hold from a buy on Monday, because of Musk’s political plans and rhetoric as well as the negative impacts that the spending bill passed by Congress could have on Tesla’s margins and EV sales.
“We expect that investors are growing tired of the distraction at a point when the business needs Musk’s attention the most and only see downside from his dip back into politics,” the analysts wrote. “We would prefer this effort to be channeled towards the robotaxi rollout at this critical juncture.”
Trump supporter James Fishback, CEO of hedge fund Azoria Partners, said Saturday that his firm postponed the listing of an exchange-traded fund, the Azoria Tesla Convexity ETF, that would invest in the EV company’s shares and options. He began his post on X saying, “Elon has gone too far.”
“I encourage the Board to meet immediately and ask Elon to clarify his political ambitions and evaluate whether they are compatible with his full-time obligations to Tesla as CEO,” Fishback wrote.
Musk said Saturday that he has formed the America Party, which he claimed will give Americans “back your freedom.” He hasn’t shared formal details, including where the party may be registered, how much funding he will provide for it and which candidates he will back.
Tesla’s stock is now down about 25% this year, badly underperforming U.S. indexes and by far the worst performance among tech’s megacaps.
Musk spent much of the first half of the year working with the Trump administration and leading an effort to massively downsize the federal government. His official work with the administration wrapped up at the end of May, and his exit preceded a public spat between Musk and Trump over the spending bill and other matters.
Musk, Tesla’s board chair Robyn Denholm and investor relations representative Travis Axelrod didn’t immediately respond to requests for comment.
Waymo announced it is now offering teen accounts for its self-driving car service Waymo One, beginning in Phoenix, Arizona.
Courtesy of Waymo
Waymo announced Tuesday that it is offering accounts for teens ages 14 to 17, starting in Phoenix.
The Alphabet-owned company said that, beginning Tuesday, parents in Phoenix can use their Waymo accounts “to invite their teen into the program, pairing them together.” Once their account is activated, teens can hail fully autonomous rides.
Previously, users were required to be at least 18 years old to sign up for a Waymo account, but the age range expansion comes as the company seeks to increase ridership amid a broader expansion of its ride-hailing service across U.S. cities. Alphabet has also been under pressure to monetize AI products amid increased competition and economic headwinds.
Waymo said it will offer “specially-trained Rider Support agents” during rides hailed by teens and loop in parents if needed. Teens can also share their trip status with their parents for real-time updates on their progress, and parents receive all ride receipts.
Teen accounts are initially available only in the metro Phoenix area, but will expand to more markets outside California where the Waymo app is available, a spokesperson said.
Waymo’s expansion to teens follows a similar move by Uber, which launched teen accounts in 2023. Waymo, which has partnerships with Uber in multiple markets, said it “may consider enabling access for teens through our network partners in the future.”
Already, Waymo provides more than 250,000 paid trips each week across Phoenix, the San Francisco Bay Area, Los Angeles, Atlanta, and Austin, Texas, and the company is preparing to bring autonomous rides to Miami and Washington, D.C., in 2026.
In June, Waymo announced that it plans to manually drive vehicles in New York, marking the first step toward potentially cracking the largest U.S. city. Waymo said it applied for a permit with the New York City Department of Transportation to operate autonomously with a trained specialist behind the wheel in Manhattan.