
OpenAI CEO Sam Altman speaks during a keynote address announcing ChatGPT integration for Bing at Microsoft in Redmond, Washington, on February 7, 2023.

Jason Redmond | AFP | Getty Images

Before OpenAI’s ChatGPT emerged and captured the world’s attention for its ability to create compelling sentences, a small startup called Latitude was wowing consumers with its AI Dungeon game that let them use artificial intelligence to create fantastical tales based on their prompts.

But as AI Dungeon became more popular, Latitude CEO Nick Walton recalled that the cost to maintain the text-based role-playing game began to skyrocket. AI Dungeon’s text-generation software was powered by the GPT language technology offered by the Microsoft-backed AI research lab OpenAI. The more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.

Compounding the predicament was that Walton also discovered content marketers were using AI Dungeon to generate promotional copy, a use for AI Dungeon that his team never foresaw, but that ended up adding to the company’s AI bill.

At its peak in 2021, Walton estimates Latitude was spending nearly $200,000 a month on OpenAI’s so-called generative AI software and Amazon Web Services in order to keep up with the millions of user queries it needed to process each day.

“We joked that we had human employees and we had AI employees, and we spent about as much on each of them,” Walton said. “We spent hundreds of thousands of dollars a month on AI and we are not a big startup, so it was a very massive cost.”

By the end of 2021, Latitude switched from using OpenAI’s GPT software to a cheaper but still capable language software offered by startup AI21 Labs, Walton said, adding that the startup also incorporated open source and free language models into its service to lower the cost. Latitude’s generative AI bills have dropped to under $100,000 a month, Walton said, and the startup charges players a monthly subscription for more advanced AI features to help reduce the cost.

Latitude’s pricey AI bills underscore an unpleasant truth behind the recent boom in generative AI technologies: The cost to develop and maintain the software can be extraordinarily high, both for the firms that develop the underlying technologies, generally referred to as a large language or foundation models, and those that use the AI to power their own software.

The high cost of machine learning is an uncomfortable reality in the industry as venture capitalists eye companies that could potentially be worth trillions, and big companies such as Microsoft, Meta, and Google use their considerable capital to build a lead in the technology that smaller challengers can’t match.

But if the margin for AI applications is permanently smaller than previous software-as-a-service margins, because of the high cost of computing, it could put a damper on the current boom. 

The high cost of training and “inference” — actually running — large language models is a structural cost that differs from previous computing booms. Even when the software is built, or trained, it still requires a huge amount of computing power to run large language models because they do billions of calculations every time they return a response to a prompt. By comparison, serving web apps or pages requires much less calculation.

These calculations also require specialized hardware. While traditional computer processors can run machine learning models, they’re slow. Most training and inference now takes place on graphics processors, or GPUs, which were initially intended for 3D gaming, but have become the standard for AI applications because they can do many simple calculations simultaneously. 

Nvidia makes most of the GPUs for the AI industry, and its primary data center workhorse chip costs $10,000. Scientists who build these models often joke that they “melt GPUs.”

Training models

Nvidia A100 processor

Nvidia

Analysts and technologists estimate that the critical process of training a large language model such as OpenAI’s GPT-3 could cost more than $4 million. More advanced language models could cost over “the high-single-digit millions” to train, said Rowan Curran, a Forrester analyst who focuses on AI and machine learning.

Meta’s largest LLaMA model, for example, used 2,048 Nvidia A100 GPUs to train on 1.4 trillion tokens (750 words is about 1,000 tokens) and took about 21 days, the company said when it released the model last month.

It took about 1 million GPU hours to train. With dedicated prices from AWS, that would cost over $2.4 million. And at 65 billion parameters, it’s smaller than OpenAI’s current GPT models, such as GPT-3, which has 175 billion parameters.
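Those figures can be sanity-checked with simple arithmetic. A sketch: the ~$2.40-per-A100-hour rate below is an assumption inferred from the totals above, not a published AWS list price.

```python
# Back-of-the-envelope check on the LLaMA training figures above.
gpus = 2048          # Nvidia A100 GPUs used for training
days = 21            # reported training duration
gpu_hours = gpus * days * 24
print(f"GPU hours: {gpu_hours:,}")  # 1,032,192 -- about 1 million

# Assumed rate of ~$2.40 per A100-hour (hypothetical; inferred from
# the reported total, not a published AWS price).
rate = 2.40
print(f"Estimated cost: ${gpu_hours * rate:,.0f}")  # roughly $2.5 million
```

At that assumed rate, the total lands just over $2.4 million, consistent with the analyst estimate above.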

Clement Delangue, the CEO of AI startup Hugging Face, said the process of training the company’s Bloom large language model took more than two-and-a-half months and required access to a supercomputer that was “something like the equivalent of 500 GPUs.”

Organizations that build large language models must be cautious when they retrain the software, which helps improve its abilities, because it costs so much, he said.

“It’s important to realize that these models are not trained all the time, like every day,” Delangue said, noting that’s why some models, such as ChatGPT, don’t have knowledge of recent events. ChatGPT’s knowledge stops in 2021, he said.

“We are actually doing a training right now for the version two of Bloom and it’s gonna cost no more than $10 million to retrain,” Delangue said. “So that’s the kind of thing that we don’t want to do every week.”

Inference and who pays for it

Bing with Chat

Jordan Novet | CNBC

To use a trained machine learning model to make predictions or generate text, engineers use the model in a process called “inference,” which can be much more expensive than training because it might need to run millions of times for a popular product.

For a product as popular as ChatGPT — which investment firm UBS estimates to have reached 100 million monthly active users in January — Curran believes that it could have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.

Costs skyrocket when these tools are used billions of times a day. Financial analysts estimate Microsoft’s Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.

In the case of Latitude, for instance, while the startup didn’t have to pay to train the underlying OpenAI language model it was accessing, it had to account for the inferencing costs that were something akin to “half-a-cent per call” on “a couple million requests per day,” a Latitude spokesperson said.
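The quoted figures imply a rough monthly bill. A sketch: both the per-call price and the request volume are approximations taken from the quote above, so this is only a ballpark.

```python
# Rough monthly inference bill at the rates Latitude describes:
# "half-a-cent per call" on "a couple million requests per day".
cost_per_call = 0.005      # dollars per API call
calls_per_day = 2_000_000  # "a couple million requests per day"
daily = cost_per_call * calls_per_day
monthly = daily * 30
print(f"~${daily:,.0f} per day, ~${monthly:,.0f} per month")
```

That puts inference alone in the hundreds of thousands of dollars per month, the same order of magnitude as the peak bills Walton describes.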

“And I was being relatively conservative,” Curran said of his calculations.

In order to sow the seeds of the current AI boom, venture capitalists and tech giants have been investing billions of dollars into startups that specialize in generative AI technologies. Microsoft, for instance, invested as much as $10 billion into OpenAI, the creator of GPT, according to media reports in January. Salesforce’s venture capital arm, Salesforce Ventures, recently debuted a $250 million fund that caters to generative AI startups.

As investor Semil Shah of the VC firms Haystack and Lightspeed Venture Partners described on Twitter, “VC dollars shifted from subsidizing your taxi ride and burrito delivery to LLMs and generative AI compute.”

Many entrepreneurs see risks in relying on potentially subsidized AI models that they don’t control and merely pay for on a per-use basis.

“When I talk to my AI friends at the startup conferences, this is what I tell them: Do not solely depend on OpenAI, ChatGPT or any other large language models,” said Suman Kanuganti, founder of personal.ai, a chatbot currently in beta. “Because businesses shift, they are all owned by big tech companies, right? If they cut access, you’re gone.”

Companies such as enterprise tech firm Conversica are exploring how they can use the tech through Microsoft’s Azure cloud service at its currently discounted price.

While Conversica CEO Jim Kaskade declined to comment about how much the startup is paying, he conceded that the subsidized cost is welcome as it explores how language models can be used effectively.

“If they were truly trying to break even, they’d be charging a hell of a lot more,” Kaskade said.

How it could change

It’s unclear if AI computation will stay expensive as the industry develops. Companies making the foundation models, semiconductor makers and startups all see business opportunities in reducing the price of running AI software.

Nvidia, which has about 95% of the market for AI chips, continues to develop more powerful versions designed specifically for machine learning, but improvements in total chip power across the industry have slowed in recent years.

Still, Nvidia CEO Jensen Huang believes that in 10 years, AI will be “a million times” more efficient because of improvements not only in chips, but also in software and other computer parts.

“Moore’s Law, in its best days, would have delivered 100x in a decade,” Huang said last month on an earnings call. “By coming up with new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists, AI researchers on new models, across that entire span, we’ve made large language model processing a million times faster.”

Some startups have focused on the high cost of AI as a business opportunity.

“Nobody was saying ‘You should build something that was purpose-built for inference.’ What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system to save money on inference by doing more processing in the computer’s memory, as opposed to on a GPU.

“People are using GPUs today, NVIDIA GPUs, to do most of their inference. They buy the DGX systems that NVIDIA sells that cost a ton of money. The problem with inference is if the workload spikes very rapidly, which is what happened to ChatGPT, it went to like a million users in five days. There is no way your GPU capacity can keep up with that because it was not built for that. It was built for training, for graphics acceleration,” he said.

Delangue, the Hugging Face CEO, believes more companies would be better served focusing on smaller, specific models that are cheaper to train and run, instead of the large language models that are garnering most of the attention.

Meanwhile, OpenAI announced last month that it’s lowering the cost for companies to access its GPT models. It now charges one-fifth of one cent for about 750 words of output.
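That price works out to $0.002 per 1,000 tokens. A quick sketch of what it implies, using the rough 1,000-tokens-per-750-words ratio mentioned earlier in the article:

```python
# OpenAI's quoted price: one-fifth of one cent for ~750 words of
# output, i.e. $0.002 per 1,000 tokens.
PRICE_PER_1K_TOKENS = 0.002  # dollars

def output_cost(words: int) -> float:
    """Approximate dollar cost for `words` of generated text."""
    tokens = words * (1000 / 750)   # rough tokens-per-word ratio
    return tokens / 1000 * PRICE_PER_1K_TOKENS

print(f"${output_cost(750):.4f}")      # $0.0020 for 750 words
print(f"${output_cost(750_000):.2f}")  # $2.00 for 750,000 words
```

At those rates, generating the equivalent of a full-length novel costs on the order of a quarter.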

OpenAI’s lower prices have caught the attention of AI Dungeon-maker Latitude.

“I think it’s fair to say that it’s definitely a huge change we’re excited to see happen in the industry and we’re constantly evaluating how we can deliver the best experience to users,” a Latitude spokesperson said. “Latitude is going to continue to evaluate all AI models to be sure we have the best game out there.”

Watch: AI’s “iPhone Moment” – Separating ChatGPT Hype and Reality

Microsoft AI chief Suleyman sees advantage in building models ‘3 or 6 months behind’

Microsoft owns lots of Nvidia graphics processing units, but it isn’t using them to develop state-of-the-art artificial intelligence models.

There are good reasons for that position, Mustafa Suleyman, the company’s CEO of AI, told CNBC’s Steve Kovach in an interview on Friday. Waiting to build models that are “three or six months behind” offers several advantages, including lower costs and the ability to concentrate on specific use cases, Suleyman said.

It’s “cheaper to give a specific answer once you’ve waited for the first three or six months for the frontier to go first. We call that off-frontier,” he said. “That’s actually our strategy, is to really play a very tight second, given the capital-intensiveness of these models.”

Suleyman made a name for himself as a co-founder of DeepMind, the AI lab that Google bought in 2014, reportedly for $400 million to $650 million. Suleyman arrived at Microsoft last year alongside other employees of the startup Inflection, where he had been CEO.

More than ever, Microsoft counts on relationships with other companies to grow.

It gets AI models from San Francisco startup OpenAI and supplemental computing power from newly public CoreWeave in New Jersey. Microsoft has repeatedly enriched Bing, Windows and other products with OpenAI’s latest systems for writing human-like language and generating images.

Microsoft’s Copilot will gain “memory” to retain key facts about people who repeatedly use the assistant, Suleyman said Friday at an event in Microsoft’s Redmond, Washington, headquarters to commemorate the company’s 50th birthday. That feature came first to OpenAI’s ChatGPT, which has 500 million weekly users.

Through ChatGPT, people can access top-flight large language models such as the o1 reasoning model that takes time before spitting out an answer. OpenAI introduced that capability in September — only weeks later did Microsoft bring a similar capability called Think Deeper to Copilot.

Microsoft occasionally releases open-source small-language models that can run on PCs. They don’t require powerful server GPUs, making them different from OpenAI’s o1.

OpenAI and Microsoft have had a tight relationship since shortly after the startup launched its ChatGPT chatbot in late 2022, effectively kicking off the generative AI race. In total, Microsoft has invested $13.75 billion in the startup, but more recently, fissures in the relationship between the two companies have begun to show.

Microsoft added OpenAI to its list of competitors in July 2024, and OpenAI in January announced that it was working with rival cloud provider Oracle on the $500 billion Stargate project. That came after years of OpenAI exclusively relying on Microsoft’s Azure cloud. Despite OpenAI partnering with Oracle, Microsoft in a blog post announced that the startup had “recently made a new, large Azure commitment.”

“Look, it’s absolutely mission-critical that long-term, we are able to do AI self-sufficiently at Microsoft,” Suleyman said. “At the same time, I think about these things over five and 10 year periods. You know, until 2030 at least, we are deeply partnered with OpenAI, who have [had an] enormously successful relationship for us.”

Microsoft is focused on building its own AI internally, but the company is not pushing itself to build the most cutting-edge models, Suleyman said.

“We have an incredibly strong AI team, huge amounts of compute, and it’s very important to us that, you know, maybe we don’t develop the absolute frontier, the best model in the world first,” he said. “That’s very, very expensive to do and unnecessary to cause that duplication.”

WATCH: Microsoft Copilot beginning of a seismic shift in AI integration, says Microsoft AI CEO Suleyman

Former Microsoft CEO Steve Ballmer says, as shareholder, tariffs are ‘not good’

President Trump’s new tariffs on goods that the U.S. imports from over 100 countries will have an effect on consumers, former Microsoft CEO Steve Ballmer told CNBC on Friday. Investors will feel the pain, too.

Microsoft’s stock dropped almost 6% in the past two days, as the Nasdaq wrapped up its worst week in five years.

“As a Microsoft shareholder, this kind of thing is not good,” Ballmer said, in an interview with Andrew Ross Sorkin that was tied to Microsoft’s 50th anniversary celebration. “It creates opportunity to be a serious, long-term player.”

Ballmer was sandwiched in between Microsoft co-founder Bill Gates and current CEO Satya Nadella for the interview.

“I took just enough economics in college — that tariffs are actually going to bring some turmoil,” said Ballmer, who was succeeded by Nadella in 2014. Gates, Microsoft’s first CEO, convinced Ballmer to join the company in 1980.

Gates, Ballmer and Nadella attended proceedings at Microsoft’s Redmond, Washington, campus on Friday to celebrate its first half-century.

Between the tariffs and weak quarterly revenue guidance announced in January, Microsoft’s stock is on track for its fifth straight month of declines, which would be the worst stretch since 2009. But the company remains a leader in the PC operating system and productivity software markets, and its partnership with startup OpenAI has led to gains in cloud computing.

“I think that disruption is very hard on people, and so the decision to do something for which disruption was inevitable, that needs a lot of popular support, and nobody could game theorize exactly who is going to do what in response,” Ballmer said, regarding the tariffs. “So, I think citizens really like stability a lot. And I hope people — individuals who will feel this, because people are feeling it, not just the stock market, people are going to feel it.”

Ballmer, who owns the Los Angeles Clippers, is among Microsoft’s biggest fans. He said he’s the company’s largest investor. In 2014, shortly after he bought the basketball team for $2 billion, he held over 333 million shares of the stock, according to a regulatory filing.

“I’m not going to probably have 50 more years on the planet,” he said. “But whatever minutes I have, I’m gonna be a large Microsoft shareholder.” He said there’s a bright future for computing, storage and intelligence. Microsoft launched the first Azure services while Ballmer was CEO.

Earlier this week Bloomberg reported that Microsoft, which pledged to spend $80 billion on AI-enabled data center infrastructure in the current fiscal year, has stopped discussions or pushed back the opening of facilities in the U.S. and abroad.

JPMorgan Chase’s chief economist, Bruce Kasman, said in a Thursday note that the chance of a global recession will be 60% if Trump’s tariffs kick in as described. His previous estimate was 40%.

“Fifty years from now, or 25 years from now, what is the one thing you can be guaranteed of, is the world needs more compute,” Nadella said. “So I want to keep those two thoughts and then take one step at a time, and then whatever are the geopolitical or economic shifts, we’ll adjust to it.”

Gates, who, along with co-founder Paul Allen, sought to build a software company rather than sell both software and hardware, said he wasn’t sure what the economic effects of the tariffs will be. Today, most of Microsoft’s revenue comes from software. It also sells Surface PCs and Xbox consoles.

“So far, it’s just on goods, but you know, will it eventually be on services? Who knows?” said Gates, who reportedly donated around $50 million to a nonprofit that supported Democratic nominee Kamala Harris’ losing campaign.

— CNBC’s Alex Harring contributed to this report.

WATCH: There will be many LLM winners, says infrastructure investor Morrison

AppLovin can offer TikTok ‘much stronger bid than others,’ CEO says

Piotr Swat | Lightrocket | Getty Images

AppLovin CEO Adam Foroughi provided more clarity on the ad-tech company’s late-stage effort to acquire TikTok, calling his offer a “much stronger bid than others” on CNBC’s The Exchange Friday afternoon.

Foroughi said the company is proposing a merger between AppLovin and the entire global business of TikTok, characterizing the deal as a “partnership” where the Chinese could participate in the upside while AppLovin would run the app.

“If you pair our algorithm with the TikTok audience, the expansion on that platform for dollars spent will be through the roof,” Foroughi said.

The news comes as President Trump announced he would extend the deadline a second time for TikTok’s Chinese-owned parent company ByteDance to sell the U.S. subsidiary of TikTok to an American buyer or face an effective ban on U.S. app stores. The new deadline is now in June, which, as Foroughi described, “buys more time to put the pieces together” on AppLovin’s bid. 

“The president’s a great dealmaker — we’re proposing, essentially an enhancement to the deal that they’ve been working on, but a bigger version of all the deals contemplated,” he added.

AppLovin faces a crowded field of other interested U.S. backers, including Amazon, Oracle, billionaire Frank McCourt and his Project Liberty consortium, and numerous private equity firms. Some proposals reportedly structure the deal to give a U.S. buyer 50% ownership of the company, rather than a complete acquisition. The Chinese government will still need to approve the deal, and AppLovin’s interest in purchasing TikTok in “all markets outside of China” is “preliminary,” according to an April 3 SEC filing.

Correction: A prior version of this story incorrectly characterized China’s ongoing role in TikTok should AppLovin acquire the app.

WATCH: AppLovin CEO Adam Foroughi on its bid to buy TikTok
