OpenAI CEO Sam Altman speaks during a keynote address announcing ChatGPT integration for Bing at Microsoft in Redmond, Washington, on February 7, 2023.

Jason Redmond | AFP | Getty Images

Before OpenAI’s ChatGPT emerged and captured the world’s attention for its ability to create compelling sentences, a small startup called Latitude was wowing consumers with its AI Dungeon game that let them use artificial intelligence to create fantastical tales based on their prompts.

But as AI Dungeon became more popular, Latitude CEO Nick Walton recalled that the cost to maintain the text-based role-playing game began to skyrocket. AI Dungeon’s text-generation software was powered by the GPT language technology offered by the Microsoft-backed AI research lab OpenAI. The more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.

Compounding the predicament was that Walton also discovered content marketers were using AI Dungeon to generate promotional copy, a use for AI Dungeon that his team never foresaw, but that ended up adding to the company’s AI bill.

At its peak in 2021, Walton estimates Latitude was spending nearly $200,000 a month on OpenAI’s so-called generative AI software and Amazon Web Services in order to keep up with the millions of user queries it needed to process each day.

“We joked that we had human employees and we had AI employees, and we spent about as much on each of them,” Walton said. “We spent hundreds of thousands of dollars a month on AI and we are not a big startup, so it was a very massive cost.”

By the end of 2021, Latitude switched from using OpenAI’s GPT software to a cheaper but still capable language software offered by startup AI21 Labs, Walton said, adding that the startup also incorporated open source and free language models into its service to lower the cost. Latitude’s generative AI bills have dropped to under $100,000 a month, Walton said, and the startup charges players a monthly subscription for more advanced AI features to help reduce the cost.

Latitude’s pricey AI bills underscore an unpleasant truth behind the recent boom in generative AI technologies: The cost to develop and maintain the software can be extraordinarily high, both for the firms that develop the underlying technologies, generally referred to as large language models or foundation models, and those that use the AI to power their own software.

The high cost of machine learning is an uncomfortable reality in the industry as venture capitalists eye companies that could potentially be worth trillions, and big companies such as Microsoft, Meta, and Google use their considerable capital to build a lead in the technology that smaller challengers can’t match.

But if the margin for AI applications is permanently smaller than previous software-as-a-service margins, because of the high cost of computing, it could put a damper on the current boom. 

The high cost of training and “inference” — actually running — large language models is a structural cost that differs from previous computing booms. Even when the software is built, or trained, it still requires a huge amount of computing power to run large language models because they do billions of calculations every time they return a response to a prompt. By comparison, serving web apps or pages requires much less calculation.

These calculations also require specialized hardware. While traditional computer processors can run machine learning models, they’re slow. Most training and inference now takes place on graphics processors, or GPUs, which were initially intended for 3D gaming, but have become the standard for AI applications because they can do many simple calculations simultaneously. 

Nvidia makes most of the GPUs for the AI industry, and its primary data center workhorse chip costs $10,000. Scientists who build these models often joke that they “melt GPUs.”

Training models

Nvidia A100 processor

Nvidia

Analysts and technologists estimate that the critical process of training a large language model such as OpenAI’s GPT-3 could cost more than $4 million. More advanced language models could cost over “the high-single-digit millions” to train, said Rowan Curran, a Forrester analyst who focuses on AI and machine learning.

Meta’s largest LLaMA model, for example, used 2,048 Nvidia A100 GPUs to train on 1.4 trillion tokens (750 words is about 1,000 tokens), taking about 21 days, the company said when it released the model last month.

It took about 1 million GPU hours to train. With dedicated prices from AWS, that would cost over $2.4 million. And at 65 billion parameters, it’s smaller than OpenAI’s current GPT models, such as GPT-3, which has 175 billion parameters.
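The arithmetic behind those figures is easy to check. Here is a back-of-envelope sketch; the per-GPU-hour rate is an assumption implied by the article’s numbers (about 1 million GPU hours costing over $2.4 million), not an official AWS price:

```python
# Back-of-envelope check of the cited LLaMA training figures.
gpus = 2048      # Nvidia A100 GPUs used for training
days = 21        # approximate training duration
rate_usd = 2.40  # assumed per-GPU-hour price (implied by the article, not official)

gpu_hours = gpus * days * 24   # roughly 1 million GPU-hours
cost = gpu_hours * rate_usd    # roughly $2.4 million
print(f"{gpu_hours:,} GPU-hours -> ${cost:,.0f}")
```

That works out to about 1.03 million GPU-hours and a little under $2.5 million, consistent with the “over $2.4 million” estimate.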

Clement Delangue, the CEO of AI startup Hugging Face, said the process of training the company’s Bloom large language model took more than two-and-a-half months and required access to a supercomputer that was “something like the equivalent of 500 GPUs.”

Organizations that build large language models must be cautious when they retrain the software, which helps improve its abilities, because it costs so much, he said.

“It’s important to realize that these models are not trained all the time, like every day,” Delangue said, noting that’s why some models, such as ChatGPT, don’t have knowledge of recent events. ChatGPT’s knowledge stops in 2021, he said.

“We are actually doing a training right now for the version two of Bloom and it’s gonna cost no more than $10 million to retrain,” Delangue said. “So that’s the kind of thing that we don’t want to do every week.”

Inference and who pays for it

Bing with Chat

Jordan Novet | CNBC

To use a trained machine learning model to make predictions or generate text, engineers use the model in a process called “inference,” which can be much more expensive than training because it might need to run millions of times for a popular product.

For a product as popular as ChatGPT — which investment firm UBS estimates reached 100 million monthly active users in January — Curran believes that it could have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.

Costs skyrocket when these tools are used billions of times a day. Financial analysts estimate Microsoft’s Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.

In the case of Latitude, for instance, while the startup didn’t have to pay to train the underlying OpenAI language model it was accessing, it had to account for the inferencing costs that were something akin to “half-a-cent per call” on “a couple million requests per day,” a Latitude spokesperson said.
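Taken at face value, those rough per-call figures imply a sizable daily bill. A minimal sketch, using the article’s approximations (actual bills would vary with traffic and pricing):

```python
# Illustrative inference-cost arithmetic from Latitude's rough figures.
cost_per_call = 0.005      # "half-a-cent per call", USD
calls_per_day = 2_000_000  # "a couple million requests per day"

daily_cost = cost_per_call * calls_per_day
print(f"${daily_cost:,.0f} per day, ${daily_cost * 30:,.0f} per 30-day month")
```

That is about $10,000 a day in inference costs alone, in the same ballpark as the monthly bills Walton describes.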

“And I was being relatively conservative,” Curran said of his calculations.

In order to sow the seeds of the current AI boom, venture capitalists and tech giants have been investing billions of dollars into startups that specialize in generative AI technologies. Microsoft, for instance, invested as much as $10 billion into OpenAI, the creator of GPT, according to media reports in January. Salesforce‘s venture capital arm, Salesforce Ventures, recently debuted a $250 million fund that caters to generative AI startups.

As investor Semil Shah of the VC firms Haystack and Lightspeed Venture Partners described on Twitter, “VC dollars shifted from subsidizing your taxi ride and burrito delivery to LLMs and generative AI compute.”

Many entrepreneurs see risks in relying on potentially subsidized AI models that they don’t control and merely pay for on a per-use basis.

“When I talk to my AI friends at the startup conferences, this is what I tell them: Do not solely depend on OpenAI, ChatGPT or any other large language models,” said Suman Kanuganti, founder of personal.ai, a chatbot currently in beta mode. “Because businesses shift, they are all owned by big tech companies, right? If they cut access, you’re gone.”

Companies such as enterprise tech firm Conversica are exploring how they can use the tech through Microsoft’s Azure cloud service at its currently discounted price.

While Conversica CEO Jim Kaskade declined to comment about how much the startup is paying, he conceded that the subsidized cost is welcome as it explores how language models can be used effectively.

“If they were truly trying to break even, they’d be charging a hell of a lot more,” Kaskade said.

How it could change

It’s unclear if AI computation will stay expensive as the industry develops. Companies making the foundation models, semiconductor makers and startups all see business opportunities in reducing the price of running AI software.

Nvidia, which has about 95% of the market for AI chips, continues to develop more powerful versions designed specifically for machine learning, but improvements in total chip power across the industry have slowed in recent years.

Still, Nvidia CEO Jensen Huang believes that in 10 years, AI will be “a million times” more efficient because of improvements not only in chips, but also in software and other computer parts.

“Moore’s Law, in its best days, would have delivered 100x in a decade,” Huang said last month on an earnings call. “By coming up with new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists, AI researchers on new models, across that entire span, we’ve made large language model processing a million times faster.”

Some startups have focused on the high cost of AI as a business opportunity.

“Nobody was saying ‘You should build something that was purpose-built for inference.’ What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system to save money on inference by doing more processing in the computer’s memory, as opposed to on a GPU.

“People are using GPUs today, NVIDIA GPUs, to do most of their inference. They buy the DGX systems that NVIDIA sells that cost a ton of money. The problem with inference is if the workload spikes very rapidly, which is what happened to ChatGPT, it went to like a million users in five days. There is no way your GPU capacity can keep up with that because it was not built for that. It was built for training, for graphics acceleration,” he said.

Delangue, the Hugging Face CEO, believes more companies would be better served focusing on smaller, specific models that are cheaper to train and run, instead of the large language models that are garnering most of the attention.

Meanwhile, OpenAI announced last month that it’s lowering the cost for companies to access its GPT models. It now charges one-fifth of one cent for about 750 words of output.
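At that rate, per-request costs can be estimated from word counts. A hypothetical estimator, assuming the article’s ratio of about 750 words per 1,000 tokens (real token counts vary by text):

```python
PRICE_PER_1K_TOKENS = 0.002  # one-fifth of one cent, USD

def estimated_cost(words: int) -> float:
    """Rough cost of generating `words` of output at the cited rate."""
    tokens = words * 1000 / 750  # roughly 1.33 tokens per word
    return tokens / 1000 * PRICE_PER_1K_TOKENS

print(f"${estimated_cost(750):.4f} per 750-word response")
print(f"${estimated_cost(750) * 1_000_000:,.0f} per million such responses")
```

A single 750-word response costs a fifth of a cent; a million of them cost about $2,000, which is why per-call pricing matters so much at consumer scale.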

OpenAI’s lower prices have caught the attention of AI Dungeon-maker Latitude.

“I think it’s fair to say that it’s definitely a huge change we’re excited to see happen in the industry and we’re constantly evaluating how we can deliver the best experience to users,” a Latitude spokesperson said. “Latitude is going to continue to evaluate all AI models to be sure we have the best game out there.”

Watch: AI’s “iPhone Moment” – Separating ChatGPT Hype and Reality

How TikTok’s rise sparked a short-form video race

TikTok’s grip on the short-form video market is tightening, and the world’s biggest tech platforms are racing to catch up.

Since launching globally in 2016, ByteDance-owned TikTok has amassed over 1.12 billion monthly active users worldwide, according to Backlinko. American users spend an average of 108 minutes per day on the app, according to Apptopia.

TikTok’s success has reshaped the social media landscape, forcing competitors like Meta and Google to pivot their strategies around short-form video. But so far, experts say that none have matched TikTok’s algorithmic precision.

“It is the center of the internet for young people,” said Jasmine Enberg, vice president and principal analyst at Emarketer. “It’s where they go for entertainment, news, trends, even shopping. TikTok sets the tone for everyone else.”

Platforms like Meta‘s Instagram Reels and Google’s YouTube Shorts have expanded aggressively, launching new features, creator tools and even considering separate apps just to compete. Microsoft-owned LinkedIn, traditionally a professional networking site, is the latest to experiment with TikTok-style feeds. But with TikTok continuing to evolve, adding features like e-commerce integrations and longer videos, the question remains whether rivals can keep up.

“I’m scrolling every single day. I doom scroll all the time,” said TikTok content creator Alyssa McKay.

But there may be a dark side to this growth.

As short-form content consumption soars, experts warn about shrinking attention spans and rising mental-health concerns, particularly among younger users. Researchers like Dr. Yann Poncin, associate professor at the Child Study Center at Yale University, point to disrupted sleep patterns and increased anxiety levels tied to endless scrolling habits.

“Infinite scrolling and short-form video are designed to capture your attention in short bursts,” Dr. Poncin said. “In the past, entertainment was about taking you on a journey through a show or story. Now, it’s about locking you in for just a few seconds, just enough to feed you the next thing the algorithm knows you’ll like.”

Despite sky-high engagement, monetizing short videos remains an uphill battle. Unlike long-form YouTube content, where ads can be inserted throughout, short clips offer limited space for advertisers. Creators, too, are feeling the squeeze.

“It’s never been easier to go viral,” said Enberg. “But it’s never been harder to turn that virality into a sustainable business.”

Last year, TikTok generated an estimated $23.6 billion in ad revenues, according to Oberlo, but even with this growth, many creators still make just a few dollars per million views. YouTube Shorts pays roughly four cents per 1,000 views, which is less than its long-form counterpart. Meanwhile, Instagram has leaned into brand partnerships and emerging tools like “Trial Reels,” which allow creators to experiment with content by initially sharing videos only with non-followers, giving them a low-risk way to test new formats or ideas before deciding whether to share with their full audience. But Meta told CNBC that monetizing Reels remains a work in progress.
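The payout gap described above is stark when written out. Quick arithmetic using the cited Shorts rate (the article’s rough figure, not an official rate card):

```python
# Creator payout at the cited YouTube Shorts rate.
shorts_rate_per_view = 0.04 / 1000  # roughly 4 cents per 1,000 views
views = 1_000_000

print(f"YouTube Shorts: ~${shorts_rate_per_view * views:.0f} per million views")
```

A million Shorts views pays roughly $40, which illustrates why creators say virality alone is hard to turn into a business.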

While lawmakers scrutinize TikTok’s Chinese ownership and explore potential bans, competitors see a window of opportunity. Meta and YouTube are poised to capture up to 50% of reallocated ad dollars if TikTok faces restrictions in the U.S., according to eMarketer.

Watch the video to understand how TikTok’s rise sparked a short-form video race.

Elon Musk’s xAI Holdings in talks to raise $20 billion, Bloomberg News reports

The X logo appears on a phone, and the xAI logo is displayed on a laptop in Krakow, Poland, on April 1, 2025. (Photo by Klaudia Radecka/NurPhoto via Getty Images)

Nurphoto | Nurphoto | Getty Images

Elon Musk‘s xAI Holdings is in discussions with investors to raise about $20 billion, Bloomberg News reported Friday, citing people familiar with the matter.

The funding would value the company at over $120 billion, according to the report.

Musk was looking to assign “proper value” to xAI, sources familiar with the matter told CNBC’s David Faber earlier this month. The remarks were made during a call with xAI investors. The Tesla CEO didn’t explicitly mention any upcoming funding round at the time, but the sources suggested xAI was preparing for a substantial capital raise in the near future.

The funding amount could end up being more than $20 billion, as the exact figure has not been decided, the Bloomberg report added.

Artificial intelligence startup xAI didn’t immediately respond to a CNBC request for comment outside of U.S. business hours.

The AI firm last month acquired X in an all-stock deal that valued xAI at $80 billion and the social media platform at $33 billion.

“xAI and X’s futures are intertwined. Today, we officially take the step to combine the data, models, compute, distribution and talent,” Musk said on X, announcing the deal. “This combination will unlock immense potential by blending xAI’s advanced AI capability and expertise with X’s massive reach.”

— CNBC’s Samantha Subin contributed to this report.

Alphabet jumps 3% as search, advertising units show resilient growth

Alphabet CEO Sundar Pichai during the Google I/O developers conference in Mountain View, California, on May 10, 2023.

David Paul Morris | Bloomberg | Getty Images

Alphabet‘s stock gained 3% Friday after the company signaled strong growth in its search and advertising businesses amid a competitive artificial intelligence environment and an uncertain macro backdrop.

“GOOGL‘s pace of GenAI product roll-out is accelerating with multiple encouraging signals,” wrote Morgan Stanley‘s Brian Nowak. “Macro uncertainty still exists but we remain [overweight] given GOOGL’s still strong relative position and improving pace of GenAI enabled product roll-out.”

The search giant posted earnings of $2.81 per share on $90.23 billion in revenue. That topped the $89.12 billion in sales and $2.01 per share in earnings expected by LSEG analysts. Revenue grew 12% year over year, ahead of the 10% anticipated by Wall Street.

Net income rose 46% to $34.54 billion, or $2.81 per share. That’s up from $23.66 billion, or $1.89 per share, in the year-ago period. Alphabet said the figure included $8 billion in unrealized gains on its nonmarketable equity securities connected to its investment in a private company.

Adjusted earnings, excluding that gain, were $2.27 per share, according to LSEG, and topped analyst expectations.

Alphabet shares have pulled back about 16% this year as the company battles volatility spurred by mounting trade war fears and worries that President Donald Trump‘s tariffs could crush the global economy. That would make it more difficult for Alphabet to potentially acquire infrastructure for data centers powering AI models as it faces off against competitors such as OpenAI and Anthropic to develop large language models.

During Thursday’s call with investors, Alphabet suggested that it’s too soon to tally the total impact of tariffs. However, Google’s business chief Philipp Schindler said that ending the de minimis trade exemption in May, which created a loophole benefitting many Chinese e-commerce retailers, could create a “slight headwind” for the company’s ads business, specifically in the Asia-Pacific region. The loophole allows shipments under $800 to come into the U.S. duty-free.

Despite this backdrop, Alphabet showed steady growth in its advertising and search business, reporting $66.89 billion in revenues for its advertising unit. That reflected 8.5% growth from the year-ago period. The company reported $8.93 billion in advertising revenue for its YouTube business, shy of an $8.97 billion estimate from StreetAccount.

Alphabet’s “Search and other” unit grew 9.8% to $50.7 billion, up from $46.16 billion last year. The company said its AI Overviews tool in the Google search results page has grown to 1.5 billion monthly users, from 1 billion in October.

Bank of America analyst Justin Post said that Wall Street is underestimating the upside potential and “monetization ramp” from this tool and cloud demand fueled by AI.

“The strong 1Q search performance, along with constructive comments on Gemini [large language model] performance and [AI Overviews] adoption could help alleviate some investor concerns on AI competition,” Post wrote in a note.

WATCH: Gemini delivering well for Google, says Check Capital’s Chris Ballard

CNBC’s Jennifer Elias contributed to this report.
