OpenAI CEO Sam Altman speaks during a keynote address announcing ChatGPT integration for Bing at Microsoft in Redmond, Washington, on February 7, 2023.
Jason Redmond | AFP | Getty Images
Before OpenAI’s ChatGPT emerged and captured the world’s attention for its ability to create compelling sentences, a small startup called Latitude was wowing consumers with its AI Dungeon game that let them use artificial intelligence to create fantastical tales based on their prompts.
But as AI Dungeon became more popular, Latitude CEO Nick Walton recalled that the cost to maintain the text-based role-playing game began to skyrocket. AI Dungeon’s text-generation software was powered by the GPT language technology offered by the Microsoft-backed AI research lab OpenAI. The more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.
Compounding the predicament, Walton also discovered that content marketers were using AI Dungeon to generate promotional copy, a use his team never foresaw but one that added to the company’s AI bill.
At its peak in 2021, Walton estimates Latitude was spending nearly $200,000 a month on OpenAI’s so-called generative AI software and Amazon Web Services in order to keep up with the millions of user queries it needed to process each day.
“We joked that we had human employees and we had AI employees, and we spent about as much on each of them,” Walton said. “We spent hundreds of thousands of dollars a month on AI and we are not a big startup, so it was a very massive cost.”
By the end of 2021, Latitude had switched from OpenAI’s GPT software to cheaper but still capable language software offered by startup AI21 Labs, Walton said, adding that the startup also incorporated open source and free language models into its service to lower the cost. Latitude’s generative AI bills have dropped to under $100,000 a month, Walton said, and the startup charges players a monthly subscription for more advanced AI features to help reduce the cost.
Latitude’s pricey AI bills underscore an unpleasant truth behind the recent boom in generative AI technologies: The cost to develop and maintain the software can be extraordinarily high, both for the firms that develop the underlying technologies, generally referred to as large language models or foundation models, and for those that use the AI to power their own software.
The high cost of machine learning is an uncomfortable reality in the industry as venture capitalists eye companies that could potentially be worth trillions, and big companies such as Microsoft, Meta, and Google use their considerable capital to develop a lead in the technology that smaller challengers can’t catch up to.
But if the margin for AI applications is permanently smaller than previous software-as-a-service margins, because of the high cost of computing, it could put a damper on the current boom.
The high cost of training and “inference” — actually running — large language models is a structural cost that differs from previous computing booms. Even after the software is built, or trained, running a large language model still requires a huge amount of computing power because the model does billions of calculations every time it returns a response to a prompt. By comparison, serving web apps or pages requires much less calculation.
These calculations also require specialized hardware. While traditional computer processors can run machine learning models, they’re slow. Most training and inference now takes place on graphics processors, or GPUs, which were initially intended for 3D gaming, but have become the standard for AI applications because they can do many simple calculations simultaneously.
Nvidia makes most of the GPUs for the AI industry, and its primary data center workhorse chip costs $10,000. Scientists who build these models often joke that they “melt GPUs.”
Training models
Analysts and technologists estimate that the critical process of training a large language model such as OpenAI’s GPT-3 could cost more than $4 million. More advanced language models could cost over “the high-single-digit millions” to train, said Rowan Curran, a Forrester analyst who focuses on AI and machine learning.
Meta’s largest LLaMA model, for example, used 2,048 Nvidia A100 GPUs to train on 1.4 trillion tokens (750 words is about 1,000 tokens), taking about 21 days, the company said when it released the model last month.
It took about 1 million GPU hours to train. With dedicated prices from AWS, that would cost over $2.4 million. And at 65 billion parameters, it’s smaller than the current GPT models at OpenAI, like GPT-3, which has 175 billion parameters.
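The GPU-hour figure follows directly from the numbers Meta reported. The short Python sketch below reproduces that arithmetic and backs out the hourly rate implied by the $2.4 million estimate. The rate is derived here for illustration only; it is not a published AWS price, and the function names are hypothetical.

```python
# Figures quoted in the article for Meta's largest LLaMA model.
NUM_GPUS = 2048              # Nvidia A100 GPUs used for training
TRAINING_DAYS = 21           # "about 21 days"
QUOTED_COST_USD = 2_400_000  # "over $2.4 million" at dedicated AWS prices


def gpu_hours(num_gpus: int, days: int) -> int:
    """Total GPU hours if every GPU runs around the clock for the whole period."""
    return num_gpus * days * 24


def implied_hourly_rate(total_cost_usd: float, hours: float) -> float:
    """Per-GPU-hour price implied by a total cost (derived, not a quoted price)."""
    return total_cost_usd / hours


hours = gpu_hours(NUM_GPUS, TRAINING_DAYS)
rate = implied_hourly_rate(QUOTED_COST_USD, hours)

print(f"GPU hours: {hours:,}")
print(f"Implied rate: ${rate:.2f} per GPU hour")
```

Under these assumptions the run comes to 1,032,192 GPU hours, which matches the article's "about 1 million," and the quoted cost works out to roughly $2.33 per GPU hour.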
Clement Delangue, the CEO of AI startup Hugging Face, said the process of training the company’s Bloom large language model took more than two-and-a-half months and required access to a supercomputer that was “something like the equivalent of 500 GPUs.”
Organizations that build large language models must be cautious when they retrain the software, which helps improve its abilities, because it costs so much, he said.
“It’s important to realize that these models are not trained all the time, like every day,” Delangue said, noting that’s why some models, such as ChatGPT, don’t have knowledge of recent events. ChatGPT’s knowledge stops in 2021, he said.
“We are actually doing a training right now for the version two of Bloom and it’s gonna cost no more than $10 million to retrain,” Delangue said. “So that’s the kind of thing that we don’t want to do every week.”
Inference and who pays for it
To use a trained machine learning model to make predictions or generate text, engineers use the model in a process called “inference,” which can be much more expensive than training because it might need to run millions of times for a popular product.
For a product as popular as ChatGPT — which investment firm UBS estimates to have reached 100 million monthly active users in January — Curran believes that it could have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.
Costs skyrocket when these tools are used billions of times a day. Financial analysts estimate Microsoft’s Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.
In the case of Latitude, for instance, while the startup didn’t have to pay to train the underlying OpenAI language model it was accessing, it had to account for the inferencing costs that were something akin to “half-a-cent per call” on “a couple million requests per day,” a Latitude spokesperson said.
“And I was being relatively conservative,” Curran said of his calculations.
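The Latitude figures quoted above lend themselves to a quick back-of-the-envelope check. The Python sketch below multiplies the per-call cost by daily request volume. It assumes "a couple million" means exactly 2 million requests and a 30-day month; both are assumptions made for illustration, and the spokesperson's figures were themselves approximate.

```python
COST_PER_CALL_USD = 0.005     # "half-a-cent per call"
REQUESTS_PER_DAY = 2_000_000  # assumption: "a couple million" read as 2 million
DAYS_PER_MONTH = 30           # assumption: a 30-day month


def daily_cost(cost_per_call: float, requests_per_day: int) -> float:
    """Inference spend for one day of traffic."""
    return cost_per_call * requests_per_day


def monthly_cost(cost_per_call: float, requests_per_day: int, days: int) -> float:
    """Inference spend for a month of traffic at a constant daily volume."""
    return daily_cost(cost_per_call, requests_per_day) * days


daily = daily_cost(COST_PER_CALL_USD, REQUESTS_PER_DAY)
monthly = monthly_cost(COST_PER_CALL_USD, REQUESTS_PER_DAY, DAYS_PER_MONTH)

print(f"Daily inference cost: ${daily:,.0f}")
print(f"Monthly inference cost: ${monthly:,.0f}")
```

With these inputs the estimate is about $10,000 a day, or about $300,000 a month, which is in the same hundreds-of-thousands range Walton describes. It should be read as an order-of-magnitude check, not a reconstruction of Latitude's actual bill.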
In order to sow the seeds of the current AI boom, venture capitalists and tech giants have been investing billions of dollars into startups that specialize in generative AI technologies. Microsoft, for instance, invested as much as $10 billion into GPT’s overseer OpenAI, according to media reports in January. Salesforce’s venture capital arm, Salesforce Ventures, recently debuted a $250 million fund that caters to generative AI startups.
As investor Semil Shah of the VC firms Haystack and Lightspeed Venture Partners described on Twitter, “VC dollars shifted from subsidizing your taxi ride and burrito delivery to LLMs and generative AI compute.”
Many entrepreneurs see risks in relying on potentially subsidized AI models that they don’t control and merely pay for on a per-use basis.
“When I talk to my AI friends at the startup conferences, this is what I tell them: Do not solely depend on OpenAI, ChatGPT or any other large language models,” said Suman Kanuganti, founder of personal.ai, a chatbot currently in beta mode. “Because businesses shift, they are all owned by big tech companies, right? If they cut access, you’re gone.”
Companies such as enterprise tech firm Conversica are exploring how they can use the tech through Microsoft’s Azure cloud service at its currently discounted price.
While Conversica CEO Jim Kaskade declined to comment about how much the startup is paying, he conceded that the subsidized cost is welcome as it explores how language models can be used effectively.
“If they were truly trying to break even, they’d be charging a hell of a lot more,” Kaskade said.
How it could change
It’s unclear if AI computation will stay expensive as the industry develops. Companies making the foundation models, semiconductor makers and startups all see business opportunities in reducing the price of running AI software.
Nvidia, which has about 95% of the market for AI chips, continues to develop more powerful versions designed specifically for machine learning, but improvements in total chip power across the industry have slowed in recent years.
Still, Nvidia CEO Jensen Huang believes that in 10 years, AI will be “a million times” more efficient because of improvements not only in chips, but also in software and other computer parts.
“Moore’s Law, in its best days, would have delivered 100x in a decade,” Huang said last month on an earnings call. “By coming up with new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists, AI researchers on new models, across that entire span, we’ve made large language model processing a million times faster.”
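One way to compare the two growth figures in Huang's remarks is to convert each into an implied doubling time. The Python sketch below does that under the assumption of smooth exponential growth over a ten-year span. That assumption is made here for illustration and is not something Huang stated; the function name is hypothetical.

```python
import math


def doubling_time_years(total_gain: float, years: float) -> float:
    """Years per doubling if a total gain accrues as steady exponential growth."""
    return years / math.log2(total_gain)


# "Moore's Law, in its best days, would have delivered 100x in a decade."
moore = doubling_time_years(100, 10)

# The "million times" improvement Huang describes, over the same ten-year span.
ai = doubling_time_years(1_000_000, 10)

print(f"100x in 10 years: doubling every {moore:.2f} years")
print(f"1,000,000x in 10 years: doubling every {ai:.2f} years")
```

On these assumptions, 100x in a decade corresponds to a doubling roughly every year and a half, while a millionfold gain corresponds to a doubling roughly every six months.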
Some startups have focused on the high cost of AI as a business opportunity.
“Nobody was saying ‘You should build something that was purpose-built for inference.’ What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system to save money on inference by doing more processing in the computer’s memory, as opposed to on a GPU.
“People are using GPUs today, NVIDIA GPUs, to do most of their inference. They buy the DGX systems that NVIDIA sells that cost a ton of money. The problem with inference is if the workload spikes very rapidly, which is what happened to ChatGPT, it went to like a million users in five days. There is no way your GPU capacity can keep up with that because it was not built for that. It was built for training, for graphics acceleration,” he said.
Delangue, the Hugging Face CEO, believes more companies would be better served focusing on smaller, specific models that are cheaper to train and run, instead of the large language models that are garnering most of the attention.
Meanwhile, OpenAI announced last month that it’s lowering the cost for companies to access its GPT models. It now charges one-fifth of one cent for about 750 words of output.
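To put the new price in concrete terms, the Python sketch below estimates the cost of generating a given number of words. It uses only the article's figures: one-fifth of one cent per roughly 750 words, and the rule of thumb that 750 words is about 1,000 tokens. The function names are hypothetical, and real bills depend on exact token counts and on which model is used.

```python
PRICE_PER_UNIT_USD = 0.002  # "one-fifth of one cent"
WORDS_PER_UNIT = 750        # "about 750 words of output"
TOKENS_PER_UNIT = 1000      # article's rule of thumb: 750 words is about 1,000 tokens


def words_to_tokens(words: float) -> float:
    """Approximate token count for a word count, using the article's ratio."""
    return words / WORDS_PER_UNIT * TOKENS_PER_UNIT


def output_cost_usd(words: float) -> float:
    """Approximate cost of generating the given number of words."""
    return words / WORDS_PER_UNIT * PRICE_PER_UNIT_USD


for words in (750, 100_000, 1_000_000):
    tokens = words_to_tokens(words)
    cost = output_cost_usd(words)
    print(f"{words:>9,} words ~ {tokens:>11,.0f} tokens ~ ${cost:.4f}")
```

At that rate, a million words of output would cost only a few dollars, which helps explain why the price cut caught the attention of high-volume users.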
OpenAI’s lower prices have caught the attention of AI Dungeon-maker Latitude.
“I think it’s fair to say that it’s definitely a huge change we’re excited to see happen in the industry and we’re constantly evaluating how we can deliver the best experience to users,” a Latitude spokesperson said. “Latitude is going to continue to evaluate all AI models to be sure we have the best game out there.”
Alex Karp, CEO of Palantir Technologies speaks during the Digital X event on September 07, 2021 in Cologne, Germany.
Andreas Rentz | Getty Images
Palantir shares continued their torrid run on Friday, soaring as much as 9% to a record, after the developer of software for the military announced plans to transfer its listing to the Nasdaq from the New York Stock Exchange.
The stock jumped past $64.50 in afternoon trading, lifting the company’s market cap to $147 billion. The shares are now up more than 50% since Palantir’s better-than-expected earnings report last week and have almost quadrupled in value this year.
Palantir said late Thursday that it expects to begin trading on the Nasdaq on Nov. 26, under its existing ticker symbol “PLTR.” While changing listing sites does nothing to alter a company’s fundamentals, board member Alexander Moore, a partner at venture firm 8VC, suggested in a post on X that the move could be a win for retail investors because “it will force” billions of dollars in purchases by exchange-traded funds.
“Everything we do is to reward and support our retail diamondhands following,” Moore wrote, referring to a term popularized in the crypto community for long-term believers.
Moore appears to have subsequently deleted his X account. His firm, 8VC, didn’t immediately respond to a request for comment.
Last Monday after market close, Palantir reported third-quarter earnings and revenue that topped estimates and issued a fourth-quarter forecast that was also ahead of Wall Street’s expectations. CEO Alex Karp wrote in the earnings release that the company “absolutely eviscerated this quarter,” driven by demand for artificial intelligence technologies.
U.S. government revenue increased 40% from a year earlier to $320 million, while U.S. commercial revenue rose 54% to $179 million. On the earnings call, the company highlighted a five-year contract to expand its Maven technology across the U.S. military. Palantir established Maven in 2017 to provide AI tools to the Department of Defense.
The post-earnings rally coincides with the period following last week’s presidential election. Palantir is seen as a potential beneficiary given the company’s ties to the Trump camp. Co-founder and Chairman Peter Thiel was a major booster of Donald Trump’s first victorious campaign, though he had a public falling out with Trump in the ensuing years.
When asked in June about his position on the 2024 election, Thiel said, “If you hold a gun to my head I’ll vote for Trump.”
Thiel’s Palantir holdings have increased in value by about $3.2 billion since the earnings report and $2 billion since the election.
In September, S&P Global announced Palantir would join the S&P 500 stock index.
Analysts at Argus Research say the rally has pushed the stock too high given the current financials and growth projections. The analysts still have a long-term buy rating on the stock and said in a report last week that the company had a “stellar” quarter, but they downgraded their 12-month recommendation to a hold.
The stock “may be getting ahead of what the company fundamentals can support,” the analysts wrote.
Charles Liang, chief executive officer of Super Micro Computer Inc., during the Computex conference in Taipei, Taiwan, on Wednesday, June 5, 2024. The trade show runs through June 7.
Annabelle Chih | Bloomberg | Getty Images
Super Micro Computer could be headed down a path to getting kicked off the Nasdaq as soon as Monday.
That’s the potential fate for the server company if it fails to file a viable plan for becoming compliant with Nasdaq regulations. Super Micro is late in filing its 2024 year-end report with the SEC, and has yet to replace its accounting firm. Many investors were expecting clarity from Super Micro when the company reported preliminary quarterly results last week. But they didn’t get it.
The primary component of that plan is how and when Super Micro will file its 2024 year-end report with the Securities and Exchange Commission, and why it was late. Many expected the report to be filed alongside the company’s June fourth-quarter earnings, but it was not.
The Nasdaq delisting process represents a crossroads for Super Micro, which has been one of the primary beneficiaries of the artificial intelligence boom due to its longstanding relationship with Nvidia and surging demand for the chipmaker’s graphics processing units.
The one-time AI darling is reeling after a stretch of bad news. After Super Micro failed to file its annual report over the summer, activist short seller Hindenburg Research targeted the company in August, alleging accounting fraud and export control issues. The company’s auditor, Ernst & Young, stepped down in October, and Super Micro said last week that it was still trying to find a new one.
The stock is getting hammered. After the shares soared more than 14-fold from the end of 2022 to their peak in March of this year, they’ve since plummeted by 85%. Super Micro’s stock is now equal to where it was trading in May 2022, after falling another 11% on Thursday.
Getting delisted from the Nasdaq could be next if Super Micro doesn’t file a compliance plan by the Monday deadline or if the exchange rejects the company’s submission. Super Micro could also get an extension from the Nasdaq, giving it months to come into compliance. The company said Thursday that it would provide a plan to the Nasdaq in time.
A spokesperson told CNBC the company “intends to take all necessary steps to achieve compliance with the Nasdaq continued listing requirements as soon as possible.”
While the delisting issue mainly affects the stock, it could also hurt Super Micro’s reputation and standing with its customers, who may prefer to simply avoid the drama and buy AI servers from rivals such as Dell or HPE.
“Given that Super Micro’s accounting concerns have become more acute since Super Micro’s quarter ended, its weakness could ultimately benefit Dell more in the coming quarter,” Bernstein analyst Toni Sacconaghi wrote in a note this week.
A representative for the Nasdaq said the exchange doesn’t comment on the delisting process for individual companies, but the rules suggest the process could take about a year before a final decision.
A plan of compliance
The Nasdaq warned Super Micro on Sept. 17 that it was at risk of being delisted. That gave the company 60 days to submit a plan of compliance to the exchange, and because the deadline falls on a Sunday, the effective date for the submission is Monday.
If Super Micro’s plan is acceptable to Nasdaq staff, the company is eligible for an extension of up to 180 days to file its year-end report. The Nasdaq wants to know whether Super Micro’s board of directors has investigated the company’s accounting problem, the exact reason for the late filing and the timeline of actions taken by the board.
The Nasdaq says it looks at several factors when evaluating a plan of compliance, including the reasons for the late filing, upcoming corporate events, the overall financial status of the company and the likelihood of a company filing an audited report within 180 days. The review can also look at information provided by outside auditors, the SEC or other regulators.
Last week, Super Micro said it was doing everything it could to remain listed on the Nasdaq, and said a special committee of its board had investigated and found no wrongdoing. Super Micro CEO Charles Liang said the company would receive the board committee’s report as soon as last week. A company spokesperson didn’t respond when asked by CNBC if that report had been received.
If the Nasdaq rejects Super Micro’s compliance plan, the company can request a hearing from the exchange’s Hearings Panel to review the decision. Super Micro won’t be immediately kicked off the exchange – the hearing panel request starts a 15-day stay for delisting, and the panel can decide to extend the deadline for up to 180 days.
If the panel rejects that request or if Super Micro gets an extension and fails to file the updated financials, the company can still appeal the decision to another Nasdaq body called the Listing Council, which can grant an exception.
Ultimately, the Nasdaq says the extensions have a limit: 360 days from when the company’s first late filing was due.
A poor track record
There’s one factor at play that could hurt Super Micro’s chances of an extension. The exchange considers whether the company has any history of being out of compliance with SEC regulations.
Between 2015 and 2017, Super Micro misstated financials and published key filings late, according to the SEC. It was delisted from the Nasdaq in 2017 and was relisted two years later.
Super Micro “might have a more difficult time obtaining extensions as the Nasdaq’s literature indicates it will in part ‘consider the company’s specific circumstances, including the company’s past compliance history’ when determining whether an extension is warranted,” Wedbush analyst Matt Bryson wrote in a note earlier this month. He has a neutral rating on the stock.
History also reveals just how long the delisting process can take.
Super Micro missed an annual report filing deadline in June 2017, got an extension to December and finally got a hearing in May 2018, which gave it another extension to August of that year. It was only when it missed that deadline that the stock was delisted.
In the short term, the bigger worry for Super Micro is whether customers and suppliers start to bail.
Aside from the compliance problems, Super Micro is a fast-growing company making one of the most in-demand products in the technology industry. Sales more than doubled last year to nearly $15 billion, according to unaudited financial reports, and the company has ample cash on its balance sheet, analysts say. Wall Street is expecting even more growth to about $25 billion in sales in its fiscal 2025, according to FactSet.
Super Micro said last week that the filing delay has “had a bit of an impact to orders.” In its unaudited September quarter results reported last week, the company showed growth that was slower than Wall Street expected. It also provided light guidance.
The company said one reason for its weak results was that it hadn’t yet obtained enough supply of Nvidia’s next-generation chip, called Blackwell, raising questions about Super Micro’s relationship with its most important supplier.
“We don’t believe that Super Micro’s issues are a big deal for Nvidia, although it could move some sales around in the near term from one quarter to the next as customers direct orders toward Dell and others,” wrote Melius Research analyst Ben Reitzes in a note this week.
Super Micro’s head of corporate development, Michael Staiger, told investors on a call last week that “we’ve spoken to Nvidia and they’ve confirmed they’ve made no changes to allocations. We maintain a strong relationship with them.”
Chinese e-commerce behemoth Alibaba on Friday beat profit expectations in its September quarter, but sales fell short as sluggishness in the world’s second-largest economy hit consumer spending.
Alibaba said net income rose 58% year on year to 43.9 billion yuan ($6.07 billion) in the company’s quarter ended Sept. 30, on the back of the performance of its equity investments. This compares with an LSEG forecast of 25.83 billion yuan.
“The year-over-year increases were primarily attributable to the mark-to-market changes from our equity investments, decrease in impairment of our investments and increase in income from operations,” the company said of the annual profit jump in its earnings statement.
Revenue, meanwhile, came in at 236.5 billion yuan, 5% higher year on year but below an analyst forecast of 238.9 billion yuan, according to LSEG data.
The company’s New York-listed shares have gained ground this year to date, up more than 13%. The stock fell more than 2% in morning trading on Friday, after the release of the quarterly earnings.
Sales sentiment
Investors are closely watching the performance of Alibaba’s main business units, Taobao and Tmall Group, which reported a 1% annual uptick in revenue to 98.99 billion yuan in the September quarter.
The results come at a tricky time for Chinese commerce businesses, given a tepid retail environment in the country. Chinese e-commerce group JD.com also missed revenue expectations on Thursday, according to Reuters.
Markets are now watching whether a slew of recent stimulus measures from Beijing, including a five-year 1.4 trillion yuan package announced last week, will help resuscitate the country’s growth and curtail a long-lived real estate market slump.
The impact on the retail space looks promising so far, with sales rising by a better-than-expected 4.8% year on year in October, while China’s recent Singles’ Day shopping holiday — widely seen as a barometer for national consumer sentiment — regained some of its luster.
Alibaba touted “robust growth” in gross merchandise volume — an industry measure of sales over time that does not equate to the company’s revenue — for its Taobao and Tmall Group businesses during the festival, along with a “record number of active buyers.”
“Alibaba’s outlook remains closely aligned with the trajectory of the Chinese economy and evolving regulatory policies,” ING analysts said Thursday, noting that the company’s Friday report will shed light on the Chinese economy’s growth momentum.
The e-commerce giant’s overseas online shopping businesses, such as Lazada and Aliexpress, meanwhile posted a 29% year-on-year hike in sales to 31.67 billion yuan.
Cloud business accelerates
Alibaba’s Cloud Intelligence Group reported year-on-year sales growth of 7% to 29.6 billion yuan in the September quarter, compared with a 6% annual hike in the three-month period ended in June. The slight acceleration comes amid ongoing efforts by the company to leverage its cloud infrastructure and reposition itself as a leader in the booming artificial intelligence space.
“Growth in our Cloud business accelerated from prior quarters, with revenues from public cloud products growing in double digits and AI-related product revenue delivering triple-digit growth. We are more confident in our core businesses than ever and will continue to invest in supporting long-term growth,” Alibaba CEO Eddie Wu said in a statement Friday.
Stymied by Beijing’s sweeping 2022 crackdown on large internet and tech companies, Alibaba last year overhauled the division’s leadership and has been shaping it as a future growth driver, stepping up competition with rivals including Baidu and Huawei domestically, and Microsoft and OpenAI in the U.S.
Alibaba, which rolled out its own ChatGPT-style product Tongyi Qianwen last year, this week unveiled its own AI-powered search tool for small businesses in Europe and the Americas, and clinched a key five-year partnership to supply cloud services to Indonesian tech giant GoTo in September.
Speaking at the Apsara Conference in September, Alibaba’s Wu said the company’s cloud unit is investing “with unprecedented intensity, in the research and development of AI technology and the building of its global infrastructure,” noting that the future of AI is “only beginning.”
Correction: This article has been updated to reflect that Alibaba’s Cloud Intelligence Group reported quarterly revenue of 29.6 billion yuan in the September quarter.