Nvidia CEO Jensen Huang speaks during a press conference at The MGM during CES 2018 in Las Vegas on January 7, 2018.
Mandel Ngan | AFP | Getty Images
Software that can write passages of text or draw pictures that look like a human created them has kicked off a gold rush in the technology industry.
Companies like Microsoft and Google are fighting to integrate cutting-edge AI into their search engines, as billion-dollar competitors such as OpenAI and Stability AI race ahead and release their software to the public.
Powering many of these applications is a roughly $10,000 chip that’s become one of the most critical tools in the artificial intelligence industry: the Nvidia A100.
The A100 has become the “workhorse” for artificial intelligence professionals at the moment, said Nathan Benaich, an investor who publishes a newsletter and report covering the AI industry, including a partial list of supercomputers using A100s. Nvidia takes 95% of the market for graphics processors that can be used for machine learning, according to New Street Research.
The A100 is ideally suited for the kind of machine learning models that power tools like ChatGPT, Bing AI, or Stable Diffusion. It’s able to perform many simple calculations simultaneously, which is important for training and using neural network models.
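To make that parallelism concrete, here is a minimal sketch in Python, assuming PyTorch and a CUDA-capable GPU; an A100 in a data center is programmed through the same interfaces:

```python
# A single large matrix multiply contains millions of independent
# multiply-add operations, which a GPU executes simultaneously.
# Assumes PyTorch; falls back to the CPU if no GPU is present.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(4096, 4096, device=device)
w = torch.randn(4096, 4096, device=device)

y = x @ w  # one call, dispatched across thousands of GPU cores at once
print(y.shape, device)
```

Neural networks are built almost entirely out of operations like this, which is why a chip that excels at them has become so central to the field.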
The technology behind the A100 was initially used to render sophisticated 3D graphics in games. It’s often called a graphics processor, or GPU, but these days Nvidia’s A100 is configured and targeted at machine learning tasks and runs in data centers, not inside glowing gaming PCs.
Big companies or startups working on software like chatbots and image generators require hundreds or thousands of Nvidia’s chips, and either purchase them on their own or secure access to the computers from a cloud provider.
Hundreds of GPUs are required to train artificial intelligence models like large language models. The chips need to be powerful enough to crunch terabytes of data quickly to recognize patterns. After that, GPUs like the A100 are also needed for “inference,” or using the model to generate text, make predictions, or identify objects inside photos.
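In miniature, the two workloads look like this; the sketch below assumes PyTorch and a toy one-layer model standing in for models thousands of times larger:

```python
# Toy illustration of the two GPU workloads: a training step
# (forward pass, backward pass, weight update) and an inference step
# (forward pass only). Assumes PyTorch; real large language models
# spread this work across hundreds of A100s.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(512, 512).to(device)
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

# Training: repeated over terabytes of data to recognize patterns.
batch = torch.randn(64, 512, device=device)
loss = model(batch).pow(2).mean()
loss.backward()
opt.step()

# Inference: no gradients needed -- the "using the model" phase.
with torch.no_grad():
    prediction = model(torch.randn(1, 512, device=device))
```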
This means that AI companies need access to a lot of A100s. Some entrepreneurs in the space even see the number of A100s they have access to as a sign of progress.
“A year ago we had 32 A100s,” Stability AI CEO Emad Mostaque wrote on Twitter in January. “Dream big and stack moar GPUs kids. Brrr.” Stability AI is the company that helped develop Stable Diffusion, an image generator that drew attention last fall, and reportedly has a valuation of over $1 billion.
Now, Stability AI has access to over 5,400 A100 GPUs, according to one estimate from the State of AI report, which charts and tracks which companies and universities have the largest collection of A100 GPUs — although it doesn’t include cloud providers, which don’t publish their numbers publicly.
Nvidia’s riding the A.I. train
Nvidia stands to benefit from the AI hype cycle. In Wednesday’s fiscal fourth-quarter earnings report, overall sales declined 21%, but investors pushed the stock up about 14% on Thursday, mainly because the company’s AI chip business, reported as its data center segment, rose 11% to more than $3.6 billion in sales during the quarter, showing continued growth.
Nvidia shares are up 65% so far in 2023, outpacing the S&P 500 and other semiconductor stocks alike.
Nvidia CEO Jensen Huang couldn’t stop talking about AI on a call with analysts on Wednesday, suggesting that the recent boom in artificial intelligence is at the center of the company’s strategy.
“The activity around the AI infrastructure that we built, and the activity around inferencing using Hopper and Ampere to inference large language models has just gone through the roof in the last 60 days,” Huang said. “There’s no question that whatever our views are of this year as we enter the year has been fairly dramatically changed as a result of the last 60, 90 days.”
Ampere is Nvidia’s code name for the A100 generation of chips. Hopper is the code name for the new generation, including the H100, which recently started shipping.
More computers needed
Nvidia A100 processor
Nvidia
Unlike other kinds of software, such as serving a webpage, which uses processing power occasionally in microsecond bursts, machine learning tasks can take up a computer’s entire processing power, sometimes for hours or days.
This means companies that find themselves with a hit AI product often need to acquire more GPUs to handle peak periods or improve their models.
These GPUs aren’t cheap. In addition to a single A100 on a card that can be slotted into an existing server, many data centers use a system that includes eight A100 GPUs working together.
This system, Nvidia’s DGX A100, has a suggested price of nearly $200,000, although that figure does include the chips needed. On Wednesday, Nvidia said it would sell cloud access to DGX systems directly, which will likely reduce the entry cost for tinkerers and researchers.
It’s easy to see how the cost of A100s can add up.
For example, an estimate from New Street Research found that the OpenAI-based ChatGPT model inside Bing’s search could require 8 GPUs to deliver a response to a question in less than one second.
At that rate, Microsoft would need over 20,000 8-GPU servers just to deploy the model in Bing to everyone, suggesting Microsoft’s feature could cost $4 billion in infrastructure spending.
“If you’re from Microsoft, and you want to scale that, at the scale of Bing, that’s maybe $4 billion. If you want to scale at the scale of Google, which serves 8 or 9 billion queries every day, you actually need to spend $80 billion on DGXs,” said Antoine Chkaiban, a technology analyst at New Street Research. “The numbers we came up with are huge. But they’re simply the reflection of the fact that every single user talking to such a large language model requires a massive supercomputer while they’re using it.”
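Here is a back-of-envelope reconstruction of where those numbers come from. The per-server price is the roughly $200,000 DGX A100 figure cited above, and the 20x multiplier between the Bing and Google estimates is inferred from the two quoted totals rather than published by New Street Research:

```python
# Rough reconstruction of New Street Research's cost estimates.
servers_for_bing = 20_000     # 8-GPU servers to serve Bing's query volume
price_per_dgx = 200_000       # suggested price of one DGX A100, in dollars

bing_cost = servers_for_bing * price_per_dgx
print(f"Bing-scale deployment: ${bing_cost / 1e9:.0f} billion")      # ~$4B

# Google handles roughly 20x the query volume assumed for Bing here,
# implying roughly 20x the hardware (an inferred scaling factor).
google_cost = bing_cost * 20
print(f"Google-scale deployment: ${google_cost / 1e9:.0f} billion")  # ~$80B
```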
The latest version of Stable Diffusion, an image generator, was trained on 256 A100 GPUs, or 32 machines with eight A100s each, according to information posted online by Stability AI, totaling 200,000 compute hours.
At the market price, training the model alone cost $600,000, Stability AI CEO Mostaque said on Twitter, suggesting in a tweet exchange that the price was unusually low compared with rivals’. That doesn’t count the cost of “inference,” or deploying the model.
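Those figures imply a market rate of about $3 per A100-hour, an assumption inferred from the numbers above rather than stated by Stability AI; a quick sketch shows the arithmetic:

```python
# Implied cost and duration of the Stable Diffusion training run.
gpus = 256                 # 32 machines x 8 A100s each
gpu_hours = 200_000        # total compute hours cited by Stability AI
total_cost = 600_000       # training cost in dollars, per Mostaque

rate = total_cost / gpu_hours    # ~$3.00 per A100-hour
days = gpu_hours / gpus / 24     # ~33 days of wall-clock time if fully parallel
print(f"${rate:.2f}/GPU-hour, about {days:.0f} days on {gpus} GPUs")
```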
Huang, Nvidia’s CEO, said in an interview with CNBC’s Katie Tarasov that the company’s products are actually inexpensive for the amount of computation that these kinds of models need.
“We took what otherwise would be a $1 billion data center running CPUs, and we shrunk it down into a data center of $100 million,” Huang said. “Now, $100 million, when you put that in the cloud and shared by 100 companies, is almost nothing.”
Huang said that Nvidia’s GPUs allow startups to train models for a much lower cost than if they used a traditional computer processor.
“Now you could build something like a large language model, like a GPT, for something like $10, $20 million,” Huang said. “That’s really, really affordable.”
New competition
Nvidia isn’t the only company making GPUs for artificial intelligence uses. AMD and Intel have competing graphics processors, and big cloud companies like Google and Amazon are developing and deploying their own chips specially designed for AI workloads.
Still, “AI hardware remains strongly consolidated to NVIDIA,” according to the State of AI compute report. As of December, more than 21,000 open-source AI papers said they used Nvidia chips.
Most researchers included in the State of AI Compute Index used the V100, Nvidia’s chip that came out in 2017, but the A100 grew fast in 2022 to become the third-most-used Nvidia chip, just behind a consumer graphics chip that costs $1,500 or less and was originally intended for gaming.
The A100 also has the distinction of being one of only a few chips to have export controls placed on it for national defense reasons. Last fall, Nvidia said in an SEC filing that the U.S. government imposed a license requirement barring the export of the A100 and the H100 to China, Hong Kong, and Russia.
“The USG indicated that the new license requirement will address the risk that the covered products may be used in, or diverted to, a ‘military end use’ or ‘military end user’ in China and Russia,” Nvidia said in its filing. Nvidia previously said it adapted some of its chips for the Chinese market to comply with U.S. export restrictions.
The fiercest competition for the A100 may be its successor. The A100 was first introduced in 2020, an eternity ago in chip cycles. The H100, introduced in 2022, is starting to be produced in volume — in fact, Nvidia recorded more revenue from H100 chips in the quarter ending in January than the A100, it said on Wednesday, although the H100 is more expensive per unit.
The H100, Nvidia says, is the first of its data center GPUs to be optimized for transformers, an increasingly important technique that many of the latest and top AI applications use. Nvidia said on Wednesday that it wants to make AI training over 1 million percent faster. That could mean that, eventually, AI companies wouldn’t need so many Nvidia chips.
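For context, the core operation transformers rely on is attention. The sketch below, assuming PyTorch, shows a bare-bones version of scaled dot-product attention; it illustrates the math only, not Nvidia’s optimized implementation, which relies on fused, mixed-precision kernels:

```python
# Bare-bones scaled dot-product attention, the operation at the heart
# of transformer models. Assumes PyTorch.
import math
import torch

def attention(q, k, v):
    # Score every token's query against every token's key...
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = torch.softmax(scores, dim=-1)
    # ...then blend the values according to those weights.
    return weights @ v

seq_len, dim = 128, 64
q = k = v = torch.randn(1, seq_len, dim)
out = attention(q, k, v)  # shape: (1, 128, 64)
```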