In an unmarked office building in Austin, Texas, two small rooms contain a handful of Amazon employees designing two types of microchips, one for training generative AI models and one for running them. These custom chips, Trainium and Inferentia, offer AWS customers an alternative to training their large language models on Nvidia GPUs, which have been getting difficult and expensive to procure.
“The entire world would like more chips for doing generative AI, whether that’s GPUs or whether that’s Amazon’s own chips that we’re designing,” Amazon Web Services CEO Adam Selipsky told CNBC in an interview in June. “I think that we’re in a better position than anybody else on Earth to supply the capacity that our customers collectively are going to want.”
Yet others have acted faster, and invested more, to capture business from the generative AI boom. When OpenAI launched ChatGPT in November, Microsoft gained widespread attention for hosting the viral chatbot, and investing a reported $13 billion in OpenAI. It was quick to add the generative AI models to its own products, incorporating them into Bing in February.
That same month, Google launched its own large language model, Bard, followed by a $300 million investment in OpenAI rival Anthropic.
It wasn’t until April that Amazon announced its own family of large language models, called Titan, along with a service called Bedrock to help developers enhance software using generative AI.
“Amazon is not used to chasing markets. Amazon is used to creating markets. And I think for the first time in a long time, they are finding themselves on the back foot and they are working to play catch up,” said Chirag Dekate, VP analyst at Gartner.
In the long run, Dekate said, Amazon’s custom silicon could give it an edge in generative AI.
“I think the true differentiation is the technical capabilities that they’re bringing to bear,” he said. “Because guess what? Microsoft does not have Trainium or Inferentia.”
AWS quietly started production of custom silicon back in 2013 with a piece of specialized hardware called Nitro. It’s now the highest-volume AWS chip. Amazon told CNBC there is at least one in every AWS server, with a total of more than 20 million in use.
In 2015, Amazon bought Israeli chip startup Annapurna Labs. Then in 2018, Amazon launched its Arm-based server chip, Graviton, a rival to x86 CPUs from giants like AMD and Intel.
“Probably high single-digit to maybe 10% of total server sales are Arm, and a good chunk of those are going to be Amazon. So on the CPU side, they’ve done quite well,” said Stacy Rasgon, senior analyst at Bernstein Research.
Also in 2018, Amazon launched its AI-focused chips. That came two years after Google announced its first Tensor Processing Unit, or TPU. Microsoft has yet to announce the Athena AI chip it’s been working on, reportedly in partnership with AMD.
CNBC got a behind-the-scenes tour of Amazon’s chip lab in Austin, Texas, where Trainium and Inferentia are developed and tested. VP of product Matt Wood explained what both chips are for.
“Machine learning breaks down into these two different stages. So you train the machine learning models and then you run inference against those trained models,” Wood said. “Trainium provides about 50% improvement in terms of price performance relative to any other way of training machine learning models on AWS.”
Trainium first came on the market in 2021, following the 2019 release of Inferentia, which is now on its second generation.
Inferentia allows customers “to deliver very, very low-cost, high-throughput, low-latency, machine learning inference, which is all the predictions of when you type in a prompt into your generative AI model, that’s where all that gets processed to give you the response,” Wood said.
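The two stages Wood describes can be illustrated with a toy example. The sketch below uses a simple linear model in NumPy; it is purely illustrative and has nothing to do with Amazon’s actual hardware or software stack, but it shows the split between the training stage (which Trainium targets) and the inference stage (which Inferentia targets).

```python
import numpy as np

# Stage 1 -- training: fit model parameters to labeled data.
# (On AWS, this is the stage Trainium is built to accelerate.)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))           # 100 examples, 3 features
true_w = np.array([2.0, -1.0, 0.5])     # ground-truth weights for this toy
y = X @ true_w + rng.normal(scale=0.01, size=100)

w, *_ = np.linalg.lstsq(X, y, rcond=None)   # learn weights from data

# Stage 2 -- inference: run the trained model on a new input.
# (This is the stage Inferentia is built to accelerate.)
x_new = np.array([1.0, 1.0, 1.0])
prediction = x_new @ w                  # close to 2.0 - 1.0 + 0.5 = 1.5
```

Training is done once, over lots of data; inference then runs over and over, once per user request, which is why the two workloads get different silicon.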
For now, however, Nvidia’s GPUs are still king when it comes to training models. In July, AWS launched new AI acceleration hardware powered by Nvidia H100s.
“Nvidia chips have a massive software ecosystem that’s been built up around them over the last like 15 years that nobody else has,” Rasgon said. “The big winner from AI right now is Nvidia.”
Amazon’s custom chips, from left to right, Inferentia, Trainium and Graviton are shown at Amazon’s Seattle headquarters on July 13, 2023.
Joseph Huerta
Leveraging cloud dominance
AWS’ cloud dominance, however, is a big differentiator for Amazon.
“Amazon does not need to win headlines. Amazon already has a really strong cloud install base. All they need to do is to figure out how to enable their existing customers to expand into value creation motions using generative AI,” Dekate said.
There are millions of AWS customers who, when choosing among Amazon, Google and Microsoft for generative AI, may be drawn to Amazon because they’re already familiar with it, running other applications and storing their data there.
“It’s a question of velocity. How quickly can these companies move to develop these generative AI applications is driven by starting first on the data they have in AWS and using compute and machine learning tools that we provide,” explained Mai-Lan Tomsen Bukovec, VP of technology at AWS.
AWS is the world’s biggest cloud computing provider, with 40% of the market share in 2022, according to technology industry researcher Gartner. Although operating income has been down year-over-year for three quarters in a row, AWS still accounted for 70% of Amazon’s overall $7.7 billion operating profit in the second quarter. AWS’ operating margins have historically been far wider than those at Google Cloud.
“Let’s rewind the clock even before ChatGPT. It’s not like after that happened, suddenly we hurried and came up with a plan because you can’t engineer a chip in that quick a time, let alone you can’t build a Bedrock service in a matter of 2 to 3 months,” said Swami Sivasubramanian, AWS’ VP of database, analytics and machine learning.
Bedrock gives AWS customers access to large language models made by Anthropic, Stability AI, AI21 Labs and Amazon’s own Titan.
“We don’t believe that one model is going to rule the world, and we want our customers to have the state-of-the-art models from multiple providers because they are going to pick the right tool for the right job,” Sivasubramanian said.
An Amazon employee works on custom AI chips, in a jacket branded with AWS’ chip Inferentia, at the AWS chip lab in Austin, Texas, on July 25, 2023.
Katie Tarasov
One of Amazon’s newest AI offerings is AWS HealthScribe, a service unveiled in July to help doctors draft patient visit summaries using generative AI. Amazon also has SageMaker, a machine learning hub that offers algorithms, models and more.
Another big tool is coding companion CodeWhisperer, which Amazon said has enabled developers to complete tasks 57% faster on average. Last year, Microsoft also reported productivity boosts from its coding companion, GitHub Copilot.
“We have so many customers who are saying, ‘I want to do generative AI,’ but they don’t necessarily know what that means for them in the context of their own businesses. And so we’re going to bring in solutions architects and engineers and strategists and data scientists to work with them one on one,” AWS CEO Selipsky said.
Although so far AWS has focused largely on tools instead of building a competitor to ChatGPT, a recently leaked internal email shows Amazon CEO Andy Jassy is directly overseeing a new central team building out expansive large language models, too.
In the second-quarter earnings call, Jassy said a “very significant amount” of AWS business is now driven by AI and more than 20 machine learning services it offers. Some examples of customers include Philips, 3M, Old Mutual and HSBC.
The explosive growth in AI has come with a flurry of security concerns from companies worried that employees are putting proprietary information into the training data used by public large language models.
“I can’t tell you how many Fortune 500 companies I’ve talked to who have banned ChatGPT. So with our approach to generative AI and our Bedrock service, anything you do, any model you use through Bedrock will be in your own isolated virtual private cloud environment. It’ll be encrypted, it’ll have the same AWS access controls,” Selipsky said.
For now, Amazon is only accelerating its push into generative AI, telling CNBC that “over 100,000” customers are using machine learning on AWS today. Although that’s a small percentage of AWS’ millions of customers, analysts say that could change.
“What we are not seeing is enterprises saying, ‘Oh, wait a minute, Microsoft is so ahead in generative AI, let’s just go out and let’s switch our infrastructure strategies, migrate everything to Microsoft,’” Dekate said. “If you’re already an Amazon customer, chances are you’re likely going to explore Amazon ecosystems quite extensively.”
— CNBC’s Jordan Novet contributed to this report.
CORRECTION: This article has been updated to reflect Inferentia as the chip used for machine learning inference.
Paxton sued Google in 2022 for allegedly unlawfully tracking and collecting the private data of users.
The attorney general said the settlement, which covers allegations in two separate lawsuits against the search engine and app giant, dwarfed all past settlements by other states with Google for similar data privacy violations.
Google’s settlement comes nearly 10 months after Paxton obtained a $1.4 billion settlement for Texas from Meta, the parent company of Facebook and Instagram, to resolve claims of unauthorized use of biometric data by users of those popular social media platforms.
“In Texas, Big Tech is not above the law,” Paxton said in a statement on Friday.
“For years, Google secretly tracked people’s movements, private searches, and even their voiceprints and facial geometry through their products and services. I fought back and won,” said Paxton.
“This $1.375 billion settlement is a major win for Texans’ privacy and tells companies that they will pay for abusing our trust.”
Google spokesman Jose Castaneda said the company did not admit any wrongdoing or liability in the settlement, which involves allegations related to the Chrome browser’s incognito setting, disclosures related to location history on the Google Maps app, and biometric claims related to Google Photos.
Castaneda said Google does not have to make any changes to products in connection with the settlement and that all of the policy changes that the company made in connection with the allegations were previously announced or implemented.
“This settles a raft of old claims, many of which have already been resolved elsewhere, concerning product policies we have long since changed,” Castaneda said.
“We are pleased to put them behind us, and we will continue to build robust privacy controls into our services.”
Virtual care company Omada Health filed for an IPO on Friday, the latest digital health company that’s signaled its intent to hit the public markets despite a turbulent economy.
Founded in 2012, Omada offers virtual care programs to support patients with chronic conditions like prediabetes, diabetes and hypertension. The company describes its approach as a “between-visit care model” that is complementary to the broader health-care ecosystem, according to its prospectus.
Revenue increased 57% in the first quarter to $55 million, up from $35.1 million during the same period last year, the filing said. The San Francisco-based company generated $169.8 million in revenue during 2024, up 38% from $122.8 million the previous year.
Omada’s net loss narrowed to $9.4 million during its first quarter from $19 million during the same period last year. It reported a net loss of $47.1 million in 2024, compared to a $67.5 million net loss during 2023.
The IPO market has been largely dormant across the tech sector for the past three years, and within digital health, it’s been almost completely dead. After President Donald Trump announced a sweeping tariff policy that plunged U.S. markets into turmoil last month, taking a company public is an even riskier endeavor. Online lender Klarna delayed its long-anticipated IPO, as did ticket marketplace StubHub.
But Omada Health isn’t the first digital health company to file for its public market debut this year. Virtual physical therapy startup Hinge Health filed its prospectus in March, and provided an update with its first-quarter earnings on Monday, a signal to investors that it’s looking to forge ahead.
Omada contracts with employers, and the company said it works with more than 2,000 customers and supports 679,000 members as of March 31. More than 156 million Americans suffer from at least one chronic condition, so there is a significant market opportunity, according to the company’s filing.
In 2022, Omada announced a $192 million funding round that pushed its valuation above $1 billion. U.S. Venture Partners, Andreessen Horowitz and Fidelity’s FMR LLC are the largest outside shareholders in the company, each owning between 9% and 10% of the stock.
“To our prospective shareholders, thank you for learning more about Omada. I invite you to join our journey,” Omada co-founder and CEO Sean Duffy said in the filing. “In front of us is a unique chance to build a promising and successful business while truly changing lives.”
Liz Reid, vice president, search, Google speaks during an event in New Delhi on December 19, 2022.
Sajjad Hussain | AFP | Getty Images
Testimony in Google‘s antitrust search remedies trial that wrapped hearings Friday shows how the company is calculating possible changes proposed by the Department of Justice.
Google head of search Liz Reid testified in court Tuesday that the company would need to divert between 1,000 and 2,000 employees, roughly 20% of Google’s search organization, to carry out some of the proposed remedies, a source with knowledge of the proceedings confirmed.
The testimony comes during the final days of the remedies trial, which will determine what penalties should be taken against Google after a judge last year ruled the company has held an illegal monopoly in its core market of internet search.
The DOJ, which filed the original antitrust suit and proposed remedies, asked the judge to force Google to share its data used for generating search results, such as click data. It also asked for the company to remove the use of “compelled syndication,” which refers to the practice of making certain deals with companies to ensure its search engine remains the default choice in browsers and smartphones.
Google pays Apple billions of dollars per year to be the default search engine on iPhones. It’s lucrative for Apple and a valuable way for Google to get more search volume and users.
Apple’s SVP of Services Eddy Cue testified Wednesday that Apple chooses to feature Google because it’s “the best search engine.”
The DOJ also proposed the company divest its Chrome browser but that was not included in Reid’s initial calculation, the source confirmed.
Reid on Tuesday said Google’s proprietary “Knowledge Graph” database, which it uses to surface search results, contains more than 500 billion facts, according to the source, and that Google has invested more than $20 billion in engineering costs and content acquisition over more than a decade.
“People ask Google questions they wouldn’t ask anyone else,” she said, according to the source.
Reid echoed Google’s argument that sharing its data would create privacy risks, the source confirmed.
Closing arguments for the search remedies trial will take place May 29 and 30, followed by the judge’s decision expected in August.
The company faces a separate remedies trial for its advertising tech business, which is scheduled to begin Sept. 22.