In an unmarked office building in Austin, Texas, two small rooms contain a handful of Amazon employees designing two types of microchips for training and accelerating generative AI. These custom chips, Inferentia and Trainium, offer AWS customers an alternative to training their large language models on Nvidia GPUs, which have been getting difficult and expensive to procure. 

“The entire world would like more chips for doing generative AI, whether that’s GPUs or whether that’s Amazon’s own chips that we’re designing,” Amazon Web Services CEO Adam Selipsky told CNBC in an interview in June. “I think that we’re in a better position than anybody else on Earth to supply the capacity that our customers collectively are going to want.”

Yet others have acted faster, and invested more, to capture business from the generative AI boom. When OpenAI launched ChatGPT in November, Microsoft gained widespread attention for hosting the viral chatbot, and investing a reported $13 billion in OpenAI. It was quick to add the generative AI models to its own products, incorporating them into Bing in February. 

That same month, Google launched its own large language model, Bard, followed by a $300 million investment in OpenAI rival Anthropic. 

It wasn’t until April that Amazon announced its own family of large language models, called Titan, along with a service called Bedrock to help developers enhance software using generative AI.

“Amazon is not used to chasing markets. Amazon is used to creating markets. And I think for the first time in a long time, they are finding themselves on the back foot and they are working to play catch up,” said Chirag Dekate, VP analyst at Gartner.

Meta also recently released its own LLM, Llama 2. The open-source ChatGPT rival is now available for people to test on Microsoft’s Azure public cloud.

Chips as ‘true differentiation’

In the long run, Dekate said, Amazon’s custom silicon could give it an edge in generative AI. 

“I think the true differentiation is the technical capabilities that they’re bringing to bear,” he said. “Because guess what? Microsoft does not have Trainium or Inferentia,” he said.

AWS quietly started production of custom silicon back in 2013 with a piece of specialized hardware called Nitro. It’s now the highest-volume AWS chip. Amazon told CNBC in August that there is at least one in every AWS server, with a total of more than 20 million in use.

In 2015, Amazon bought Israeli chip startup Annapurna Labs. Then in 2018, Amazon launched its Arm-based server chip, Graviton, a rival to x86 CPUs from giants like AMD and Intel.

“Probably high single-digit to maybe 10% of total server sales are Arm, and a good chunk of those are going to be Amazon. So on the CPU side, they’ve done quite well,” said Stacy Rasgon, senior analyst at Bernstein Research.

Also in 2018, Amazon launched its AI-focused chips. That came two years after Google announced its first Tensor Processing Unit, or TPU. Microsoft has yet to announce the Athena AI chip it’s been working on, reportedly in partnership with AMD.

CNBC got a behind-the-scenes tour of Amazon’s chip lab in Austin, Texas, where Trainium and Inferentia are developed and tested. VP of product Matt Wood explained what both chips are for.

“Machine learning breaks down into these two different stages. So you train the machine learning models and then you run inference against those trained models,” Wood said. “Trainium provides about 50% improvement in terms of price performance relative to any other way of training machine learning models on AWS.”

Trainium first came on the market in 2021, following the 2019 release of Inferentia, which is now on its second generation.

Inferentia allows customers “to deliver very, very low-cost, high-throughput, low-latency, machine learning inference, which is all the predictions of when you type in a prompt into your generative AI model, that’s where all that gets processed to give you the response,” Wood said.

For now, however, Nvidia’s GPUs are still king when it comes to training models. In July, AWS launched new AI acceleration hardware powered by Nvidia H100s. 

“Nvidia chips have a massive software ecosystem that’s been built up around them over the last like 15 years that nobody else has,” Rasgon said. “The big winner from AI right now is Nvidia.”

Amazon’s custom chips, from left to right, Inferentia, Trainium and Graviton are shown at Amazon’s Seattle headquarters on July 13, 2023.

Joseph Huerta

Leveraging cloud dominance

AWS’ cloud dominance, however, is a big differentiator for Amazon.

“Amazon does not need to win headlines. Amazon already has a really strong cloud install base. All they need to do is to figure out how to enable their existing customers to expand into value creation motions using generative AI,” Dekate said.

When choosing between Amazon, Google, and Microsoft for generative AI, there are millions of AWS customers who may be drawn to Amazon because they’re already familiar with it, running other applications and storing their data there.

“It’s a question of velocity. How quickly can these companies move to develop these generative AI applications is driven by starting first on the data they have in AWS and using compute and machine learning tools that we provide,” explained Mai-Lan Tomsen Bukovec, VP of technology at AWS.

AWS is the world’s biggest cloud computing provider, with 40% of the market share in 2022, according to technology industry researcher Gartner. Although operating income has been down year-over-year for three quarters in a row, AWS still accounted for 70% of Amazon’s overall $7.7 billion operating profit in the second quarter. AWS’ operating margins have historically been far wider than those at Google Cloud.

AWS also has a growing portfolio of developer tools focused on generative AI.

“Let’s rewind the clock even before ChatGPT. It’s not like after that happened, suddenly we hurried and came up with a plan because you can’t engineer a chip in that quick a time, let alone you can’t build a Bedrock service in a matter of 2 to 3 months,” said Swami Sivasubramanian, AWS’ VP of database, analytics and machine learning.

Bedrock gives AWS customers access to large language models made by Anthropic, Stability AI, AI21 Labs and Amazon’s own Titan.

“We don’t believe that one model is going to rule the world, and we want our customers to have the state-of-the-art models from multiple providers because they are going to pick the right tool for the right job,” Sivasubramanian said.

An Amazon employee works on custom AI chips, in a jacket branded with AWS’ chip Inferentia, at the AWS chip lab in Austin, Texas, on July 25, 2023.

Katie Tarasov

One of Amazon’s newest AI offerings is AWS HealthScribe, a service unveiled in July to help doctors draft patient visit summaries using generative AI. Amazon also has SageMaker, a machine learning hub that offers algorithms, models and more. 

Another big tool is coding companion CodeWhisperer, which Amazon said has enabled developers to complete tasks 57% faster on average. Last year, Microsoft also reported productivity boosts from its coding companion, GitHub Copilot. 

In June, AWS announced a $100 million generative AI innovation “center.” 

“We have so many customers who are saying, ‘I want to do generative AI,’ but they don’t necessarily know what that means for them in the context of their own businesses. And so we’re going to bring in solutions architects and engineers and strategists and data scientists to work with them one on one,” AWS CEO Selipsky said.

Although so far AWS has focused largely on tools instead of building a competitor to ChatGPT, a recently leaked internal email shows Amazon CEO Andy Jassy is directly overseeing a new central team building out expansive large language models, too.

In the second-quarter earnings call, Jassy said a “very significant amount” of AWS business is now driven by AI and more than 20 machine learning services it offers. Some examples of customers include Philips, 3M, Old Mutual and HSBC. 

The explosive growth in AI has come with a flurry of security concerns from companies worried that employees are putting proprietary information into the training data used by public large language models.

“I can’t tell you how many Fortune 500 companies I’ve talked to who have banned ChatGPT. So with our approach to generative AI and our Bedrock service, anything you do, any model you use through Bedrock will be in your own isolated virtual private cloud environment. It’ll be encrypted, it’ll have the same AWS access controls,” Selipsky said.

For now, Amazon is only accelerating its push into generative AI, telling CNBC that “over 100,000” customers are using machine learning on AWS today. Although that’s a small percentage of AWS’s millions of customers, analysts say that could change.

“What we are not seeing is enterprises saying, ‘Oh, wait a minute, Microsoft is so ahead in generative AI, let’s just go out and let’s switch our infrastructure strategies, migrate everything to Microsoft,’” Dekate said. “If you’re already an Amazon customer, chances are you’re likely going to explore Amazon ecosystems quite extensively.”

— CNBC’s Jordan Novet contributed to this report.

CORRECTION: This article has been updated to reflect Inferentia as the chip used for machine learning inference.


How Elon Musk’s plan to slash government agencies and regulation may benefit his empire

Elon Musk’s business empire is sprawling. It includes electric vehicle maker Tesla, social media company X, artificial intelligence startup xAI, computer interface company Neuralink, tunneling venture Boring Company and aerospace firm SpaceX. 

Some of his ventures already benefit tremendously from federal contracts. SpaceX has received more than $19 billion from contracts with the federal government, according to research from FedScout. Under a second Trump presidency, more lucrative contracts could come its way. SpaceX is on track to take in billions of dollars annually from prime contracts with the federal government for years to come, according to FedScout CEO Geoff Orazem.

Musk, who has frequently blamed the government for stifling innovation, could also push for less regulation of his businesses. Earlier this month, Musk and former Republican presidential candidate Vivek Ramaswamy were tapped by Trump to lead a government efficiency group called the Department of Government Efficiency, or DOGE.

In a recent commentary piece in the Wall Street Journal, Musk and Ramaswamy wrote that DOGE will “pursue three major kinds of reform: regulatory rescissions, administrative reductions and cost savings.” They went on to say that many existing federal regulations were never passed by Congress and should therefore be nullified, which President-elect Trump could accomplish through executive action. Musk and Ramaswamy also championed the large-scale auditing of agencies, calling out the Pentagon for failing its seventh consecutive audit. 

“The number one way Elon Musk and his companies would benefit from a Trump administration is through deregulation and defanging, you know, giving fewer resources to federal agencies tasked with oversight of him and his businesses,” says CNBC technology reporter Lora Kolodny.

To learn how else Elon Musk and his companies may benefit from having the ear of the president-elect, watch the video.


Why X’s new terms of service are driving some users to leave Elon Musk’s platform

Elon Musk attends the America First Policy Institute gala at Mar-A-Lago in Palm Beach, Florida, Nov. 14, 2024.

Carlos Barria | Reuters

X’s new terms of service, which took effect Nov. 15, are driving some users off Elon Musk’s microblogging platform. 

The new terms include expansive permissions requiring users to allow the company to use their data to train X’s artificial intelligence models while also making users liable for as much as $15,000 in damages if they use the platform too much. 

The terms are prompting some longtime users of the service, both celebrities and everyday people, to post that they are taking their content to other platforms. 

“With the recent and upcoming changes to the terms of service — and the return of volatile figures — I find myself at a crossroads, facing a direction I can no longer fully support,” actress Gabrielle Union posted on X the same day the new terms took effect, while announcing she would be leaving the platform.

“I’m going to start winding down my Twitter account,” a user with the handle @mplsFietser said in a post. “The changes to the terms of service are the final nail in the coffin for me.”

It’s unclear just how many users have left X due specifically to the company’s new terms of service, but since the start of November, many social media users have flocked to Bluesky, a microblogging startup whose origins stem from Twitter, the former name for X. Some users with new Bluesky accounts have posted that they moved to the service due to Musk and his support for President-elect Donald Trump.

Bluesky’s U.S. mobile app downloads have skyrocketed 651% since the start of November, according to estimates from Sensor Tower. In the same period, X and Meta’s Threads are up 20% and 42%, respectively. 

X and Threads have much larger monthly user bases. Although Musk said in May that X has 600 million monthly users, market intelligence firm Sensor Tower estimates X had 318 million monthly users as of October. That same month, Meta said Threads had nearly 275 million monthly users. Bluesky told CNBC on Thursday it had reached 21 million total users this week.

Here are some of the noteworthy changes in X’s new service terms and how they compare with those of rivals Bluesky and Threads.

Artificial intelligence training

X has come under heightened scrutiny because of its new terms, which say that any content on the service can be used royalty-free to train the company’s artificial intelligence large language models, including its Grok chatbot.

“You agree that this license includes the right for us to (i) provide, promote, and improve the Services, including, for example, for use with and training of our machine learning and artificial intelligence models, whether generative or another type,” X’s terms say.

Additionally, any “user interactions, inputs and results” shared with Grok can be used for what it calls “training and fine-tuning purposes,” according to the Grok section of the X app and website. This specific function, though, can be turned off manually. 

X’s terms do not specify whether users’ private messages can be used to train its AI models, and the company did not respond to a request for comment.

“You should only provide Content that you are comfortable sharing with others,” read a portion of X’s terms of service agreement.

Though X’s new terms may be expansive, Meta’s policies aren’t that different. 

The maker of Threads uses “information shared on Meta’s Products and services” to get its training data, according to the company’s Privacy Center. This includes “posts or photos and their captions.” There is also no direct way for users outside of the European Union to opt out of Meta’s AI training. Meta keeps training data “for as long as we need it on a case-by-case basis to ensure an AI model is operating appropriately, safely and efficiently,” according to its Privacy Center. 

Under Meta’s policy, private messages with friends or family aren’t used to train AI unless one of the users in a chat chooses to share it with the models, which can include Meta AI and AI Studio.

Bluesky, which has seen a user growth surge since Election Day, doesn’t do any generative AI training. 

“We do not use any of your content to train generative AI, and have no intention of doing so,” Bluesky said in a post on its platform Friday, confirming the same to CNBC as well.

Liquidated damages


The Pentagon’s battle inside the U.S. for control of a new Cyber Force

A recent Chinese cyber-espionage attack inside the nation’s major telecom networks that may have reached as high as the communications of President-elect Donald Trump and Vice President-elect J.D. Vance was designated this week by one U.S. senator as “far and away the most serious telecom hack in our history.”

The U.S. has yet to figure out the full scope of what China accomplished, and whether or not its spies are still inside U.S. communication networks.

“The barn door is still wide open, or mostly open,” Senator Mark Warner of Virginia, chairman of the Senate Intelligence Committee, told the New York Times on Thursday.

The revelations highlight the rising cyberthreats tied to geopolitics and nation-state actor rivals of the U.S., but inside the federal government, there’s disagreement on how to fight back, with some advocates calling for the creation of an independent federal U.S. Cyber Force. In September, the Department of Defense formally appealed to Congress, urging lawmakers to reject that approach.

Among the most prominent voices advocating for the new branch is the Foundation for Defense of Democracies, a national security think tank, but the issue extends far beyond any single group. In June, defense committees in both the House and Senate approved measures calling for independent evaluations of the feasibility of creating a separate cyber branch, as part of the annual defense policy deliberations.

Drawing on insights from more than 75 active-duty and retired military officers experienced in cyber operations, the FDD’s 40-page report highlights what it says are chronic structural issues within the U.S. Cyber Command (CYBERCOM), including fragmented recruitment and training practices across the Army, Navy, Air Force, and Marines.

“America’s cyber force generation system is clearly broken,” the FDD wrote, citing comments made in 2023 by then-leader of U.S. Cyber Command, Army General Paul Nakasone, who took over the role in 2018 and described current U.S. military cyber organization as unsustainable: “All options are on the table, except the status quo,” Nakasone had said.

Concern with Congress and a changing White House

The FDD analysis points to “deep concerns” that have existed within Congress for a decade, among members of both parties, about the military being able to staff up to successfully defend cyberspace. Talent shortages, inconsistent training, and misaligned missions are undermining CYBERCOM’s capacity to respond effectively to complex cyber threats, it says. Creating a dedicated branch, proponents argue, would better position the U.S. in cyberspace. The Pentagon, however, warns that such a move could disrupt coordination, increase fragmentation, and ultimately weaken U.S. cyber readiness.

As the Pentagon doubles down on its resistance to establishment of a separate U.S. Cyber Force, the incoming Trump administration could play a significant role in shaping whether America leans toward a centralized cyber strategy or reinforces the current integrated framework that emphasizes cross-branch coordination.

Trump is known for his assertive national security measures, and his 2018 National Cyber Strategy emphasized embedding cyber capabilities across all elements of national power and focusing on cross-departmental coordination and public-private partnerships rather than creating a standalone cyber entity. At that time, the Trump administration emphasized centralizing civilian cybersecurity efforts under the Department of Homeland Security while tasking the Department of Defense with addressing more complex, defense-specific cyber threats. Trump’s pick for Secretary of Homeland Security, South Dakota Governor Kristi Noem, has talked up her, and her state’s, focus on cybersecurity.

Former Trump officials believe that a second Trump administration will take an aggressive stance on national security, fill gaps at the Energy Department, and reduce regulatory burdens on the private sector. They anticipate a stronger focus on offensive cyber operations, tailored threat vulnerability protection, and greater coordination between state and local governments. Changes will be coming at the top of the Cybersecurity and Infrastructure Security Agency, which was created during Trump’s first term and where current director Jen Easterly has announced she will leave once Trump is inaugurated.

Cyber Command 2.0 and the U.S. military

John Cohen, executive director of the Program for Countering Hybrid Threats at the Center for Internet Security, is among those who share the Pentagon’s concerns. “We can no longer afford to operate in stovepipes,” Cohen said, warning that a separate cyber branch could worsen existing silos and further isolate cyber operations from other critical military efforts.

Cohen emphasized that adversaries like China and Russia employ cyber tactics as part of broader, integrated strategies that include economic, physical, and psychological components. To counter such threats, he argued, the U.S. needs a cohesive approach across its military branches. “Confronting that requires our military to adapt to the changing battlespace in a consistent way,” he said.

In 2018, CYBERCOM certified its Cyber Mission Force teams as fully staffed, but concerns have been expressed by the FDD and others that personnel were shifted between teams to meet staffing goals, a move they say masked deeper structural problems. Nakasone has called for a CYBERCOM 2.0, saying in comments early this year, “How do we think about training differently? How do we think about personnel differently?” and adding that a major issue has been the approach to military staffing within the command.

Austin Berglas, a former head of the FBI’s cyber program in New York who worked on consolidation efforts inside the Bureau, believes a separate cyber force could enhance U.S. capabilities by centralizing resources and priorities. “When I first took over the [FBI] cyber program … the assets were scattered,” said Berglas, who is now the global head of professional services at supply chain cyber defense company BlueVoyant. Centralization brought focus and efficiency to the FBI’s cyber efforts, he said, and it’s a model he believes would benefit the military’s cyber efforts as well. “Cyber is a different beast,” Berglas said, emphasizing the need for specialized training, advancement, and resource allocation that isn’t diluted by competing military priorities.

Berglas also pointed to the ongoing “cyber arms race” with adversaries like China, Russia, Iran, and North Korea. He warned that without a dedicated force, the U.S. risks falling behind as these nations expand their offensive cyber capabilities and exploit vulnerabilities across critical infrastructure.

Nakasone said in his comments earlier this year that a lot has changed since 2013 when U.S. Cyber Command began building out its Cyber Mission Force to combat issues like counterterrorism and financial cybercrime coming from Iran. “Completely different world in which we live in today,” he said, citing the threats from China and Russia.

Brandon Wales, a former executive director of CISA, said there is a need to bolster U.S. cyber capabilities, but he cautions against major structural changes during a period of heightened global threats.

“A reorganization of this scale is obviously going to be disruptive and will take time,” said Wales, who is now vice president of cybersecurity strategy at SentinelOne.

He cited China’s preparations for a potential conflict over Taiwan as a reason the U.S. military needs to maintain readiness. Rather than creating a new branch, Wales supports initiatives like Cyber Command 2.0 and its aim to enhance coordination and capabilities within the existing structure. “Large reorganizations should always be the last resort because of how disruptive they are,” he said.

Wales says it’s important to ensure any structural changes do not undermine integration across military branches and recognize that coordination across existing branches is critical to addressing the complex, multidomain threats posed by U.S. adversaries. “You should not always assume that centralization solves all of your problems,” he said. “We need to enhance our capabilities, both defensively and offensively. This isn’t about one solution; it’s about ensuring we can quickly see, stop, disrupt, and prevent threats from hitting our critical infrastructure and systems,” he added.
