Mark Zuckerberg arrives ahead of the inauguration of Donald Trump as the 47th president of the United States, held inside the Capitol Rotunda of the U.S. Capitol building in Washington, D.C., on Monday, Jan. 20, 2025.

Kenny Holston | Via Reuters

Mark Zuckerberg is so frustrated with Meta’s standing in artificial intelligence that he’s willing to spend billions of dollars to convince Scale AI CEO Alexandr Wang to join his company, people familiar with the matter told CNBC. 

Meta is finalizing a deal to invest $14 billion into Scale AI, according to a person familiar with the matter who asked not to be named because the terms are confidential. Bloomberg reported earlier this week that an investment could top $10 billion, and a story from The Information on Tuesday said Meta would pay close to $15 billion.

As a founder of one of the most prominent AI startups, Wang has built a reputation as an ambitious leader who both understands AI’s technical complexities and how to build a business that’s not merely focused on research, according to two former Meta AI employees who agreed to speak on the condition of anonymity. Zuckerberg will be counting on Wang to better execute Meta’s AI ambitions following the lukewarm launch of the company’s latest Llama AI models.

By not directly acquiring Scale AI, Meta appears to be taking a similar strategy as companies like Google and Microsoft, which have brought in prominent leaders in AI from the startups Character.AI and Inflection AI by taking large stakes in those companies rather than buying them outright. Meta is currently on trial against the Federal Trade Commission for antitrust claims, and the company doesn’t want to further upset regulators by acquiring Scale AI, multiple people familiar with the matter said.

As part of the deal, Meta will take a 49% stake in the data-labelling and annotation startup, The Information reported, while Wang will help lead a new AI research lab at the social networking company and will be joined by some of his colleagues. The New York Times was first to report about the new AI lab.

Alexandr Wang, CEO of ScaleAI speaks on CNBC’s Squawk Box outside the World Economic Forum in Davos, Switzerland on Jan. 23, 2025.

Gerry Miller | CNBC

Scale AI, founded in 2016, has made a splash in the era of generative AI by helping major tech companies like OpenAI, Google and Microsoft prepare data they use to train cutting-edge AI models. Meta is one of Scale AI’s biggest customers, according to two people familiar with the matter.

The startup, valued in a funding round about a year ago at $14 billion, is number 28 on CNBC’s Disruptor 50 list. In mid-2024, the company signed one of the biggest recent commercial leases in San Francisco, gobbling up about 180,000 square feet of space in a downtown building that had been occupied by Airbnb.

Scale AI has increasingly made inroads into the defense industry, and in March announced a multimillion-dollar deal with the Department of Defense. In November, it collaborated with Meta on Defense Llama, a custom version of Meta’s open-source Llama foundation model designed specifically to “support American national security missions,” the company said in a blog post.

Meta and Scale AI declined to comment.

Meta’s AI challenges

Heading into 2025, AI was one of Meta’s top priorities. But Zuckerberg has grown agitated that rivals like OpenAI appear to be ahead in both underlying AI models and consumer-facing apps, current and former Meta employees said.

Zuckerberg has been deprioritizing Meta’s Fundamental Artificial Intelligence Research unit, or FAIR, in favor of the more product-oriented GenAI team to help Meta make headway in AI and improve its Llama family of AI models, CNBC previously reported.

Meta’s release of its Llama 4 AI models in April was not well received by developers, further frustrating Zuckerberg, the people said. At the time, Meta only released two smaller versions of Llama 4 and said it would eventually release a bigger and more powerful “Behemoth” model. 

That model has yet to be made available due to Zuckerberg’s concerns about its capabilities relative to competing models, the people said. In particular, there is concern about how Behemoth stacks up against the latest from companies like OpenAI and China’s DeepSeek, whose models are preferred by the wider developer community.

Following Llama 4’s lackluster debut, Meta conducted a reorganization of its GenAI unit, splitting it into two. Connor Hayes, a longstanding Meta employee, was put in charge of AI Products, while AGI Foundations was given to Amir Frenkel, previously a vice president of engineering and product for Meta’s Reality Labs hardware unit, and Ahmad Al-Dahle, the previous head of GenAI. 

Al-Dahle’s new position as a co-leader was seen as a sign that Zuckerberg had lost confidence in him, the people said.

Ahmad Al-Dahle, VP and Head of GenAI at Meta.

Courtesy: Meta

Zuckerberg admires Wang and considers him capable of a major role at Meta as an AI leader, the people said. A dropout from the Massachusetts Institute of Technology, Wang has built a sizable business and is familiar with AI’s technical intricacies. The people described Wang as a “wartime CEO” who is in line with Zuckerberg’s position that the U.S. faces increasing competition from China, thus requiring help from the tech industry.

Wang told CNBC in January that he believes there is an “AI war” between the U.S. and China, and that the U.S. will need more computing power in order to compete.

“The United States is going to need a huge amount of computational capacity, a huge amount of infrastructure,” Wang said at the time. “We need to unleash U.S. energy to enable this AI boom.”

It’s an unusual move for Zuckerberg, who has traditionally put loyalists in high-ranking positions. But it shows the magnitude of the moment and Zuckerberg’s belief that a prominent outsider like Wang may be better positioned than any current Meta employee to bolster the company’s position in AI, the people said.

Wang also brings extensive outside knowledge of how competitors like OpenAI are building their consumer chatbots and AI models. Data labelling and training have become more complicated in recent years as the capabilities of AI models have increased, said Vahan Petrosyan, the CEO of SuperAnnotate, one of Scale AI’s competitors.

“I would say Scale have covered probably 70% of all the models that are built,” Petrosyan said. With Wang and others from Scale AI, Meta could gain “collective intelligence on how to build a better ChatGPT.”

“When Meta is buying them, they’re buying their intelligence,” Petrosyan said. 


China’s DeepSeek launches next-gen AI model. Here’s what makes it different


Chinese startup DeepSeek’s latest experimental model promises to increase efficiency and improve AI’s ability to handle a lot of information at a fraction of the cost, but questions remain over how effective and safe the architecture is.  

DeepSeek sent Silicon Valley into a frenzy when it launched its first model, R1, out of nowhere last year, showing that it’s possible to train large language models (LLMs) quickly, on less powerful chips, using fewer resources.

The company released DeepSeek-V3.2-Exp on Monday, an experimental version of its current model DeepSeek-V3.1-Terminus, which builds further on its mission to increase efficiency in AI systems, according to a post on the AI forum Hugging Face.

“DeepSeek V3.2 continues the focus on efficiency, cost reduction, and open-source sharing,” Adina Yakefu, Chinese community lead at Hugging Face, told CNBC. “The big improvement is a new feature called DSA (DeepSeek Sparse Attention), which makes the AI better at handling long documents and conversations. It also cuts the cost of running the AI in half compared to the previous version.”

“It’s significant because it should make the model faster and more cost-effective to use without a noticeable drop in performance,” said Nick Patience, vice president and practice lead for AI at The Futurum Group. “This makes powerful AI more accessible to developers, researchers, and smaller companies, potentially leading to a wave of new and innovative applications.”

The pros and cons of sparse attention 

An AI model makes decisions based on its training data and new information, such as a prompt. Say an airline wants to find the best route from A to B: while there are many options, not all are feasible. By filtering out the less viable routes, you dramatically reduce the amount of time, fuel and, ultimately, money needed to make the journey. That is exactly what sparse attention does: it factors in only the data it thinks is important given the task at hand, whereas models thus far have crunched all of the data available to them.

“So basically, you cut out things that you think are not important,” said Ekaterina Almasque, the cofounder and managing partner of new venture capital fund BlankPage Capital.
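The filtering Almasque describes can be sketched, in highly simplified form, as top-k sparse attention, where each query keeps only its highest-scoring keys and ignores the rest. This is an illustrative sketch of the general idea, not DeepSeek’s actual DSA mechanism, whose design details differ:

```python
import numpy as np

def sparse_attention(q, k, v, top_k=4):
    """Illustrative top-k sparse attention: each query attends only to
    its top_k highest-scoring keys instead of every key."""
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (n_q, n_k) similarity scores
    # Find each row's top_k-th largest score and mask out everything below it
    kth = np.partition(scores, -top_k, axis=-1)[:, -top_k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    # Softmax over the surviving scores; masked entries get zero weight
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n, d = 8, 16
q, k, v = rng.normal(size=(n, d)), rng.normal(size=(n, d)), rng.normal(size=(n, d))
out = sparse_attention(q, k, v, top_k=4)
print(out.shape)  # (8, 16)
```

Because each query row ends up with only `top_k` nonzero attention weights, the work per query no longer grows with every token in the context, which is the efficiency gain, and the discarded entries are the “lost nuances” the critics worry about.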

Sparse attention is a boon for efficiency and for scaling AI with fewer resources, but one concern is that it could make models less reliable, given the lack of oversight into how and why they discount information.

“The reality is, they [sparse attention models] have lost a lot of nuances,” said Almasque, who was an early supporter of Dataiku and Darktrace, and an investor in Graphcore. “And then the real question is, did they have the right mechanism to exclude not important data, or is there a mechanism excluding really important data, and then the outcome will be much less relevant?”

This could be particularly problematic for AI safety and inclusivity, the investor noted, adding that it may not be “the optimal one or the safest” AI model to use compared with competitors or traditional architectures. 

DeepSeek, however, says the experimental model performs on par with its V3.1-Terminus. Despite speculation of a bubble forming, AI remains at the centre of geopolitical competition, with the U.S. and China vying for the winning spot. Yakefu noted that DeepSeek’s models work “right out of the box” with Chinese-made AI chips, such as Ascend and Cambricon, meaning they can run locally on domestic hardware without any extra setup.


DeepSeek also shared the actual programming code and tools needed to use the experimental model, she said. “This means other people can learn from it and build their own improvements.”

But for Almasque, the very nature of this means the tech may not be defensible. “The approach is not super new,” she said, noting the industry has been “talking about sparse models since 2015” and that DeepSeek is not able to patent its technology due to being open source. DeepSeek’s competitive edge, therefore, must lie in how it decides what information to include, she added.

The company itself acknowledges V3.2-Exp is an “intermediate step toward our next-generation architecture,” per the Hugging Face post.

As Patience pointed out, “this is DeepSeek’s value prop all over: efficiency is becoming as important as raw power.”

“DeepSeek is playing the long game to keep the community invested in their progress,” Yakefu added. “People will always go for what is cheap, reliable, and effective.”


U.S. Commerce head Lutnick wants Taiwan to help America make 50% of its chips locally


A logo of the Taiwan Semiconductor Manufacturing Company (TSMC) displayed on a smartphone screen

Vcg | Visual China Group | Getty Images

The Trump administration is pushing Taipei to shift investment and chip production to the U.S. so that half of America’s chips are manufactured domestically, in a move that could have implications for Taiwan’s national defense. 

Washington has held discussions with Taipei about the “50-50” split in semiconductor production, which would significantly reduce American dependence on Taiwan, U.S. Secretary of Commerce Howard Lutnick told News Nation in an interview released over the weekend. 

Taiwan is said to produce over 90% of the world’s advanced semiconductors, which, according to Lutnick, is cause for concern due to the island nation’s distance from the U.S. and proximity to China. 

“My objective, and this administration’s objective, is to get chip manufacturing significantly onshored — we need to make our own chips,” Lutnick said. “The idea that I pitched [Taiwan] was, let’s get to 50-50. We’re producing half, and you’re producing half.” 

Lutnick’s goal is to reach about 40% domestic semiconductor production by the end of U.S. President Donald Trump’s current term, which would take north of $500 billion in local investments, he said.

Taiwan’s dominance in chip production is thanks to Taiwan Semiconductor Manufacturing Co., the world’s largest and most advanced contract chipmaker, which handles production for American tech heavyweights like Nvidia and Apple.

Taiwan’s critical position in global chip production is believed to have deterred direct military action against the island by China, a notion often referred to as the “Silicon Shield” theory.

However, in his News Nation interview, Lutnick downplayed the “Silicon Shield,” and argued that Taiwan would be safer with more balanced chip production between the U.S. and Taiwan.

“My argument to them was, well, if you have 95% [chip production], how am I going to get it to protect you? You’re going to put it on a plane? You’re going to put it on a boat?” Lutnick said. 

Under the 50-50 plan, the U.S. would still be “fundamentally reliant” on Taiwan, but would have the capacity to “do what we need to do, if we need to do it,” he added.

Beijing views the democratically governed island of Taiwan as its own territory and has vowed to reclaim it by force if necessary. Taipei’s current ruling party has rejected and pushed back against such claims. 

This year, the Chinese military has held a number of large-scale exercises off the coast of Taiwan as it tests its military capabilities. During one of China’s military drills in April, Washington reaffirmed its commitment to supporting Taiwan. 

More in return for defense

Lutnick’s statements in the News Nation interview aligned with past comments from Trump suggesting that the U.S. should get more in return for its defense of the island nation against China.

Last year, then-presidential candidate Trump had said in an interview that Taiwan should pay the U.S. for defense, and accused the country of “stealing” the United States’ chip business. 

The U.S. was once a leader in the global semiconductor market, but has lost market share due to industry shifts and the emergence of Asian juggernauts like TSMC and Samsung.

However, Washington has been working to reverse that trend across multiple administrations. 

TSMC has been building manufacturing facilities in the U.S. since 2020 and has continued to ramp up its investments in the country. It announced intentions to invest an additional $100 billion in March, bringing its total planned investment to $165 billion. 

The Trump administration recently proposed 100% tariffs on semiconductors, but said that companies investing in the U.S. would be exempt. The U.S. and Taiwan also remain in trade negotiations that are likely to impact tariff rates for Taiwanese businesses. 


YouTube agrees to pay Trump $24.5 million to settle lawsuit over suspended account


U.S. President Donald Trump reacts as he arrives at Joint Base Andrews, Maryland, U.S., September 26, 2025.

Elizabeth Frantz | Reuters

YouTube has agreed to pay $24.5 million to settle a lawsuit involving the suspension of President Donald Trump’s account following the U.S. Capitol riots on Jan. 6, 2021.

The settlement “shall not constitute an admission of liability or fault,” on behalf of the defendants or related parties, according to a filing on Monday from the U.S. District Court for the Northern District of California.

Trump sued YouTube, Facebook and Twitter in mid-2021, after the companies suspended his accounts on their platforms over concerns related to the incitement of violence.

Since Trump won a second term in November and returned to the White House in January, the tech companies have been settling their disputes with the president. Facebook-parent Meta said in January that it would pay $25 million to settle its lawsuit with Trump. The following month, Elon Musk’s X, formerly Twitter, agreed to settle its Trump-related case for roughly $10 million.

In August, several Democratic senators, including Elizabeth Warren of Massachusetts, sent a letter to Google CEO Sundar Pichai and YouTube CEO Neal Mohan expressing their concern over a possible settlement with the president.

The senators said in the letter that they worried such an action would be part of a “quid-pro-quo arrangement to avoid full accountability for violating federal competition, consumer protection, and labor laws, circumstances that could result in the company running afoul of federal bribery laws.”
