During last week’s chatbot hype, with Microsoft and Google attempting to outduel each other in showcasing early versions of artificial intelligence-powered search, more than 1 million people signed up to try Microsoft’s tool in the first 48 hours, the company said.
Microsoft CEO Satya Nadella told CNBC that the technology, which can spit out complete answers that read like they were written by a human, was “perhaps the industrial revolution brought to knowledge work.”
But for those concerned about accuracy, the AI leaves plenty to be desired.
In Microsoft’s demo in front of reporters, the ChatGPT-like technology embedded in the company’s Bing search engine analyzed earnings reports from Gap and Lululemon. A comparison of its answers with the actual reports shows that the chatbot missed some numbers. Others appear to have been made up.
“Bing AI got some answers completely wrong during their demo. But no one noticed,” wrote independent search researcher Dmitri Brereton in a Substack post on Monday. “Instead, everyone jumped on the Bing hype train.”
Brereton identified possible factual issues in the Microsoft demo in its responses about vacuum cleaner specifications and travel plans to Mexico in addition to the financial errors. He told CNBC he wasn’t initially looking for errors, and only discovered them when he looked more closely to write a comparison of the AI unveilings from Microsoft and Google.
AI experts call the phenomenon “hallucination,” or the propensity of tools based on large language models to simply make stuff up. Last week, Google introduced a competing AI tool that also included factual errors — although the mistakes were quickly called out by viewers.
Both companies are rushing to incorporate new kinds of generative AI into search engines and are eager to show their advancements following the explosion of ChatGPT, which OpenAI introduced to the public in November. OpenAI has raised billions from Microsoft, while competing startups like Stability AI and Hugging Face also have ballooned to billion-dollar valuations in private funding rounds.
While Google has been reluctant to add AI-generated responses into search engines, citing reputational risk and safety concerns, Microsoft, in its announcement last week, stressed the short-term potential of releasing the technology to some of the public.
“I think it’s important not to be in a lab,” Nadella said. “You have to get these things out safely.”
When it came time to demo Bing AI’s response to a query on corporate earnings, there were some problems.
Yusuf Mehdi, a marketing executive at Microsoft, navigated to Gap’s investor relations site, and asked the Bing AI to summarize the “key takeaways” from the retailer’s third-quarter earnings release in November.
“Very cool. A massive time savings,” Mehdi said.
These are screen shots from Microsoft’s demo:
Kif Leswing/CNBC
Here are some mistakes in the summary:
Gap’s reported gross margin was 37.4%. But after excluding charges related to Yeezy, the adjusted gross margin was 38.7%.
Gap’s operating margin was 4.6%, not 5.9%, a number that can’t be found in the company’s report.
Adjusted diluted earnings per share was $0.71, not $0.42, a number that’s not in the report. The figure Gap reported included an adjusted income tax benefit of about $0.33.
Gap pulled its full-year outlook in August and said in the third-quarter report that “net sales could be down mid-single digits year-over-year in the fourth quarter.” That would imply a decline in revenue for the full year as opposed to “growth in the low double digits.” There is no forecast for operating margin or EPS.
Microsoft said it knows about the errors and that it expects Bing AI to make mistakes.
“We’re aware of this report and have analyzed its findings in our efforts to improve this experience,” a Microsoft spokesperson told CNBC. “We recognize that there is still work to be done and are expecting that the system may make mistakes during this preview period, which is why the feedback is critical so we can learn and help the models get better.”
Microsoft then asked Bing AI to compare Gap’s earnings with Lululemon’s report. Mehdi wanted Bing to pull the information from the two reports into a table.
“Look how amazing this is,” he said. “Just like that, in one table, I can get an answer to this question. Think how much time that would’ve taken otherwise.”
Here’s what the Bing AI tool returned:
Kif Leswing/CNBC
There are several errors in the table, starting with margins.
Lululemon’s gross margin was 55.9%, not 58.7%.
The company’s operating margin was 19%, not 20.7%.
Lululemon reported diluted EPS of $2, and adjusted EPS of $1.62. Bing showed a diluted EPS number of $1.65.
Gap had $679 million in cash and cash equivalents, not $1.4 billion.
Gap had $3.04 billion in inventory, not $1.9 billion.
In this photo illustration, a man holds a smartphone displaying the logo of US artificial intelligence company Cognition AI Inc. in front of a website.
Timon Schneider | SOPA Images | Sipa USA | AP
Artificial intelligence startup Cognition announced it’s acquiring Windsurf, the AI coding company that lost its CEO and several other senior employees to Google just days earlier.
Cognition said on Monday that it will purchase Windsurf’s intellectual property, product, trademark, brand and talent, but didn’t disclose terms of the deal. It’s the latest development in an AI talent war, as companies like Meta, Google and OpenAI fiercely compete for top engineers and researchers.
OpenAI had been in talks to acquire Windsurf for about $3 billion in April, but the deal fell apart, and Google said on Friday that it hired Windsurf’s co-founder and CEO Varun Mohan. Google is paying $2.4 billion in licensing fees and for compensation, as CNBC previously reported.
“Every new employee of Cognition will be treated the same way as existing employees: with transparency, fairness, and deep respect for their abilities and value,” Cognition CEO Scott Wu wrote in a memo to employees on Monday. “After today, our efforts will be as a united and aligned team. There’s only one boat and we’re all in it together.”
Cognition didn’t immediately respond to CNBC’s request for comment. Windsurf directed CNBC to Cognition.
Cognition is best known for its AI coding agent named Devin, which is designed to help engineers build software faster. As of March, the startup had raised hundreds of millions of dollars at a valuation of close to $4 billion, according to a report from Bloomberg.
Both companies are backed by Peter Thiel’s Founders Fund. Other investors in Windsurf include Greenoaks, Kleiner Perkins and General Catalyst.
“I’m overwhelmed with excitement and optimism, but most of all, gratitude,” Jeff Wang, the interim CEO of Windsurf, wrote in a post on X on Monday. “Trying times reveal character, and I couldn’t be prouder of how every single person at Windsurf showed up these last three days for each other and for our users.”
Wu said that the acquisition ensures all Windsurf employees are “treated with respect and well taken care of in this transaction.” All employees will participate financially in the deal, have vesting cliffs waived for their work to date and receive fully accelerated vesting, according to the memo.
“There’s never been a more exciting time to build,” Wu wrote.
The Grok logo is displayed on a smartphone with xAI visible in the background in this photo illustration on April 1, 2024.
Jonathan Raa | Nurphoto | Getty Images
The European Union on Monday called in representatives from Elon Musk’s xAI after the company’s social network X, and chatbot Grok, generated and spread anti-semitic hate speech, including praise for Adolf Hitler, last week.
A spokesperson for the European Commission told CNBC via e-mail that a technical meeting will take place on Tuesday.
xAI did not immediately respond to a request for comment.
Sandro Gozi, a member of Italy’s parliament and member of the Renew Europe group, last week urged the Commission to hold a formal inquiry.
“The case raises serious concerns about compliance with the Digital Services Act (DSA) as well as the governance of generative AI in the Union’s digital space,” Gozi wrote.
X was already under a Commission probe for possible violations of the DSA.
Grok also generated and spread offensive posts about political leaders in Poland and Turkey, including Polish Prime Minister Donald Tusk and Turkish President Recep Tayyip Erdogan.
Over the weekend, xAI posted a statement apologizing for the hateful content.
“First off, we deeply apologize for the horrific behavior that many experienced. … After careful investigation, we discovered the root cause was an update to a code path upstream of the @grok bot,” the company said in the statement.
Musk and his xAI team launched a new version of Grok Wednesday night amid the backlash. Musk called it “the smartest AI in the world.”
xAI works with other businesses run and largely owned by Musk, including Tesla, the publicly traded automaker, and SpaceX, the U.S. aerospace and defense contractor.
Despite Grok’s recent outburst of hate speech, the U.S. Department of Defense awarded xAI a $200 million contract to develop AI. Anthropic, Google and OpenAI also received AI contracts.
Meta CEO Mark Zuckerberg looks on before the luncheon on the inauguration day of U.S. President Donald Trump’s second presidential term in Washington on Jan. 20, 2025.
Evelyn Hockstein | Reuters
Meta on Monday said it has removed about 10 million profiles for impersonating large content producers through the first half of 2025 as part of an effort by the company to combat “spammy content.”
The crackdown is part of Meta’s broader effort to make the Facebook feed more relevant and authentic by taking action against and removing accounts that engage in “spammy” behavior, such as content created using artificial intelligence tools.
As part of that initiative, Meta is also rolling out stricter measures to promote original posts from creators, the company said in a blog post.
Facebook also took action against approximately 500,000 accounts that it identified to be engaged in inauthentic behavior and spam. These actions included demoting comments and reducing distribution of content, which are intended to make it harder for these accounts to monetize their posts.
Meta defines unoriginal content as images or videos that are reused without crediting the original creator. The company said it now has technology that will detect duplicate videos and reduce the distribution of that content.
The action against spam and inauthentic content comes as Meta increases its investment in AI, with CEO Mark Zuckerberg on Monday announcing plans to spend “hundreds of billions of dollars” on AI compute infrastructure to bring the company’s first supercluster online next year.
This push comes at a time when AI is making it easier to mass-produce content across social media platforms. Other platforms are also taking action to combat the increase of spammy, low-quality content, also known as “AI slop.”
Google’s YouTube announced a policy change this month that makes mass-produced or repetitive content ineligible to earn revenue.
This announcement sparked confusion on social media, with many users believing it was a reversal of YouTube’s stance on AI content. However, YouTube clarified that the policy change is aimed at curbing unoriginal, spammy and repetitive videos.
“We welcome creators using AI tools to enhance their storytelling, and channels that use AI in their content remain eligible to monetize,” said a spokesperson for YouTube in a blog post to clarify the new policy.
YouTube’s new policy change will take effect on Tuesday.