The New York Times on Wednesday filed a lawsuit against Microsoft and OpenAI, creator of the popular AI chatbot ChatGPT, accusing the companies of copyright infringement and abusing the newspaper’s intellectual property to train large language models.
Microsoft both invests in and supplies OpenAI, providing it with access to the company’s Azure cloud computing technology.
The publisher said in a filing in the U.S. District Court for the Southern District of New York that it seeks to hold Microsoft and OpenAI to account for the “billions of dollars in statutory and actual damages” it believes it is owed for the “unlawful copying and use of The Times’s uniquely valuable works.”
The Times said in an emailed statement that it “recognizes the power and potential of GenAI for the public and for journalism,” but added that journalistic material should be used for commercial gain with permission from the original source.
“These tools were built with and continue to use independent journalism and content that is only available because we and our peers reported, edited, and fact-checked it at high cost and with considerable expertise,” the Times said.
“Settled copyright law protects our journalism and content,” the Times added. “If Microsoft and OpenAI want to use our work for commercial purposes, the law requires that they first obtain our permission. They have not done so.”
“We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models,” an OpenAI representative said in a statement. “Our ongoing conversations with the New York Times have been productive and moving forward constructively, so we are surprised and disappointed with this development. We’re hopeful that we will find a mutually beneficial way to work together, as we are doing with many other publishers.”
A representative for Microsoft didn’t respond to requests for comment.
The Times is represented in the proceedings by Susman Godfrey, the litigation firm that represented Dominion Voting Systems in its defamation suit against Fox News that culminated in a $787.5 million settlement.
Susman Godfrey is also representing author Julian Sancton and other writers in a separate lawsuit against OpenAI and Microsoft that accuses the companies of using copyrighted materials without permission to train several versions of ChatGPT.
‘Mass copyright infringement’
The Times is one of numerous media organizations pursuing compensation from companies behind some of the most advanced artificial intelligence models, for the alleged usage of their content to train AI programs.
OpenAI is the creator of GPT, a large language model that can produce humanlike content in response to user prompts. It uses billions of parameters’ worth of information, which is obtained from public web data up until 2021.
Media publishers and content creators are finding their materials being used and reimagined by generative AI tools like ChatGPT, Dall-E, Midjourney and Stable Diffusion. In numerous cases, the content the programs produce can look similar to the source material.
OpenAI has tried to allay news publishers’ concerns. In December, the company announced a partnership with Axel Springer — the parent company of Business Insider, Politico, and European outlets Bild and Welt — which would license its content to OpenAI in return for a fee.
The financial terms of the deal weren’t disclosed.
In its lawsuit Wednesday, the Times accused Microsoft and OpenAI of creating a business model based on “mass copyright infringement,” stating that the companies’ AI systems were “used to create multiple reproductions of The Times’s intellectual property for the purpose of creating the GPT models that exploit and, in many cases, retain large portions of the copyrightable expression contained in those works.”
Publishers are concerned that, with the advent of generative AI chatbots, fewer people will click through to news sites, resulting in shrinking traffic and revenues.
The Times included numerous examples in the suit of instances where GPT-4 produced altered versions of material published by the newspaper.
In one example, the filing shows OpenAI’s software producing almost identical text to a Times article about predatory lending practices in New York City’s taxi industry.
But in OpenAI’s version, GPT-4 excludes a critical piece of context about the sum of money the city made selling taxi medallions and collecting taxes on private sales.
In its suit, the Times said Microsoft and OpenAI’s GPT models “directly compete with Times content.”
The AI models also limited the Times’ commercial opportunities by altering its content. For example, the publisher alleges GPT outputs remove links to products featured in its Wirecutter app, a product reviews platform, “thereby depriving The Times of the opportunity to receive referral revenue and appropriating that opportunity for Defendants.”
The Times also alleged Microsoft and OpenAI models produce content similar to that generated by the newspaper, and that their use of its content to train LLMs without consent “constitutes free-riding on The Times’s significant efforts and investment of human capital to gather this information.”
The Times said Microsoft and OpenAI’s LLMs “can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style,” “wrongly attribute false information to The Times,” and “deprive The Times of subscription, licensing, advertising, and affiliate revenue.”
— CNBC’s Rohan Goswami contributed to this report.
LISBON — Samsung’s foray into smart rings isn’t concerning the boss of the product category’s pioneer, Oura — in fact, Tom Hale says he’s seeing a boost in business.
“I’m sure that a major tech company making an announcement saying: ‘Hey, this is a category that matters. It’s going to be something that’s big.’ I think it’s probably helpful,” Hale told CNBC in an interview this week.
“In terms of the impact on our business, it has made zero impact. If anything, our business has gotten stronger since their announcement.”
In a wide-ranging interview with CNBC at the Web Summit conference in Lisbon, Hale discussed Oura’s plans for new areas of insight it wants to give users, how he is thinking about new devices and the company’s intentions for international expansion.
Oura’s flagship product is the Oura Ring 4, a device known as a smart ring. It is packed with sensors that can track some health metrics, allowing Oura app users to learn more about the quality of their sleep or how ready they are to tackle the day ahead.
Founded in Finland in 2013, the company has been called a pioneer by analysts in the smart ring space. Oura said it has sold more than 2.5 million of its rings since it launched its first product. CCS Insight forecasts Oura will end the year with a 49% market share in smart rings.
Competition is starting to rear its head in the space. The world’s largest smartphone maker Samsung made its first venture into smart rings this year with the Galaxy Ring, which some analysts say has put the device category on the map and popularized it with a broader audience.
Hale is keen to position Oura as a “health company and a science company from the get-go,” with the aim of its product being “clinical grade.” Oura is seeking approval from the U.S. Food and Drug Administration (FDA) for its ring to be used for diagnostics, although Hale declined to provide too many further details.
He did say that Oura’s focus on health and science is what sets it apart from competitors.
“If you’re actually thinking [of] yourself as a healthcare company, it is very different in many ways and different postures you might take towards data privacy. … So instead of being like a tech company where data is some sort of oil to be extracted and then used to create some kind of advantage of network effects, we’re really a healthcare company where your data is sacrosanct,” Hale said.
Oura’s business model relies on selling the hardware, as well as on a $5.99 monthly subscription service that allows users to get the insights from their ring. Oura says it has nearly 2 million subscribers.
“We look more like a software company than we do look like a hardware company. And I think that’s a function of the business model, and the fact that it’s working. Our subscribers are continuing to pay,” Hale said.
Oura eyes nutrition as next ‘pillar’
Oura takes the data gathered by the ring to provide insight to its users, focused on a person’s levels of sleep, activity and readiness to take on the day.
Hale said the company is now testing out nutrition, with users able to take a picture of their meal and log it into the Oura app. Also in the nutrition space, he highlighted Oura’s recent acquisition of Veri, a metabolic health startup that can take data from continuous glucose monitors — small devices inserted into a person’s arm — to give insight into someone’s blood sugar levels. Hale says that this, combined with Oura’s food tracking feature, could tell a user how certain meals affect their glucose levels.
Many glucose monitors today are invasive and need to be inserted into the skin. Some observers see a non-invasive glucose monitor on wearable gear as something that could be transformative — but Hale warns this is a difficult goal to achieve.
“The idea that a wearable [device] will get there, I think, has definitely been a Holy Grail, and like the Holy Grail, they may never find it, because it’s a very difficult problem to solve with any kind of accuracy,” Hale said.
“Never say never. Certainly, technology continues to advance and all the capabilities continue to advance,” he added.
New hardware and AI
While Oura only sells rings currently, Hale sees the company developing new products in the future. He declined to elaborate.
“I think we’ll undoubtedly see other Oura-branded products, beyond the ring,” he promised.
He also said the company hopes to work with other devices as well, even if they are not Oura’s own hardware.
Like many hardware companies, such as Apple and Samsung, Oura is looking at ways it can use the advancing capabilities of artificial intelligence to give users more personalized insights. Smartphone makers have spoken about so-called “AI agents,” which they see as assistants that are able to anticipate what a user wants.
Oura is testing out an AI product called Oura Advisor in a similar vein.
“Think of it as the doctor in your pocket that knows all the data about you,” Hale said.
International push
Hale’s presence at the Web Summit in Lisbon underscores his push to raise Oura’s brand awareness in markets outside of the U.S., especially as more people learn about smart rings.
“I think the point about the category being something that people are learning about, the unique benefits of that maturity, is in our favor. We’re expanding internationally,” Hale said.
He said he is particularly “excited” about venturing into Western Europe, including in countries like the U.K., Germany, France and Italy. Looking even further forward, Hale said an initial public offering for the business is not currently on the table, adding that operating as a private company gives Oura more “freedom.”
“I really enjoy the freedom that we get as a private company. We’re accountable to our investors and our shareholders, but they’re willing to let us operate with a lot [of] license,” he said. “And if we decided we wanted to turn unprofitable because we wanted to invest in owning some category of healthcare software, it’ll be fine. They would be happy for that.”
LISBON, Portugal — Tech giants are increasingly investing in the development of so-called “sovereign” artificial intelligence models as they seek to boost competitiveness by focusing more on local infrastructure.
Data sovereignty refers to the idea that people’s data should be stored on infrastructure within the country or continent they reside in.
“Sovereign AI is a relatively new term that’s emerged in the last year or so,” Chris Gow, IT networking giant Cisco’s Brussels-based EU public policy lead, told CNBC.
Currently, many of the biggest large language models (LLMs), like OpenAI’s ChatGPT and Anthropic’s Claude, use data centers based in the U.S. to store data and process requests via the cloud.
This has led to concern from politicians and regulators in Europe, who see dependence on U.S. technology as harmful to the continent’s competitiveness — and, more worryingly, technological resilience.
Where did ‘AI sovereignty’ come from?
The notion of data and technological sovereignty is something that has previously been on Europe’s agenda. It came about, in part, as a result of businesses reacting to new regulations.
The European Union’s General Data Protection Regulation, for example, requires companies to handle user data in a secure, compliant way that respects their right to privacy. High-profile cases in the EU have also raised doubts over whether data on European citizens can be transferred across borders safely.
The European Court of Justice in 2020 invalidated an EU-U.S. data-sharing framework, on the grounds that the pact did not afford the same level of protection as guaranteed within the EU by the General Data Protection Regulation (GDPR). Last year the EU-U.S. Data Privacy Framework was formed to ensure that data can flow safely between the EU and U.S.
These political developments have ultimately resulted in a push toward localization of cloud infrastructure, where data is stored and processed for many online services.
Filippo Sanesi, global head of marketing and operations at OVHCloud, said the French cloud firm is seeing lots of demand for its European-located infrastructure, as customers “understand the value of having their data in Europe, which are subject to European legislation.”
“As this concept of data sovereignty becomes more mature and people understand what it means, we see more and more companies understanding the importance of having your data locally and under a specific jurisdiction and governance,” Sanesi told CNBC. “We have a lot of data,” he added. “This data is sovereign in specific countries, under specific regulations.”
“Now, with this data, you can actually make products and services for AI, and those services should then be sovereign, should be controlled, deployed and developed locally by local talent for the local population or businesses.”
The AI sovereignty push hasn’t been driven forward by regulators — at least, not yet, according to Cisco’s Gow. Rather, it’s come from private companies, which are opening more data centers — facilities containing vast amounts of computing equipment to enable cloud-based AI tools — in Europe, he said.
Sovereign AI is “more driven by the industry naming it that, than it is from the policymakers’ side,” Gow said. “You don’t see the ‘AI sovereignty’ terminology used on the regulator side yet.”
Countries are pushing the idea of AI sovereignty because they recognize AI is “the future” and a “massively strategic technology,” Gow said.
Governments are focusing on boosting their domestic tech companies and ecosystems, as well as the all-important backend infrastructure that enables AI services.
“The AI workload uses 20 times the bandwidth of a traditional workload,” Gow said. It’s also about enabling the workforce, according to Gow, as firms need skilled workers to be successful.
Most important of all, however, is the data. “What you’re seeing is quite a few attempts from that side to think about training LLMs on localized data, in language,” Gow said.
The aim of the Italia project is to store results in a given jurisdiction and rely on data from citizens within that region so that results produced by the AI systems there are more grounded in local languages, culture and history.
“Sovereign AI is about reflecting the values of an organization or, equally, the country that you’re in and the values and the language,” David Hogan, EMEA head of enterprise sales for chipmaking giant Nvidia, told CNBC.
“The core challenge is that most of the frontier models today have been trained primarily on Western data generally,” Hogan added.
In Denmark for example, where Nvidia has a major presence, officials are concerned about vital services such as health care and telecoms being delivered by AI systems that aren’t “reflective” of local Danish culture and values, according to Hogan.
On Wednesday, Denmark laid out a landmark white paper outlining how companies can use AI in compliance with the incoming EU AI Act — the world’s first major AI law. The document is meant to serve as a blueprint for other EU nations to follow and adopt.
“If you’re in a European country that’s not one of the major language countries that’s spoken internationally, probably less than 2% of the data is trained on your language — let alone your culture,” Hogan said.
How regulation fueled a mindset shift
That’s not to say regulations haven’t proven an important factor in getting tech giants to think more about building localized AI infrastructure within Europe.
OVHCloud’s Sanesi said regulations like the EU’s GDPR catalyzed a lot of the interest in onshoring the processing of data in a given region.
The concept of AI sovereignty is also getting buy-in from local European tech firms.
Earlier this week, Berlin-headquartered search engine Ecosia and its Paris-based peer Qwant announced a joint venture to develop a European search index from scratch, aiming to serve improved French and German language results.
Meanwhile, French telecom operator Orange has said it’s in discussions with a number of foundational AI model companies about building a smartphone-based “sovereign AI” model for its customers that more accurately reflects their own language and culture.
“It wouldn’t make sense to build our own LLMs. So there’s a lot of discussion right now about, how do we partner with existing providers to make it more local and safer?” Bruno Zerbib, Orange’s chief technology officer, told CNBC.
“There are a lot of use cases where [AI data] can be processed locally [on a phone] instead of processed on the cloud,” Zerbib added. Orange hasn’t yet selected a partner for these sovereign AI model ambitions.
Micro-blogging startup Bluesky has gained over 1.25 million new users in the past week, indicating some social media users are changing their habits following the U.S. presidential election.
Bluesky’s influx of users shows that the app has been able to pitch itself as an alternative to X, formerly Twitter, which is owned by Elon Musk, as well as Meta’s Threads. The bulk of the new users are coming from the U.S., Canada and the United Kingdom, the company said Wednesday.
“We’re excited to welcome everyone looking for a better social media experience,” Bluesky CEO Jay Graber told CNBC in a statement.
Despite the surge of users, Bluesky’s total base remains a fraction of its rivals’. The Seattle startup claims 15.2 million total users. Meta CEO Mark Zuckerberg in October said Threads had nearly 275 million monthly users. Musk in May claimed that X had 600 million monthly users, but market intelligence firm Sensor Tower pegged X’s monthly base at 318 million users in October.
Created in 2019 as a project inside Twitter, when Jack Dorsey was still CEO, Bluesky doesn’t show ads and has yet to develop a business model. It became an independent company in 2021. Dorsey said in May of this year that he’s no longer a member of Bluesky’s board.
“Journalists, politicians, and news junkies have also been talking up Bluesky as a better X alternative than Threads,” wrote Similarweb, the internet traffic and monitoring service, in a Tuesday blog post.
Some users with new Bluesky accounts posted that they had moved to the service due to Musk and his support for President-elect Donald Trump.
“It’s appalling that Elon Musk has transformed Twitter into a Trump propaganda machine, rife with disinformation and misinformation,” one user posted on Bluesky.
This is Bluesky’s second notable surge in the last couple of months.
Bluesky said it picked up 2 million new users in September after the Brazilian Supreme Court suspended X in the country for failing to comply with regional content moderation policies and not appointing a local representative.