Not the sincerest form of flattery
NY Times copyright suit wants OpenAI to delete all GPT instances
Shows evidence that GPT-based systems will reproduce Times articles if asked.
John Timmer – Dec 27, 2023 7:05 pm UTC
Microsoft is named in the suit for allegedly building the system that allowed GPT derivatives to be trained using infringing material. (Image credit: Just_Super)
In August, word leaked out that The New York Times was considering joining the growing legion of creators that are suing AI companies for misappropriating their content. The Times had reportedly been negotiating with OpenAI regarding the potential to license its material, but those talks had not gone smoothly. So, about four months after the company was reportedly considering suing, the suit has now been filed.
The Times is targeting various companies under the OpenAI umbrella, as well as Microsoft, an OpenAI partner that both uses it to power its Copilot service and helped provide the infrastructure for training the GPT Large Language Model. But the suit goes well beyond the use of copyrighted material in training, alleging that OpenAI-powered software will happily circumvent the Times’ paywall and ascribe hallucinated misinformation to the Times.
Journalism is expensive
The suit notes that The Times maintains a large staff that allows it to do things like dedicate reporters to a huge range of beats and engage in important investigative journalism, among other things. Because of those investments, the newspaper is often considered an authoritative source on many matters.
All of that costs money, and The Times earns it by limiting access to its reporting through a robust paywall. In addition, each print edition has a copyright notification, the Times’ terms of service limit the copying and use of any published material, and it can be selective about how it licenses its stories. Beyond driving revenue, these restrictions also help it maintain its reputation as an authoritative voice by controlling how its works appear.
The suit alleges that OpenAI-developed tools undermine all of that. “By providing Times content without The Times’s permission or authorization, Defendants’ tools undermine and damage The Times’s relationship with its readers and deprive The Times of subscription, licensing, advertising, and affiliate revenue,” the suit alleges.
Part of the unauthorized use The Times alleges came during the training of various versions of GPT. Prior to GPT-3.5, information about the training dataset was made public. One of the sources used is a large collection of online material called “Common Crawl,” which the suit alleges contains information from 16 million unique records from sites published by The Times. That places the Times as the third most referenced source, behind Wikipedia and a database of US patents.
OpenAI no longer discloses as many details of the data used for training of recent GPT versions, but all indications are that full-text NY Times articles are still part of that process. (Much more on that in a moment.) Expect access to training information to be a major issue during discovery if this case moves forward.
Not just training
A number of suits have been filed regarding the use of copyrighted material during training of AI systems. But the Times’ suit goes well beyond that to show how the material ingested during training can come back out during use. “Defendants’ GenAI tools can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style, as demonstrated by scores of examples,” the suit alleges.
The suit alleges, and we were able to verify, that it’s comically easy to get GPT-powered systems to offer up content that is normally protected by the Times’ paywall. The suit shows a number of examples of GPT-4 reproducing large sections of articles nearly verbatim.
The suit includes screenshots of ChatGPT being given the title of a piece at The New York Times and asked for the first paragraph, which it delivers. Getting the ensuing text is apparently as simple as repeatedly asking for the next paragraph.
ChatGPT has apparently closed that loophole between the preparation of that suit and the present. We entered some of the prompts shown in the suit, and were advised “I recommend checking The New York Times website or other reputable sources,” although we can’t rule out that context provided prior to that prompt could produce copyrighted material.
Ask for a paragraph, and Copilot will hand you a wall of normally paywalled text. (Image credit: John Timmer)
But not all loopholes have been closed. The suit also shows output from Bing Chat, since rebranded as Copilot. We were able to verify that asking for the first paragraph of a specific article at The Times caused Copilot to reproduce the first third of the article.
The suit is dismissive of attempts to justify this as a form of fair use. “Publicly, Defendants insist that their conduct is protected as ‘fair use’ because their unlicensed use of copyrighted content to train GenAI models serves a new ‘transformative’ purpose,” the suit notes. “But there is nothing ‘transformative’ about using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it.”
Reputational and other damages
The hallucinations common to AI also came under fire in the suit for potentially damaging the value of the Times’ reputation, and possibly damaging human health as a side effect. “A GPT model completely fabricated that The New York Times published an article on January 10, 2020, titled ‘Study Finds Possible Link between Orange Juice and Non-Hodgkin’s Lymphoma,’” the suit alleges. “The Times never published such an article.”
Similarly, asking about a Times article on heart-healthy foods allegedly resulted in Copilot saying it contained a list of examples (which it didn’t). When asked for the list, 80 percent of the foods on it weren’t even mentioned by the original article. In another case, recommendations were ascribed to the Wirecutter when the products hadn’t even been reviewed by its staff.
As with the Times material, it’s alleged that it’s possible to get Copilot to offer up large chunks of Wirecutter articles (The Wirecutter is owned by The New York Times). But the suit notes that these article excerpts have the affiliate links stripped out of them, keeping the Wirecutter from its primary source of revenue.
The suit targets various OpenAI companies for developing the software, as well as Microsoft, the latter both for offering OpenAI-powered services and for having developed the computing systems that enabled the copyrighted material to be ingested during training. Allegations include direct, contributory, and vicarious copyright infringement, as well as DMCA and trademark violations. Finally, it alleges “Common Law Unfair Competition By Misappropriation.”
The suit seeks nothing less than the erasure of any GPT instances that the parties have trained using material from the Times, as well as the destruction of the datasets that were used for the training. It also asks for a permanent injunction to prevent similar conduct in the future. The Times also wants money, lots and lots of money: “statutory damages, compensatory damages, restitution, disgorgement, and any other relief that may be permitted by law or equity.”
John Timmer is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.
The trial of hip-hop mogul Sean “Diddy” Combs has begun, with the process to pick the jurors who will determine his fate now under way.
Combs, wearing a white shirt with a black crew-neck sweater, grey trousers and glasses, his hair and goatee now grey, hugged and shook hands with all his lawyers as he arrived at the federal courtroom in Manhattan, New York, for the start of the proceedings.
The 55-year-old has been held in detention in Brooklyn since he was arrested and charged in September 2024, accused of engaging in sex trafficking and presiding over a racketeering conspiracy over two decades.
He has pleaded not guilty to criminal charges, said all his sexual relationships and encounters were consensual, and strenuously denied all allegations of wrongdoing.
Image: The Metropolitan Detention Center, where Combs is incarcerated. Pic: AP/ Yuki Iwamura
Due to the high-profile nature of the case, the jury selection process is expected to last all this week, with opening statements by the lawyers set to begin next week.
Unlike some other high-profile trials in the US, this one won’t be broadcast live because federal courtrooms, unlike some state courtrooms, don’t allow electronic recordings inside.
Judge Arun Subramanian started proceedings shortly after 9am on Monday (2pm UK time), first making several rulings on what issues experts will be allowed to testify on when they take to the witness stand.
He then gave an overview of the case and began the questioning of prospective jurors one by one – a process known as “voir dire” – with the aim of finding a panel of 12 jurors and six alternates who can be fair and impartial despite heavy media coverage of the case.
Image: There are no cameras in court. Sketch: Elizabeth Williams via AP
Jurors are being asked if they have any views on the prosecution or the defence, if they or someone close to them has been a victim of crime, and their beliefs on hiring sex workers, the use of illegal drugs, hip-hop artists and law enforcement.
They are also being questioned on whether they have heard of names included on a list of individuals, including celebrities, who may be mentioned during the trial. The list is long, the court heard, with the judge saying it reminded him of Lord Of The Rings.
What have potential jurors been asked?
Image: Combs embraced his attorneys in court. Sketch: REUTERS/Jane Rosenberg
One prospective juror said they had heard of actors Michael B Jordan and Mike Myers, but this would not prevent them being fair and impartial should they be selected. Another said they had heard of Kanye West.
The context in which Jordan, Myers, West and other people may be mentioned is not yet known.
Other names that came up included Aubrey O’Day and Dawn Richard – former members of girl group Danity Kane, who were signed to Bad Boy – and singer Michelle Williams.
Several prospective jurors indicated they had seen news reports about Combs, and one prospective juror described a still image she had seen as “damning evidence”. She was rejected from consideration.
Another potential juror was excluded because she said a family member had experienced something that made them feel uncomfortable about hearing the case.
At one point during proceedings, Combs asked for a bathroom break, telling the judge: “I’m sorry your honour, I’m a little nervous today.”
One potential juror said they had seen a joke on social media about baby oil authorities say they found in Combs’ residences during searches in March 2024. They said they could remain impartial.
Throughout the day, as potential jurors were questioned, Combs appeared to express his approval or disapproval, either with a nod or by shaking his head no, to his attorneys.
Image: Brian Steel, one of Combs’ attorneys, pictured outside the court. Pic: REUTERS/David Dee Delgado
What is Combs accused of?
In the indictment listing the formal charges against the rapper, he is accused of a pattern of abusive behaviour over two decades, allegedly with the help of people in his entourage.
Prosecutors say he manipulated women into participating in drug-fuelled sexual performances with male sex workers, which he called “Freak Offs”.
Combs and his associates resorted to violent acts, including beatings, kidnapping and arson, when he didn’t get his way, they allege.
Lawyers for Combs say any group sex was consensual, that there was no coercion involved, and nothing that happened amounted to a criminal racket.
If convicted, he faces the possibility of decades in prison.
The Cassie video
One issue likely to be featured in the trial is an incident in 2016, when a security camera recorded Combs allegedly kicking and hitting his then girlfriend Cassie Ventura in a Los Angeles hotel hallway.
Cassie filed a lawsuit in November 2023 saying Combs had subjected her to years of abuse, including beatings and rape, but the case was settled the following day.
The hotel footage emerged in May 2024. Shortly afterwards, Combs released a video apology, saying his behaviour in the video was at a time when he had “hit rock bottom” but nonetheless was “inexcusable” and that he was “disgusted” with himself.
One of his lawyers, Marc Agnifilo, has said Combs was “not a perfect person” and that there had been drug use and toxic relationships, but that all sexual activity between Combs, Cassie and other people was consensual.
Jury selection continues today and throughout this week, with the trial expected to last about eight weeks.
Michael Rothstein is a reporter for NFL Nation at ESPN. Rothstein covers the Atlanta Falcons. You can follow him via Twitter @MikeRothstein.
The Trump administration’s 2026 fiscal budget request to Congress eliminates major federal funding for traumatic brain injury (TBI) research and education, potentially undercutting efforts to address head injuries in sports, particularly at the high school and youth levels.
The White House’s proposed budget, released Friday, would eliminate the Centers for Disease Control and Prevention umbrella agency responsible for TBI research, including the $8.25 million earmarked for brain injury research and public education about the dangers of concussions. The CDC is facing $3.59 billion in budget cuts.
Although the president proposes the federal budget, it is up to Congress to approve a final budget bill, so the TBI program could be restored or moved to a different agency. The White House did not respond to an ESPN request for comment.
The budget proposal comes after the CDC on April 1 placed all five staffers devoted to administering the government’s main traumatic brain injury program on paid administrative leave, CDC employees told ESPN. Paid administrative leave means the workers are still government employees.
The budget cuts would “roll back decades of progress,” said Dr. Owen Perlman, a brain injury specialist and board member of the Brain Injury Association of America.
Among the items targeted is Heads Up, a concussion-prevention program for youth and high school coaches, athletic trainers and other sports officials. The CDC staffers put on leave administered the program. Forty-five states participate in the program to varying degrees, a CDC official said, asking not to be identified.
Staffers interviewed by ESPN declined to speak on the record, citing fears of administration retribution.
“We’re really worried about the hundreds of thousands of coaches who have to take this training,” the CDC official said. “This is really built in, and we’ve lost the whole team” behind the program.
Some Heads Up training is part of coaches’ and other sports officials’ state compliance requirements. The CDC official said hundreds of email queries are arriving every week asking how to comply as the federal program shuts down. The Heads Up website says more than 10 million people have participated in its online training programs.
Congress first approved TBI research funding in 1996. Legislation to keep the program going expired at the end of 2024, and a House bill to renew it has yet to advance out of committee.
In a 2018 CDC survey, 12% of adult respondents reported experiencing a head injury in the previous 12 months, including but not limited to sports-related activities. A follow-up study was being prepared when the staffers were placed on leave. The research data was part of a program to measure TBI prevalence and boost prevention, care and recovery efforts.
The Heads Up website remained active Monday but offered no clues regarding the program’s endangered status.
“In the last month, I don’t think the public has felt an impact,” a laid-off CDC employee said. “But when those websites, trainings and materials get pulled down or when they can’t be updated, I think that’s when the public will feel it.”
In the proposed White House budget, the National Institutes of Health would retain an institute devoted to overall brain research, although the name would slightly change. The institute focuses on medical issues such as stroke and migraines, and it’s unclear whether TBI programs would be absorbed into it.
Hospitals and universities conducting TBI research funded by the CDC are bracing for potential funding cutbacks.
“We might not [get] the next year of renewal or the next wave of funding. And that’s sad and scary and impactful for all kinds of people, including myself in this project,” said Christine Baugh, an assistant professor at the University of Colorado’s School of Medicine who is studying how parents decide whether to let their children play contact sports and whether brain-injury awareness campaigns influence their decisions.
On April 23, the National Academy of Sciences received orders to cancel work on two TBI workshops, one of which analyzed the risks of repeated head impacts on children. Both workshops had already been held. One of the workshop organizers, Dr. Fred Rivara, a pediatrics professor at the University of Washington, told ESPN that the cancellation affected funding for publishing the information, and he called the potential cuts “tragic.”
“That’s a perfect example of how this change in, or devastation of, funding at the CDC is impacting people,” Rivara said. “They want to know, for sports: What about these repetitive impacts? Are they bad for kids? It’s a perfect example of the impact of this.”
Traumatic brain injuries have lifelong repercussions on a person’s physical, cognitive, emotional and behavioral health, Perlman said.
Even though some states fund TBI-treatment programs independently of the federal government, concerns are growing about a domino effect if Congress fails to renew funding.
“For many people with concussions or certainly moderate or severe brain injuries, there’s no endpoint,” Perlman said. “It’s a lifetime problem, and there needs to be lifetime funding for it.”
The first round of the 2025 Stanley Cup playoffs is complete. Eight of the teams that made the postseason bracket have moved on, and eight others have been eliminated.
Before the second-round series begin, ESPN’s experts have identified their picks for each matchup. Which four teams will move on to the conference finals?
Atlantic Division
John Buccigross: Panthers in seven
Ryan Callahan: Panthers in six
Cassie Campbell-Pascall: Panthers in six
Sachin Chandan: Panthers in six
Meghan Chayka: Panthers in six
Ryan S. Clark: Panthers in seven
Linda Cohn: Panthers in six
Rachel Doerrie: Panthers in six
Ray Ferraro: Panthers in six
Emily Kaplan: Panthers in seven
Tim Kavanagh: Maple Leafs in six
Peter Lawrence-Riddell: Panthers in six
Steve Levy: Panthers in six
Vince Masi: Panthers in six
Victoria Matiash: Panthers in six
Sean McDonough: Panthers in six
Mark Messier: Panthers in six
AJ Mleczko: Panthers in six
Arda Öcal: Maple Leafs in six
Kristen Shilton: Maple Leafs in seven
John Thoering: Panthers in six
Bob Wischusen: Panthers in six
Greg Wyshynski: Panthers in six
Consensus prediction: Panthers (20 of 23 picks)
Metropolitan Division
John Buccigross: Capitals in seven
Ryan Callahan: Capitals in seven
Cassie Campbell-Pascall: Capitals in six
Sachin Chandan: Capitals in six
Meghan Chayka: Hurricanes in six
Ryan S. Clark: Capitals in seven
Linda Cohn: Capitals in six
Rachel Doerrie: Capitals in six
Ray Ferraro: Capitals in seven
Emily Kaplan: Capitals in seven
Tim Kavanagh: Capitals in six
Peter Lawrence-Riddell: Hurricanes in seven
Steve Levy: Capitals in five
Vince Masi: Hurricanes in six
Victoria Matiash: Hurricanes in six
Sean McDonough: Capitals in seven
Mark Messier: Hurricanes in six
AJ Mleczko: Hurricanes in five
Mike Monaco: Hurricanes in six
Arda Öcal: Capitals in six
Kristen Shilton: Hurricanes in six
John Thoering: Capitals in seven
Bob Wischusen: Capitals in seven
Greg Wyshynski: Capitals in seven
Consensus prediction: Capitals (16 of 24 picks)
Central Division
John Buccigross: Stars in seven
Ryan Callahan: Stars in five
Sachin Chandan: Stars in six
Ryan S. Clark: Stars in seven
Linda Cohn: Jets in seven
Rachel Doerrie: Stars in six
Ray Ferraro: Stars in six
Emily Kaplan: Stars in six
Tim Kavanagh: Stars in seven
Peter Lawrence-Riddell: Stars in six
Steve Levy: Stars in seven
Vince Masi: Jets in seven
Victoria Matiash: Jets in seven
Sean McDonough: Stars in six
Mark Messier: Stars in six
Mike Monaco: Stars in six
Arda Öcal: Stars in six
Kristen Shilton: Stars in six
John Thoering: Stars in seven
Bob Wischusen: Jets in seven
Greg Wyshynski: Stars in six
Consensus prediction: Stars (17 of 21 picks)
Pacific Division
John Buccigross: Oilers in seven
Ryan Callahan: Golden Knights in six
Cassie Campbell-Pascall: Oilers in seven
Sachin Chandan: Oilers in seven
Meghan Chayka: Golden Knights in seven
Ryan S. Clark: Golden Knights in seven
Linda Cohn: Oilers in seven
Rachel Doerrie: Golden Knights in seven
Ray Ferraro: Golden Knights in seven
Emily Kaplan: Golden Knights in seven
Tim Kavanagh: Golden Knights in six
Peter Lawrence-Riddell: Golden Knights in six
Steve Levy: Golden Knights in seven
Vince Masi: Oilers in six
Victoria Matiash: Golden Knights in six
Sean McDonough: Golden Knights in seven
Mark Messier: Oilers in seven
AJ Mleczko: Golden Knights in six
Mike Monaco: Oilers in six
Arda Öcal: Oilers in six
Kristen Shilton: Oilers in seven
John Thoering: Golden Knights in seven
Bob Wischusen: Golden Knights in seven
Greg Wyshynski: Oilers in seven
Consensus prediction: Golden Knights (14 of 24 picks)