New York Times Says OpenAI Erased Potential Lawsuit Evidence

The New York Times Sues OpenAI and Microsoft Over AI Models

Lawsuits are never exactly a lovefest, but the copyright fight between The New York Times and both OpenAI and Microsoft is getting especially contentious.

The Allegations

This week, the Times alleged that OpenAI’s engineers inadvertently erased data the paper’s team spent more than 150 hours extracting as potential evidence.

Data Erasure

OpenAI was able to recover much of the data, but the Times’ legal team says it’s still missing the original file names and folder structure. According to a declaration filed with the court Wednesday by Jennifer B. Maisel, a lawyer for the newspaper, this means the information “cannot be used to determine where the news plaintiffs’ copied articles” may have been incorporated into OpenAI’s artificial intelligence models.

Reactions

“We disagree with the characterizations made and will file our response soon,” OpenAI spokesperson Jason Deutrom told WIRED in a statement. The New York Times declined to comment.

The Background

The Times filed its copyright lawsuit against OpenAI and Microsoft last year, alleging that the companies had illegally used its articles to train artificial intelligence tools like ChatGPT. The case is one of many ongoing legal battles between AI companies and publishers, including a similar lawsuit filed by the Daily News being handled by some of the same lawyers.

The Discovery Process

The Times’ case is currently in discovery, which means both sides are turning over requested documents and information that could become evidence. As part of the process, OpenAI was required by the court to show the Times its training data, which is a big deal—OpenAI has never publicly revealed exactly what information was used to build its AI models.

Data Sandbox

To disclose it, OpenAI created what the court is calling a “sandbox” of two “virtual machines” that the Times’ lawyers could sift through. In her declaration, Maisel said that OpenAI engineers had “erased” data organized by the Times’ team on one of these machines.

The Consequences

According to Maisel’s filing, OpenAI acknowledged that the information had been deleted and attempted to address the issue shortly after being alerted to it earlier this month. But when the paper’s lawyers looked at the “restored” data, they found it too disorganized, forcing them “to recreate their work from scratch using significant person-hours and computer processing time,” several other Times lawyers said in a letter filed with the judge the same day as Maisel’s declaration.

Conclusion

The copyright fight between The New York Times and OpenAI and Microsoft remains contentious, with the Times now alleging that potential evidence was erased and OpenAI disputing that characterization. The outcome of the case remains to be seen, but it is clear that the use of publishers’ articles to train AI models is a complex and legally unsettled issue.

FAQs

Q: What is the dispute about?
A: The dispute is about the alleged use of The New York Times’ articles, without permission, to train artificial intelligence tools like ChatGPT. The Times filed the copyright lawsuit against both OpenAI and Microsoft.

Q: What happened to the data?
A: According to The New York Times, OpenAI engineers inadvertently erased data that the paper’s team had spent more than 150 hours extracting as potential evidence.

Q: Has OpenAI acknowledged the data erasure?
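A: According to Maisel’s filing, yes. OpenAI acknowledged that the information had been deleted and attempted to address the issue shortly after being alerted to it earlier this month. In a statement to WIRED, however, an OpenAI spokesperson said the company disagrees with the characterizations made and will file a response.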

Q: What is the significance of the discovery process?
A: The discovery process is a critical part of the lawsuit, as both sides are required to turn over requested documents and information that could become evidence. In this case, the court required OpenAI to show The New York Times its training data, which is significant because OpenAI has never publicly revealed exactly what information was used to build its AI models.

Q: What are the implications of the case?
A: The outcome of the case could have significant implications for the use of AI models in the publishing industry, and the regulation of copyright in the digital age.
