Plaintiffs in Kadrey et al. vs. Meta File Motion, Accuse Firm of Knowingly Using Copyrighted Works in AI Model Development
The plaintiffs in Kadrey et al. vs. Meta have filed a motion alleging that the company knowingly used copyrighted works to develop its AI models. The motion, filed in the United States District Court for the Northern District of California, accuses Meta of systematically torrenting pirated datasets, including works from the notorious shadow library LibGen, and stripping copyright management information (CMI) from them.
Internal Memo Reveals LibGen’s True Nature
Documents recently submitted to the court describe practices that, according to the plaintiffs, implicate Meta’s senior leaders. A December 2024 memo from internal Meta discussions acknowledged LibGen as "a dataset we know to be pirated," and prompted debate about the ethical and legal ramifications of using such materials. The documents also show that top engineers hesitated to torrent the datasets, citing concerns about using corporate laptops for potentially unlawful activities.
Meta’s Practices Raise Concerns
The allegations against Meta paint a portrait of a company knowingly partaking in a widespread piracy scheme facilitated through torrenting. According to a string of emails included as exhibits, Meta engineers expressed concerns about the optics of torrenting pirated datasets from within corporate spaces. One engineer noted that "torrenting from a [Meta-owned] corporate laptop doesn’t feel right," but despite hesitation, the rapid downloading and distribution – or "seeding" – of pirated data took place.
Legal Ramifications
The case began as an intellectual property infringement action on behalf of authors and publishers claiming violations relating to AI use of their materials. The plaintiffs are now seeking to add two major claims to their suit: a violation of the Digital Millennium Copyright Act (DMCA) and a violation of the California Comprehensive Computer Data Access and Fraud Act (CDAFA).
Impact on Emerging Legislation
The unfolding case of Kadrey et al. vs. Meta could have far-reaching ramifications for the development of AI models moving forward, potentially setting legal precedents in the US and beyond. At the heart of this expanding legal battle lies growing concern over the intersection of copyright law and AI. Plaintiffs argue that the stripping of copyright protections from textual datasets denies rightful compensation to copyright owners and allows Meta to build AI systems like Llama on the financial ruins of authors’ and publishers’ creative efforts.
Conclusion
The case of Kadrey et al. vs. Meta highlights the need for clearer guidance at an international level to protect both creators and innovators. As AI becomes the central focus of Meta’s future strategy, the allegations of reliance on pirated libraries are unlikely to help its ambitions of maintaining leadership in the field.
FAQs
Q: What is the case about?
A: The case is about Meta’s alleged use of copyrighted works in the development of its AI models.
Q: What is LibGen?
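A: LibGen (Library Genesis) is a shadow library that hosts pirated copies of books and other written works. The plaintiffs allege that Meta drew on it for training data.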
Q: What are the allegations against Meta?
A: Meta is accused of systematically torrenting and stripping copyright management information (CMI) from pirated datasets, including works from LibGen.
Q: What are the potential implications of this case?
A: The case could have far-reaching ramifications for the development of AI models and the intersection of copyright law and AI.

