Publishers Sue to Block Google Over Pirated Books in AI Case

Hachette and Cengage seek to join suit alleging Google trained Gemini on pirated books scraped from Z‑Library and other sites, citing C4's 200M+ copyright marks.

  • Hachette Book Group and Cengage Group moved to join a California federal class action accusing Google of using copyrighted books to train its Gemini models.
  • The publishers say Google downloaded books from pirate sites such as Z‑Library, OceanofPDF and WeLib and repeatedly copied them into training systems.
  • The suit alleges Google’s C4 dataset drew from at least 28 piracy-linked sites and that the copyright symbol appears over 200 million times in C4.
  • The publishers request statutory damages, injunctions, destruction of unauthorized copies and disclosure of which books trained Gemini.
  • The complaint and the consolidated 2023 class action docket are publicly available as linked documents supporting the motion.

On Thursday, major publishers Hachette Book Group and Cengage Group filed a motion to intervene in a federal class action in California that accuses Google of copying books to train its Gemini AI models. The publishers attached a formal complaint that lays out their claims and seeks relief.

The filing says Google took content without licenses, asserting that the company “chose to steal a massive body of content from Plaintiffs and the Class to train its AI model” and alleging infringement “at every stage” of model development. The publishers claim Google downloaded works from pirate repositories, then copied them into memory, converted them to readable formats, and included them in training sets for successive models.

The complaint singles out Google’s C4 training dataset and alleges it contains works scraped from Z‑Library and at least 28 other sites the U.S. government has linked to piracy. The filing states that “The copyright symbol (©) appears more than 200 million times in the C4 dataset.” It also notes that copies came from domains now displaying federal seizure notices and from subscription libraries such as Scribd.

The publishers ask the court for statutory damages, injunctions to stop further use, an order to destroy unauthorized copies, and disclosure of which books trained Gemini. They also cite a response from dataset provider Common Crawl that allegedly said, “You shouldn’t have put your content on the internet if you didn’t want it to be on the internet.”

The motion seeks to join an existing copyright action originally filed by authors in 2023; that consolidated case is available through the public docket. The filing follows a series of 2023 lawsuits over AI training data, in which judges issued mixed rulings on fair use while criticizing long‑term retention of pirated works.
