Publishers Move to Join Suit Against Google Over Pirated Books in AI Case

Hachette and Cengage seek to join suit alleging Google trained Gemini on pirated books scraped from Z‑Library and other sites, citing C4's 200M+ copyright marks.

  • Hachette Book Group and Cengage Group moved to join a California federal class action accusing Google of using copyrighted books to train its Gemini models.
  • The publishers say Google downloaded books from pirate sites such as Z‑Library, OceanofPDF and WeLib and repeatedly copied them into training systems.
  • The suit alleges Google’s C4 dataset drew from at least 28 piracy-linked sites and that the copyright symbol appears over 200 million times in C4.
  • The publishers request statutory damages, injunctions, destruction of unauthorized copies and disclosure of which books trained Gemini.
  • The complaint and the consolidated 2023 class action docket are publicly available as linked documents supporting the motion.

On Thursday, major publishers Hachette Book Group and Cengage Group filed a motion to intervene in a federal class action in California that accuses Google of copying books to train its Gemini AI models. The publishers attached a formal complaint that lays out their claims and seeks relief.

The filing says Google took content without licenses, alleging that the company “chose to steal a massive body of content from Plaintiffs and the Class to train its AI model” and infringed “at every stage” of model development. The publishers claim Google downloaded works from pirate repositories, then copied them into memory, converted them to readable formats, and included them in training sets for successive models.

The complaint singles out Google’s C4 training dataset and alleges it contains works scraped from Z‑Library and at least 28 other sites the U.S. government has linked to piracy. The filing states that “The copyright symbol (©) appears more than 200 million times in the C4 dataset.” It also notes copies came from domains now displaying federal seizure notices and from subscription libraries such as Scribd.

The publishers ask the court for statutory damages, injunctions to stop further use, an order to destroy unauthorized copies, and disclosure of which books trained Gemini. They also cite a response from dataset provider Common Crawl that allegedly said, “You shouldn’t have put your content on the internet if you didn’t want it to be on the internet.”

The motion seeks to join an existing copyright action originally filed by authors in 2023; that consolidated case is available through the public docket. The filing follows a series of 2023 lawsuits over AI training data, in which judges issued mixed rulings on fair use while criticizing long‑term retention of pirated works.
