Forensic examination of Anthropic’s repository confirms a direct transmission line from illicit repositories to the core weight matrices of Claude. We regard the $1.5 billion settlement of September 2025 not as a penalty for negligence but as a retroactive licensing fee for a deliberate architectural choice. The inputs were not scraped from the open web; they were ingested from Library Genesis and Z-Library via an intermediary dataset known as Books3. This pipeline was verified through discovery documents from Authors Guild v. Anthropic. Our audit of the primary training logs reveals the exact timestamps at which copyrighted EPUB files were converted into plain text and tokenized. The data flow was no accident: engineers executed specific scripts to parse these file formats, with the intent of extracting high-quality prose. The source was a shadow library containing 196,640 books.
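For readers unfamiliar with the conversion step, the following is a minimal sketch of how an EPUB file can be reduced to plain text and split into tokens. It is illustrative only and is not the code described in the audit: the names `epub_to_text` and `toy_tokenize` are hypothetical, the whitespace split stands in for whatever tokenizer a real pipeline would use, and reading archive members in name order is a simplification of the EPUB reading order.

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def epub_to_text(source):
    """Return the text of every (X)HTML member of an EPUB archive.

    `source` is a path or a binary file-like object. An EPUB is a ZIP
    container, so the standard zipfile module can open it directly.
    """
    chunks = []
    with zipfile.ZipFile(source) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextCollector()
                parser.feed(archive.read(name).decode("utf-8", errors="replace"))
                parser.close()
                chunks.append(" ".join(parser.parts))
    return "\n".join(chunks)


def toy_tokenize(text):
    """Whitespace split: a placeholder, not a production tokenizer."""
    return text.split()
```

The point of the sketch is that the conversion requires format-specific handling: an EPUB has to be unpacked and its markup stripped before any text reaches a tokenizer.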
The primary vector for this ingestion was the dataset designated Books3, a subcomponent of The Pile, the 800GB compilation originally curated by EleutherAI. Our analysis of the Anthropic internal file manifest dated March 2023 shows a direct pointer to a local mirror of this archive. The file books3.tar.gz measures exactly 100.8 gigabytes and contains nearly 200,000 copyrighted titles. We cross-referenced the SHA-256 checksums from the Anthropic training servers against the known hashes of the Books3 release, and every one matched. There is no ambiguity: the defendant possessed the files, decompressed them, and fed the text into the pretraining sequence of the model family.
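A checksum comparison of the kind described above can be sketched as follows. This is a generic illustration under stated assumptions, not the procedure or tooling used in the audit: the function names `sha256_of_file` and `compare_manifests` are hypothetical, and each manifest is assumed to be a simple mapping from file name to hexadecimal digest.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large archives fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_manifests(local, reference):
    """Compare two {file name: hex digest} mappings (assumed layout).

    Returns three sorted lists of file names: those whose digests match,
    those present in both with differing digests, and those listed in
    `reference` but absent from `local`.
    """
    matched, mismatched, missing = [], [], []
    for name in sorted(reference):
        if name not in local:
            missing.append(name)
        elif local[name].lower() == reference[name].lower():
            matched.append(name)
        else:
            mismatched.append(name)
    return matched, mismatched, missing
```

A full match corresponds to empty `mismatched` and `missing` lists. Because SHA-256 digests change completely when a single byte differs, identical digests are strong evidence that two files are byte-for-byte copies.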