Britannica Is Suing OpenAI: The Theory That Could Break AI

Previous AI copyright cases targeted training data. Britannica and Merriam-Webster's March 2026 lawsuit targets something different: RAG, the live retrieval system that pulls real content from the web when you ask ChatGPT a question. If they win, AI search products that retrieve and summarize web content may have to license that content. The fair use defense that OpenAI has relied on in other cases is much harder to make here. This article explains one of the most consequential AI legal cases of 2026: why it is different, what the Lanham Act claim means, and what a victory for either side would do to the AI industry.

By Shikhar Burman · 2026-03-23 · 12 min read · AI & Society

The Encyclopedia Britannica has been published since 1768 and has survived the industrial revolution, the encyclopedia on CD-ROM, and the rise of Wikipedia. In March 2026, the company made what may be its most consequential strategic decision in decades: it filed a lawsuit against OpenAI that targets not the training of AI models, but the retrieval systems that power AI responses. Merriam-Webster (the most authoritative American dictionary, continuously published since 1828) joined as a plaintiff. The lawsuit introduces a legal theory that sets it apart from previous major AI copyright cases and could fundamentally reshape how AI search and retrieval systems are built and licensed.

The Critical Distinction: Training Data vs. RAG

To understand why this lawsuit is different, you need to understand the difference between two ways AI uses content. Previous major AI copyright lawsuits — the New York Times case, the author lawsuits — primarily target training data: the copyrighted content used to train the model's weights during the learning process. OpenAI's main defense has been fair use: the argument that training on copyrighted content to build a new type of product is transformative use, similar to how a human reads books to develop expertise without licensing each one. Courts have not yet resolved this argument definitively.

The Britannica/Merriam-Webster lawsuit targets something different: RAG. Retrieval-Augmented Generation is a system where, when a user asks ChatGPT a question, the model dynamically retrieves passages from current web content (or a curated document database) and incorporates those retrieved passages into its response. The lawsuit alleges that when users ask ChatGPT questions that Britannica articles answer well, ChatGPT's RAG system retrieves Britannica content from the web and reproduces substantial portions of it in its responses, without a license and without directing users to Britannica's site. This is not a claim about training data. It is a claim about live content reproduction, and the fair use argument is significantly harder to make for live reproduction than for training data use.
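
The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of how ChatGPT's retrieval actually works: the two-document corpus, the example.com URLs, the word-overlap scoring, and the function names retrieve and build_prompt are all hypothetical. What it shows is that the retrieved passage text is present verbatim in the prompt at answer time, rather than having only influenced model weights during training.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for indexed web pages.
DOCUMENTS = {
    "example.com/photosynthesis": (
        "Photosynthesis is the process by which green plants convert "
        "light energy into chemical energy."
    ),
    "example.com/volcano": (
        "A volcano is a vent in the crust of a planet from which "
        "molten rock and gas erupt."
    ),
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, k=1):
    """Rank documents by shared-word count with the query (toy scoring)."""
    query_words = Counter(tokenize(query))
    scored = []
    for url, text in documents.items():
        doc_words = Counter(tokenize(text))
        score = sum(min(query_words[w], doc_words[w]) for w in query_words)
        scored.append((score, url))
    scored.sort(reverse=True)
    # Drop documents that share no words with the query.
    return [url for score, url in scored[:k] if score > 0]


def build_prompt(query, urls, documents):
    """Paste the retrieved passages, verbatim, ahead of the question."""
    passages = "\n".join(f"[{url}] {documents[url]}" for url in urls)
    return f"Answer using these passages:\n{passages}\n\nQuestion: {query}"


if __name__ == "__main__":
    question = "What is photosynthesis?"
    hits = retrieve(question, DOCUMENTS)
    print(build_prompt(question, hits, DOCUMENTS))
```

A production system would typically use a search index or embedding similarity instead of word overlap, but the structure is the same: the source text travels into the prompt, and the model's answer is generated with that text in front of it.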

The Lanham Act Claim: Hallucinations Attributed to Britannica

The lawsuit includes a second, unusually creative legal theory: a Lanham Act claim based on hallucinations. The Lanham Act prohibits false designation of origin — essentially, falsely attributing content to a source. The plaintiffs allege that ChatGPT sometimes generates factually incorrect content and presents it as if it came from authoritative sources like Britannica and Merriam-Webster — creating the impression that Britannica endorsed the hallucinated content. This theory has not been tested in court for AI hallucination, and its success would create a novel liability framework for AI companies whose systems generate incorrect content while implying authoritative sourcing.

How This Case Fits Into the Growing Publisher vs. AI Litigation Landscape

New York Times v. OpenAI (ongoing): the NYT alleges both training data use and that ChatGPT reproduces NYT articles too closely in responses. OpenAI's main defense is fair use plus the argument that any close reproduction is a model bug, not an intended feature.
Ziff Davis and newspaper coalition lawsuits: a coalition of news publishers filed a coordinated set of lawsuits against AI companies in late 2024, all targeting training data use. These cases follow a similar theory to the NYT case.
The Authors Guild cases: George R.R. Martin, John Grisham, and other prominent authors filed suit over training data use of their works. These cases address the creative writing domain rather than factual reference content.
What distinguishes the Britannica/Merriam-Webster case: all previous major cases focus primarily on training data. Britannica's RAG theory is new. It targets the live inference process, where a fair use defense is much harder to sustain if the reproduction is substantial. If the RAG theory succeeds, it could require AI companies to license content for retrieval use separately from training use.

What the Outcome Could Mean for AI Products

If Britannica wins on the RAG theory: AI companies may be required to license content from publishers before including it in RAG retrieval databases. This would create a new content licensing market for AI retrieval — similar to how music streaming services license music — and could significantly increase the cost structure of AI search products.
If OpenAI wins on fair use: a ruling that RAG retrieval constitutes fair use would validate the current architecture of most AI search products and resolve the most acute near-term legal risk for AI companies that use live web retrieval.
Likely settlement territory: given the reputational and financial risks that an adverse ruling poses for both sides, settlement is the most probable outcome. The settlement terms, particularly whether they involve a licensing payment or a broader content licensing framework, may reveal more about where the industry is heading than a verdict would.
The Merriam-Webster dimension: Merriam-Webster's claim is particularly interesting because dictionary definitions are among the most frequently retrieved factual content in AI search responses. A finding that AI companies must license dictionary content for retrieval use would affect virtually every major AI search product.
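
In engineering terms, a licensing requirement for retrieval would amount to a gate between the retriever and the prompt. The sketch below is a minimal illustration under stated assumptions: the Passage type, the filter_licensed function, the publisher names, and the idea of a simple set of licensed publishers are all hypothetical, and a real licensing regime would likely involve terms far richer than a yes/no lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """A retrieved passage plus the publisher it came from (hypothetical)."""
    url: str
    publisher: str
    text: str


def filter_licensed(passages, licensed_publishers):
    """Keep only passages whose publisher appears in the licensed set."""
    return [p for p in passages if p.publisher in licensed_publishers]


# Hypothetical retrieval candidates and license registry.
candidates = [
    Passage("example.com/a", "Publisher A", "Passage text from A."),
    Passage("example.org/b", "Publisher B", "Passage text from B."),
]
licensed = {"Publisher A"}

# Only passages that pass the gate would be handed to the model.
allowed = filter_licensed(candidates, licensed)
```

Under this assumed design, unlicensed sources never reach the model's context, which is why a ruling for the plaintiffs could change both the cost and the answer quality of AI search products.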

For anyone following AI copyright law: the case to track most closely in 2026 may not be the Britannica case itself but the New York Times case, which will likely be resolved first and establish the fair use framework against which subsequent cases are evaluated. The NYT case covers both training data and response reproduction, so its outcome will define the boundaries within which the Britannica RAG theory must operate. The NYT verdict, when it comes, is likely to be among the most consequential legal events the AI industry has faced.
