The New York Times sued OpenAI and Microsoft in December 2023, alleging they used millions of Times articles to train ChatGPT without permission. This is the case that will decide whether AI companies can legally scrape copyrighted content to build their models, and the answer affects every firm advising content creators, publishers, or AI companies.


Background

The New York Times invests hundreds of millions of dollars annually in journalism. Its reporting staff produces original content that drives subscriptions, advertising revenue, and the paper's reputation as a primary source. OpenAI and Microsoft used that content, along with vast amounts of other internet text, to train large language models including GPT-3.5 and GPT-4, which power ChatGPT and Microsoft's Copilot products.

The Times alleges that OpenAI's models can reproduce its articles nearly verbatim when prompted correctly. This isn't a theoretical concern. The complaint includes examples of ChatGPT outputting large passages of Times reporting word-for-word, effectively giving users the content without visiting the Times website or paying for a subscription.

Before filing suit, the Times and OpenAI attempted to negotiate a licensing deal. Those talks broke down, and the Times filed in the Southern District of New York on December 27, 2023. The case was assigned to Judge Sidney H. Stein and later consolidated into a multidistrict litigation (MDL No. 1:25-md-03143) with similar suits from other news organizations.

The New York Times Co. v. Microsoft Corp. (OpenAI)
Case No.: 1:23-cv-11195 (S.D.N.Y.)
Court: U.S. District Court, Southern District of New York
Filed: December 27, 2023
Category: AI Liability / Copyright
Sanctions: None
AI Case Law — Updated April 2026

What Happened

OpenAI and Microsoft moved to dismiss the case, arguing that training AI models on publicly available content constitutes fair use under copyright law. They pointed to precedent suggesting that transformative uses of copyrighted material don't require permission, and that AI models don't simply copy content but learn patterns from it.

The Times fired back with evidence that ChatGPT could regurgitate its articles almost word-for-word. The paper argued this goes far beyond transformative use. It's substitution. If a user can get Times content from ChatGPT, the Times loses that reader, that subscription, that ad impression.

On April 4, 2025, Judge Stein largely denied the motion to dismiss. The court found the Times had adequately alleged its copyright infringement claims and that fair use couldn't be resolved without factual development. The case is now headed toward discovery and eventually trial.


The Ruling

Judge Stein's April 2025 ruling kept the core claims alive. The court held that the Times adequately alleged that OpenAI and Microsoft copied its copyrighted works to train their models and that the resulting AI outputs can substitute for the originals. The fair use defense, which is the AI industry's primary legal shield, couldn't be decided on the pleadings alone.

The court rejected the defendants' argument that AI training is inherently transformative. Whether it qualifies as fair use depends on facts that haven't been developed yet: how the training data was used, whether the outputs compete with the originals, and the economic impact on the Times.

The case is now in MDL proceedings with similar suits from other publishers. No trial date has been set, but the discovery phase will force OpenAI to reveal details about its training data, data acquisition practices, and the technical relationship between training inputs and model outputs.

Outcome: On April 4, 2025, Judge Stein largely denied the defendants' motion to dismiss, allowing the suit to proceed toward discovery and trial. A trial date has not been set. The case is now part of MDL No. 1:25-md-03143.

Why This Case Matters

This is the AI copyright case. Its outcome will set the rules for the entire generative AI industry's relationship with copyrighted content. If the Times wins, AI companies face massive liability for training on copyrighted material without licenses. If OpenAI wins on fair use, content creators lose a major tool for controlling how their work is used.

The financial stakes are staggering. OpenAI is valued at over $150 billion. The Times alone seeks billions in damages. Multiply that across every publisher, author, and content creator with similar claims, and you're looking at an existential question for the current AI business model.

For law firms, this case matters on two levels. First, it's a massive source of potential work: copyright litigation, licensing negotiations, content creator representation, and AI company defense. Second, it will define whether the legal content firms produce (briefs, memos, articles) is fair game for AI training. Every firm that publishes anything online has skin in this game.


Lessons for Attorneys

Attorneys advising content creators need to track this case closely. The discovery phase will reveal how AI training datasets are assembled, what role specific copyrighted works play in model outputs, and whether there's a meaningful distinction between 'learning from' and 'copying' content. That factual record will shape advice to clients for years.

Firms advising AI companies should be preparing for a world where licensing is mandatory. Even if fair use ultimately prevails, the legal uncertainty alone is pushing AI companies toward licensing deals. Google, OpenAI, and others have already signed agreements with major publishers. The firms that understand the licensing landscape will capture that work.

For managing partners thinking about their own firm's content: everything you publish online is potentially training data for AI models, including blog posts, case analyses, client alerts, and whitepapers. This case will determine whether you have any say in that. In the meantime, review your website's terms of service and consider whether your robots.txt file addresses AI crawlers.
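As a starting point, the major AI crawlers publish user-agent tokens that robots.txt can address. A minimal sketch (the tokens shown, GPTBot, CCBot, and Google-Extended, are the ones their operators have published; verify the current list before relying on it):

```text
# Block OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# Block Common Crawl (a frequent source of AI training data)
User-agent: CCBot
Disallow: /

# Opt out of Google's AI training uses (does not affect Search indexing)
User-agent: Google-Extended
Disallow: /
```

Note that robots.txt is a voluntary convention, not a legal or technical barrier: a crawler can ignore it, which is part of why the licensing and fair use questions in this case matter.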


The Bottom Line

NYT v. OpenAI is the case that will decide whether AI companies need permission to train on copyrighted content. Every attorney advising content creators, publishers, or AI companies needs to watch the discovery phase closely, because the facts that come out will reshape copyright law for the AI era.

AI-Assisted Research. This piece was researched and written with AI assistance, reviewed and edited by Manu Ayala. For deeper takes and the perspective behind the research, follow me on LinkedIn or email me directly.