In a recent decision, In re OpenAI, Inc. Copyright Infringement Litig., 2025 WL 3635559 (S.D.N.Y. Dec. 15, 2025), District Court Judge Stein ruled, for the first time in AI copyright litigation, that crawling a website in disregard of its robots.txt file does not violate the anti-circumvention provisions of the DMCA. The New York court also confirmed in a pleadings motion ruling that tenable claims (claims that can survive a pleading motion) can be made out by alleging copyright infringement in AI generated outputs, DMCA claims for intentional removal of copyright management information (CMI), and trademark dilution claims for using famous marks in generated AI output that does not contain content of the trademark owner. The ruling also confirmed prior opinions that claims for unjust enrichment related to training LLMs are pre-empted by the U.S. Copyright Act.
This post analyzes Judge Stein’s opinion in Ziff Davis v OpenAI. It examines how U.S. courts are treating copyright pre‑emption, DMCA anti‑circumvention claims based on the use of robots.txt on websites, DMCA copyright management information claims, trademark dilution, and contributory infringement in the context of LLM training and outputs. Among other things, this blog post addresses these questions:
- Are unjust enrichment claims based on LLM training pre‑empted by the U.S. Copyright Act?
- Can robots.txt qualify as a technological protection measure under the DMCA?
- Does ignoring robots.txt constitute circumvention under the DMCA?
- Can LLM providers face contributory infringement liability for outputs that contain copies of works used for training purposes?
- Can LLM providers be liable for trademark dilution?
- Is the OpenAI opinion relevant to AI litigation in Canada?
Background
The plaintiff, Ziff Davis, is a publisher of more than 45 digital media publications and internet brands including the well-known publications Mashable, CNET, ZDNET and PCMAG. It sued OpenAI for copyright and trademark infringement and violation of the U.S. Digital Millennium Copyright Act (DMCA) arising from OpenAI’s training and operation of its LLMs including its “Generative Pretrained Transformer” or “GPT” series of LLMs.
The pleading alleged, among other things, that OpenAI copied Ziff Davis’s works into its LLM training datasets using existing pools of scraped website content and by directly scraping content from Ziff Davis’s websites by ChatGPT’s webscraping tool, “GPTBot,” despite Ziff Davis’s inclusion of robots.txt files in its websites’ code, which allegedly instructed GPTBot not to scrape Ziff Davis’s websites. The claim further alleged that the outputs of OpenAI’s LLMs sometimes include content drawn or copied from Ziff Davis’s copyrighted works, and that some outputs of OpenAI’s LLMs misleadingly or inaccurately attribute content to Ziff Davis by the use of its famous marks in output.
Issue 1: Common Law Unjust Enrichment
Ziff Davis alleged that OpenAI has been unjustly enriched by taking Ziff Davis’s copyrighted works and using them to train OpenAI’s models without compensating Ziff Davis. In response, OpenAI contended that Ziff Davis’s unjust enrichment claim is pre-empted by the Copyright Act.
Judge Stein, following prior precedents on whether unjust enrichment claims related to training LLMs are pre-empted by the U.S. Copyright Act, agreed that such claims were pre-empted.
Under Section 301 of the Copyright Act, a state law claim is pre-empted when it (1) concerns “works of authorship that . . . come within the subject matter” of the Copyright Act (the “subject matter requirement”) and (2) seeks to vindicate “legal or equitable rights that are equivalent to any of the exclusive rights” protected by Section 106 of the Copyright Act (the “general scope requirement”).
The subject matter requirement was clearly met because the subject matter of the dispute related to literary works. The general scope requirement was also met as U.S. courts have generally found that unjust enrichment protects rights that are essentially ‘equivalent’ to rights protected by the Copyright Act. Thus, such claims related to the use of copyrighted material are generally pre-empted.
In reaching this conclusion, the court followed the prior AI decisions in Doe 1 v. GitHub, Inc., No. 22-cv-6823, 2024 WL 235217, at *7–8 (N.D. Cal. Jan. 22, 2024) (dismissing on preemption grounds claims, including unjust enrichment claims, “principally concern[ing] the unauthorized reproduction of” the plaintiff’s copyrighted works in training of generative AI because the claims “fall under the purview of the Copyright Act”); and Andersen v. Stability AI, 744 F. Supp. 3d 956, 972–73 (N.D. Cal. 2024) (dismissing as preempted claims based on the use of the plaintiffs’ copyrighted works in the training of a generative AI product).
Issue 2: Circumvention of Technological Measures
Ziff Davis implemented robots.txt files in its website code designed to control access to its copyrighted works. It claimed that these files were technological protection measures (TPMs) and that OpenAI improperly circumvented the TPMs to gain access to Ziff Davis’s works and thereby violated the DMCA. This claim was also dismissed based on the court’s holding that a robots.txt file was not a TPM under the DMCA and, in any event, there was no circumvention of any TPM.
Under the DMCA, an access control TPM is a measure that “effectively controls access to a work”. It does this if the measure, “in the ordinary course of its operation, requires the application of information, or a process or a treatment, with the authority of the copyright owner, to gain access to the work.” To “circumvent a technological measure” means to descramble a scrambled work, to decrypt an encrypted work, or otherwise to avoid, bypass, remove, deactivate, or impair a technological measure, without the authority of the copyright owner.
In an important ruling, the court rejected the argument that robots.txt code is a TPM, holding that it did not effectively control access to Ziff Davis’s copyrighted works. It was more akin to a “keep off the grass” sign on a lawn. According to the court:
According to the FAC, “robots.txt directives . . . are machine-readable instructions . . . which tell web crawlers which areas of the site the bot is allowed or disallowed from accessing and indexing” and, “[i]n order to bypass a robots.txt ‘disallow’ directive, a scraper must actively and intentionally override these explicit technical directives to access the protected content.” The FAC further alleges that OpenAI “actively ignore[d]” the robots.txt directive that OpenAI’s web crawlers not scrape Ziff Davis’s web pages.
These allegations do not establish that robots.txt files are a “measure designed to thwart unauthorized access” to Ziff Davis’s protected works. LivePerson, Inc. v. 24/7 Customer, Inc., 83 F. Supp. 3d 501, 510–11 (S.D.N.Y. 2015). Robots.txt files instructing web crawlers to refrain from scraping certain content do not “effectively control” access to that content any more than a sign requesting that visitors “keep off the grass” effectively controls access to a lawn. On Ziff Davis’s own telling, robots.txt directives are merely requests and do not effectively control access to copyrighted works. A web crawler need not “appl[y] . . . information, or a process or a treatment,” in order to gain access to web content on pages that include robots.txt directives, 17 U.S.C. § 1201(a)(3)(B); it may access the content without taking any affirmative step other than impertinently disregarding the request embodied in the robots.txt files. The FAC therefore fails to allege that robots.txt files are a “technological measure that effectively controls access” to Ziff Davis’s copyrighted works, and the DMCA section 1201(a) claim fails for this reason.
The court also ruled that OpenAI had not circumvented any TPM. At most “Ziff Davis alleges that OpenAI disregarded the instructions that were contained in robots.txt files. This is not ‘circumvention’ under the DMCA, and Ziff Davis’s claim fails for this reason as well”.
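The technical point behind the court's "keep off the grass" analogy can be illustrated with a short sketch. It uses Python's standard `urllib.robotparser` module; the robots.txt content, the `example.com` URL, and the helper names `is_allowed` and `crawler_will_fetch` are hypothetical and are not taken from the pleadings or the opinion. The sketch is only meant to show that honouring robots.txt is a check the crawler's operator chooses to run, and that a crawler configured to skip the check needs no key, password or other "information" to proceed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (assumption: not the actual Ziff Davis file).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what the robots.txt directives *request* for this bot and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def crawler_will_fetch(respect_robots: bool, user_agent: str, url: str) -> bool:
    """Model a crawler's decision to request a page.

    The robots.txt check only runs if the operator opted in. When
    respect_robots is False the crawler simply proceeds; nothing in the
    file itself stands in the way of the HTTP request.
    """
    if respect_robots:
        return is_allowed(ROBOTS_TXT, user_agent, url)
    return True


url = "https://example.com/articles/review"  # hypothetical page
print(is_allowed(ROBOTS_TXT, "GPTBot", url))          # directive says: disallowed
print(crawler_will_fetch(True, "GPTBot", url))        # compliant bot stays out
print(crawler_will_fetch(False, "GPTBot", url))       # non-compliant bot proceeds
```

By contrast, a measure such as a login wall or encryption would fail the request unless the crawler supplied a credential or key, which is closer to the statutory language about applying "information, or a process or a treatment" to gain access.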
After losing the motion, Ziff Davis attempted to amend its claims to establish that robots.txt files are a “technological measure that effectively controls access” to Ziff Davis’s copyrighted works and that OpenAI circumvented that TPM in violation of section 1201(a) of the DMCA. In a decision released three days after the DMCA claims were dismissed, the court in In re OpenAI, Inc. Copyright Infringement Litigation (This Document Relates To: Ziff Davis et al. v. OpenAI et al., 25-cv-4315), 2025 WL 3678672 (S.D.N.Y. Dec. 18, 2025), dismissed the motion as the amendment would be “futile”.[i]
Issue 3: Distribution of Works with Copyright Management Information Removed
Ziff Davis alleged that OpenAI distributed copies of Ziff Davis’s copyrighted works with the copyright management information (“CMI”) removed in violation of section 1202(b)(3) of the DMCA. OpenAI was unsuccessful in getting this claim struck.
Issue 4: Trademark Dilution
Ziff Davis claimed that its marks are famous and that OpenAI impermissibly diluted those marks by providing the marks in LLM outputs unrelated to Ziff Davis’s copyrighted content in violation of 15 U.S.C. § 1125(c). This claim partially succeeded in relation to the Mashable trademark, the only mark that the court found to be famous under the pleading.
Issue 5: Claims for Contributory Infringement and DMCA Section 1202(b)(1)
Ziff Davis alleged that OpenAI contributed to the infringement by end users of OpenAI’s LLM-based products and that OpenAI intentionally removed CMI when copying Ziff Davis’s works to build its training datasets in violation of the DMCA. The court denied OpenAI’s motion to dismiss those claims for the reasons set forth in the court’s prior opinion in New York Times Company v. Microsoft Corporation, 777 F.Supp.3d 283 (S.D.N.Y. 2025).
In sum, Ziff Davis has stated a claim for contributory infringement because “knowledge of specific infringements is not required to support a finding of contributory infringement” and, by alleging “widely publicized instances of copyright infringement” and “numerous examples of infringing outputs in” the FAC, Ziff Davis plausibly alleged end-user infringement and that OpenAI possessed actual or constructive knowledge of this third-party infringement. Ziff Davis states a claim under section 1202(b)(1) of the DMCA because the FAC alleges a harm that bears a close relationship to traditional copyright infringement, which is sufficient to satisfy the injury-in-fact requirement of Article III of the U.S. Constitution, and because the FAC alleges that OpenAI removed CMI from copies of Ziff Davis’s works with actual or constructive knowledge that such removal could facilitate end-user infringement.
Comments and takeaways
The opinion in Ziff Davis is important for a number of reasons including the following:
- It confirms prior decisions that under U.S. law, copyright pre‑emption applies to state law claims of unjust enrichment based on AI training. Similar claims of unjust enrichment are being made in Canadian litigation by publishers in the Toronto Star v OpenAI case in Ontario.
- It held that robots.txt is not a DMCA access‑control TPM and that disregarding robots.txt directives in website code is not circumvention under the DMCA. The publishers in the Toronto Star v OpenAI case in Ontario have made similar claims against OpenAI.
- The opinion confirms prior decisions in the Getty and Cohere cases.
- The opinion recognizes that an LLM provider may be liable for DMCA claims in the United States where its output does not include CMI originally associated with a work used for training or where it removes CMI from copies of works used for training with actual or constructive knowledge that such removal could facilitate end-user infringement.
- The Canadian Copyright Act was amended by the Copyright Modernization Act in 2012 to provide copyright holders with rights in Rights Management Information (a.k.a., RMI or CMI) and legal protections for TPMs to ratify the WIPO Internet Treaties. The Canadian protections for RMI and TPMs have similarities to the CMI and TPM provisions in the DMCA. It is therefore conceivable that a Canadian court would refer to the U.S. cases in any infringement claim in Canada related to RMIs and TPMs.
Endnote
[i] According to the court:
The PSAC’s allegations, taken as true, establish that robots.txt files are not a “technological measure that effectively controls access to” Ziff Davis’s copyrighted works. The PSAC thus fails to state a claim pursuant to section 1201(a) of the DMCA, and Ziff Davis’s proposed amendment would be futile.
The PSAC alleges that, “[t]o protect content from scraping and other bot activity, websites use robots.txt directives, which are machine-readable instructions in source text files hosted on the website’s server that follow the Robots Exclusion Protocol (‘REP’) syntax, and which tell web crawlers which areas of the site the bot is allowed or disallowed from accessing and indexing.” (PSAC ¶ 120.) “REP is a universal technical standard used by website owners to direct and govern the activity of web crawlers accessing their sites.” (Id. ¶ 121.) “Reputable companies that operate crawlers … configure their bots to automatically access and read the robots.txt code at the website’s root domain before initiating any crawling or scraping activity and, in turn, automatically process and effectuate the REP.” (Id. ¶ 122.) “[A]utomated bots will comply with those mandates unless they have been affirmatively configured to ignore or bypass the REP.” (Id.) “In its default configuration, an REP-compliant bot is programmed such that it cannot scrape websites that disallow crawling” through the use of robots.txt files. (Id. ¶ 124.) “OpenAI publicly promoted and privately encouraged website owners to use robots.txt instructions to prevent unauthorized crawling by its GPTBot” (id. ¶ 134) but, after OpenAI employees found that GPTBot’s compliance with robots.txt directives hindered their work (id. ¶¶ 125, 129–31), OpenAI “affirmatively reconfigured its bot to bypass robots.txt code on websites” and resumed scraping Ziff Davis’s websites despite those sites’ robots.txt files (id. ¶ 144).
On the PSAC’s telling, robots.txt files prevent a bot from accessing Ziff Davis’s works only if the creator of the bot affirmatively elects to configure the bot to read and comply with robots.txt files before the bot scrapes the website. If the bot creator does not take this affirmative step of choosing to heed the requests embodied in Ziff Davis’s robots.txt files, the robots.txt files have no effect on the bot’s ability to access Ziff Davis’s works. The allegation that all “reputable” bot operators configure their bots to comply with robots.txt files by default (id. ¶ 122) does not change the precatory nature of the requests embodied in robots.txt files. According to the PSAC, robots.txt files do not, “in the ordinary course of [their] operation, require[ ] the application of information, or a process or a treatment, with the authority of [Ziff Davis], to gain access to [Ziff Davis’s] work[s].” 17 U.S.C. § 1201(a)(3)(B). To the contrary, robots.txt files require an affirmative action by the creator of a bot for the files to impede access to Ziff Davis’s works. The PSAC therefore establishes that robots.txt files are not a “technological measure that effectively controls access to” Ziff Davis’s works within the meaning of the DMCA.