In recent months, U.S. federal courts have delivered pivotal decisions regarding the use of copyrighted books in training artificial intelligence (AI) models, offering nuanced interpretations of the fair use doctrine in the context of AI development.
In June 2025, Judge William Alsup of the Northern District of California ruled that Anthropic’s use of copyrighted books to train its AI model, Claude, constituted fair use. The court emphasized the transformative nature of training: the process did not copy the books’ expressive content for consumption or redistribution but extracted statistical patterns and relationships that enable the model to generate new outputs. That purpose, the court found, was fundamentally different from the original purpose of the works and consistent with the fair use doctrine’s protection of learning and innovation. The court distinguished this, however, from Anthropic’s creation of a central digital library using pirated books, which it held did not qualify as fair use; a trial was scheduled for December 2025 to assess damages. ([ballardspahr.com](https://www.ballardspahr.com/insights/alerts-and-articles/2025/07/novel-ruling-offers-framework-for-fair-use-of-copyrighted-material-for-training-ai-systems?utm_source=openai))
Similarly, Judge Vince Chhabria, also of the Northern District of California, addressed a case brought against Meta Platforms. The plaintiffs alleged that Meta used libraries of copied books to train its Llama large language models. Judge Chhabria ruled that, because the authors had not presented meaningful evidence of market dilution, the copying and training were fair use. ([whitecase.com](https://www.whitecase.com/insight-alert/two-california-district-judges-rule-using-books-train-ai-fair-use?utm_source=openai))
These decisions underscore the courts’ recognition that AI training can be a transformative use of copyrighted material under the fair use doctrine. They also show that how training data is acquired matters: the Anthropic ruling drew a clear line at the use of pirated content, holding that a library built from pirated books fell outside fair use even though the training itself did not.
Together, the rulings give AI developers a working framework: acquire training data lawfully and use copyrighted material in a transformative way. As the legal landscape continues to evolve, these cases will likely serve as benchmarks for future disputes at the intersection of AI development and copyright law.