
Training on Thrones: Winter is Coming for Originality

Image by HBO

On October 27, 2025, U.S. District Judge Sidney Stein sent a chill through Silicon Valley: winter is coming for generative AI. In a pretrial Order & Opinion denying OpenAI’s motion to dismiss in In re OpenAI, Inc. Copyright Infringement Litigation, Judge Stein held that the class actions’ allegations that OpenAI infringed plaintiffs’ works “satisfy the elements of a prima facie claim of infringement as to at least some outputs of ChatGPT.” Specifically, the class asserted that OpenAI downloaded and reproduced works to create infringing derivative works such as summaries and imagined sequels to popular fiction, including George R.R. Martin’s A Song of Ice and Fire series, most notably A Game of Thrones.

The ruling contrasts with Judge Stein’s earlier position in The New York Times Company v. Microsoft Corp., where he found that AI summaries of news articles were not “substantially similar” to the originals, highlighting a double standard: while human “borrowing” in art is often celebrated, similar “borrowing” by AI is condemned. Judge Stein’s finding marks a critical point for reconciling the protection of authors with the new reality of AI’s creative use of copyrighted material. This article argues that a compulsory licensing system, modeled on the music industry’s licensing infrastructure, offers a viable solution.


The Class Action Against OpenAI

In In re OpenAI, Inc. Copyright Infringement Litigation, Judge Stein allowed a first-of-its-kind lawsuit against OpenAI and Microsoft to proceed, holding that plaintiff-authors had plausibly alleged ChatGPT’s summaries of their novels crossed into protected expression. Judge Stein concluded that “a more discerning observer could reasonably conclude that the allegedly infringing outputs are substantially similar to [George R.R.] Martin’s original work.” ChatGPT summaries, he noted, “convey the overall tone and feel of the original work by parroting the plot, characters, and themes of the original.” 

In a footnote, Judge Stein distinguished this case from the N.Y. Times case. In the N.Y. Times case, AI-generated article summaries restated unprotected facts and differed “in style, tone, length, and sentence structure.” However, in the current class action, the outputs allegedly incorporated protectable narrative components.


Photo by Dima Solomin on Unsplash

Background

The authors’ lawsuit forms part of a broader surge of copyright litigation triggered by generative AI. Beginning in September 2023, prominent writers, including George R.R. Martin, joined the Authors Guild in filing class actions alleging OpenAI copied their novels without authorization to train ChatGPT. These cases, alongside actions by The New York Times and other publishers, were consolidated before Judge Stein in April 2025. 

In the N.Y. Times case, plaintiffs allege OpenAI “scraped directly from The Times’s websites” and “removed copyright-management information from The Times’s works in violation of 17 U.S.C. § 1202(b)(1),” allowing ChatGPT to generate summaries or even near-verbatim reproductions of Times content that, they contend, can circumvent paywalls. There, the AI outputs were article summaries conveying facts that “differed in style, tone, length, and sentence structure” from the originals.

By contrast, in the present action, the authors allege that ChatGPT’s outputs incorporate protectable narrative elements, such as plot, setting, and characters. As a result, Judge Stein allowed the contributory infringement claims to proceed.

In a separate opinion in The New York Times Co. v. Microsoft Corp. and the related news-publisher actions, Judge Stein rejected OpenAI’s statute-of-limitations defense as to specific direct infringement claims linked to model training and noted that The Times had attached more than one hundred pages of allegedly infringing output examples in Exhibit J to its complaint, which he found sufficient at the pleading stage to allege third-party infringement for contributory liability. In that same opinion, however, the Court dismissed the Center for Investigative Reporting’s “abridgment” summary output claims as not substantially similar and narrowed parts of the Digital Millennium Copyright Act (DMCA) claims, emphasizing that factual reporting receives thinner protection than purely expressive works.

Taken together, Judge Stein’s rulings draw a line: detailed retellings of fictional stories can cross into infringement, whereas retellings of news events generally do not.

Meanwhile, OpenAI insists its use of public text is transformative fair use, likening AI training to a more advanced search engine indexing the web. OpenAI argues ChatGPT does not store entire books or articles verbatim, but “learns” from them to generate new text. Until now, whether an AI’s outputs infringe copyright had largely remained untested. Judge Stein’s order is the first to hold that a large language model’s responses can potentially violate copyright, especially when it retells someone’s fictional story without authorization. The decision sets a new frontier for how far AI can go in mimicking creative works, and under what conditions. 

When Do AI Outputs Infringe? 

The dispute forces courts to reconcile traditional copyright doctrine with AI’s new capabilities. Under the Copyright Act, authors hold exclusive rights to reproduce their work, create derivatives, and distribute copies. To prove infringement, a copyright owner must show: (1) the defendant actually copied the plaintiff’s work, and (2) the copying appropriated protected expression, meaning the two works are substantially similar in protectable elements.

In the context of generative AI, the first prong, actual copying, is rarely in dispute. Here, the Authors Guild plaintiffs in In re OpenAI, Inc. Copyright Infringement Litigation allege that OpenAI ingested their novels to train its model, a claim distinct from the news-summary allegations raised in the New York Times case. OpenAI does not dispute using these texts in training; instead, the litigation centers on the second prong: whether ChatGPT’s outputs are “substantially similar” to the protectable elements of the authors’ books.

Judge Stein determined that this question could go to a jury. Because the authors’ works are highly expressive and contain fictional characters, plot, and dialogue, Judge Stein applied the “more discerning observer” test. This filters out unprotectable elements and asks if a reasonable juror could find substantial similarity in the protected aspects. The answer was yes: even though the AI-written summaries do not copy every detail, they “most certainly [are] attempts at abridgment or condensation of some of the central copyrightable elements of the original works such as setting, plot, and characters,” essentially creating unauthorized abridged versions.

One ChatGPT output, for example, summarized Martin’s A Game of Thrones in a few paragraphs, recounting the major storylines and characters. Another output outlined a hypothetical sequel in “Westeros,” the fictional continent where much of Martin’s series takes place, including new characters and plot twists in Martin’s world. These AI-generated texts “include many specific details” from Martin’s books and mimic their narrative arc. In Judge Stein’s view, that amounts to copying protected expression, enough to plausibly infringe if not justified by a defense.

Fair Use and the Limits of “Transformation”

A likely next step for OpenAI will be to invoke the fair use doctrine, which permits certain unauthorized uses of copyrighted material. Under section 107 of the Copyright Act, whether a use constitutes “fair use” is evaluated based on four factors: (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work.

Section 107 identifies uses such as criticism, commentary, news reporting, teaching, and research as favored purposes. In Campbell v. Acuff-Rose, the Supreme Court found that 2 Live Crew’s parody of “Oh, Pretty Woman” constituted fair use because it transformed the original song’s meaning and targeted a different market. Campbell demonstrates that even commercial copying can be fair use if the use is substantially transformative and survives the market-impact analysis.

More recently, in Authors Guild v. Google, Inc., the Second Circuit addressed whether Google’s digitization of copyrighted books to build a search index constituted fair use. The court held that it did: the use was transformative because it repurposed the books as searchable data, serving a different goal and audience than the original works (notably, Google allowed users to see only snippets of the copyrighted books).

Unlike the parody in Campbell or the search index in Authors Guild v. Google, this case alleges that ChatGPT’s outputs are not commentary or new art; they are summaries and sequels that can serve as replacements for the original books. Judge Stein held that, at the pleading stage, the outputs are sufficiently similar to the original novels to constitute prima facie infringement. Deciding later whether this constitutes transformative fair use will require a fact-intensive analysis of purpose and market effect.


Photo of 2 Live Crew by Luke Records

The Double Standard in Creative Borrowing

This ruling exposes a broader double standard in how we view artistic “borrowing.” Human creators have always borrowed from each other’s work, and society often celebrates it. For example, in music, sampling has been a creative way to take pieces of existing work and transform them into something new. Hip-hop and pop producers frequently take snippets of existing songs and remix them into new works.

Take Kanye West. His albums “epitomize the artistic creativity [. . .] of sampling,” building hit songs from soul, rock, and electronic samples woven together. By manipulating and reassembling classic tracks—from Chaka Khan’s vocals in “Through the Wire” to Daft Punk’s electronic beat in “Stronger”—West created new art that resonated with millions. Critics often praise such sampling as inventive, even as the practice walks a fine line with copyright law. In fact, Kanye has faced numerous lawsuits over the use of unlicensed samples throughout his career. Yet, instead of branding him a thief, the music industry largely legitimized his borrowing through licensing, sample clearance, and royalty payments, ensuring that the original artists are compensated. 

The cultural tolerance for borrowing changes when AI is involved. For example, Game of Thrones fan fiction rarely faces backlash and is viewed as a creative and inspirational endeavor. However, if ChatGPT creates a prequel to the series, engaging in similar borrowing to human fan fiction authors, the reaction is far less forgiving—a “Stark” double standard. This shows the ever-growing legal dilemma of whether machines should be held to a stricter standard than human authors. Machine borrowing is not as sympathetic, arguably because there is a higher degree of mistrust that machines could divert traffic or revenue from the original authors, more so than human “borrowers.” The current case makes it ever more pressing that the law decides whether this double standard will continue for AI borrowing or whether there will be an equitable and consistent system for human and machine borrowers.

Thus, the law faces a challenge. Can it hold machines to a stricter rule than people, or must it find a consistent principle for all creative borrowing?

From “Thou Shalt Not Steal” to Licensing

History suggests that outright condemnation of new art forms eventually gives way to accommodation. The music industry’s battle over sampling offers a telling parallel. Early on, courts took a hard stance against digital sampling. In 1991, the opinion in Grand Upright Music Ltd. v. Warner Bros. Records, Inc., one of the first cases to address digital sampling, opened with the phrase “‘Thou shalt not steal.’” (quoting Exodus 20:15). A decade later, the Sixth Circuit in Bridgeport Music, Inc. v. Dimension Films went so far as to declare that no de minimis defense exists for digital samples.

At issue in Bridgeport was a two-second guitar riff comprised of an arpeggiated chord, “three notes that, if struck together, comprise a chord but instead are played one at a time in very quick succession,” from the Funkadelic song “Get Off Your Ass and Jam.” The chord was sampled in N.W.A.’s “100 Miles and Runnin’,” which was included in the soundtrack of the movie I Got the Hook Up. The court held that no sample, however small, is insubstantial enough to escape copyright protection.

The Sixth Circuit’s strict stance could have chilled the vibrant art of music sampling entirely. Yet, instead of snuffing out creativity, those decisions prompted an industry adaptation: licensing and sample clearance. Producers now clear samples in advance and route royalties back to the original rightsholders. Within that system, Kanye West is free to produce sample-heavy music, but only because those samples are cleared and the original artists get paid. By contrast, OpenAI trained its models on millions of books and articles before any comparable licensing system existed for literary works. As a result, training and outputs occur at scale with no built-in royalty stream for original authors.

A remedy for the use of copyrighted music is the statutory mechanical compulsory license, which allows anyone to make and distribute phonorecords of a previously released nondramatic musical work, so long as they comply with the statute and pay the prescribed royalty. Beyond that, the music industry has built a broader licensing regime in which users of copyrighted works, whether for sampling, streaming, or performance in a public venue, pay to borrow the music. Collective rights organizations, like the American Society of Composers, Authors and Publishers (ASCAP), Broadcast Music, Inc. (BMI), and the Mechanical Licensing Collective (MLC), collect those royalties and distribute them, usually to songwriters and music publishers. This system strikes a balance between “protecting an artist’s interest and depriving other artists of the building blocks of future works.” Bridgeport Music, Inc. v. Dimension Films, 401 F.3d 647 (6th Cir. 2004).

Generative AI lacks a similar framework, even though it essentially samples creative work. If every AI summary or stylistic mimicry is treated as presumptive infringement, the law risks recreating the “Thou shalt not steal” regime. But outright prohibition of AI “sampling” is neither practical nor culturally desirable in the long run, just as banning musical sampling would have stifled an art form. A better solution is necessary.


Image by HBO

Proposed Solution 

Just as the music industry has used collective licensing and royalty payments to resolve the tension between creativity and copying, literature should follow suit in light of the generative AI boom. A comparable collecting system, modeled on ASCAP and BMI, would allow AI developers continued access to creative literary data for training and subsequent output while ensuring that authors are compensated. This could be accomplished through a voluntary system in which AI firms join a licensing collective (as in the music industry) or through the legislative process, with Congress enacting statutory licensing for AI. Recently, Anthropic settled a class-action suit brought by authors for $1.5 billion, a massive indication that “Winter is coming” for AI models that ignore creators. A licensing regime could curb the endless litigation that will otherwise continue to follow.

There are key differences between the music industry’s licensing regime and AI’s use of literary works that would need to be addressed to make this system effective. For example, having a single entity control all the data used to train AI models could raise antitrust concerns. Additionally, AI models train on vast amounts of data, which makes the sources of that data, and any potentially infringing content, more difficult to trace. The end user of an infringing output can also be harder to identify than a person who streams a piece of music. However, these logistical issues are not fatal flaws, and copyright law will have to evolve as it has with prior technological innovation.

Conclusion

Judge Stein’s ruling marks a doctrinal turning point. It places generative AI within the reach of copyright, rejecting the convenient fiction that training and outputs lie outside authors’ exclusive rights. Judge Stein’s recognition forces lawmakers and industry to choose between two paths: endless, case-by-case litigation, or a licensing architecture that treats creative works as paid inputs rather than free raw material. A statutory or collective licensing regime, modeled on music sampling, would realign incentives by requiring models built on authors’ worlds to fund the labor that sustains them. If “winter is coming” for generative AI, it should arrive as a “Wall” of enforceable rights, and crossing it should require lawful licensing rather than uncompensated appropriation.    


*The views expressed in this article do not represent the views of Santa Clara University.
