What Do AI Copyright Cases Tell Us About the Future of AI Regulation?

“Recent decisions give us some guidance about how to frame AI regulation, but only partial guidance.” – Vincent Bergeron
If there’s a narrative emanating from the host of copyright suits against artificial intelligence (AI) companies across the globe, including 40 in the United States (U.S.) alone, it doesn’t augur well for a harmonised international regulatory scenario.
“So far, the decisions are telling us that you’ll likely have to look at individual jurisdictions if you want guidance for companies that train models accessing copyrighted data globally,” says Vincent Bergeron, leader of the emerging technologies group at Montreal-based Robic LLP, a member of the IPH Limited Group.
Courts are currently using traditional principles to adjudicate cases involving copyrighted material used for AI training. And while there may be similarities, or at least the appearance of similarities, in some of the legal principles cited by courts in different jurisdictions, the way these principles are applied can produce startlingly different results.
“In the U.S., for example, AI companies rely on the fair use exception to justify accessing copyrighted material to train their models,” Bergeron says. “In Canada, we have a sound-alike principle called ‘fair dealing’, but it’s not applied in the same way, so the very same facts can produce different results in the two countries.”
Differences aside, however, a close look at U.S. jurisprudence is warranted.
“What happens in the U.S. always affects the rest of the world,” Bergeron says.
The difficulty for regulators seeking guidance, however, is that the caselaw to date is hardly consistent.
Thomson Reuters v. Ross Intelligence, released in February, was the first major American decision on copyright liability in AI training. Judge Stephanos Bibas of the U.S. Court of Appeals for the Third Circuit, sitting by designation in the District of Delaware, ruled that Ross had infringed Thomson Reuters's (TR) copyright when it used Westlaw's legal headnotes to train its legal research AI product.
Ross had tried to license the headnotes, but TR had refused. So, Ross acquired legal memos based on Westlaw’s headnotes from a third party and used them.
Bibas reasoned that because the memos were substantially similar to the Westlaw headnotes and Ross had copied the memos as-is, Ross had infringed TR's copyright. The fact that the headnotes did not appear in Ross's AI end product did not change the fact that they had been copied for commercial purposes, albeit at an intermediate stage. And Ross's fair use defence failed largely because its commercial and excessive use negatively affected the market for Westlaw headnotes.
More recent cases, however, decided by different judges in the U.S. District Court for the Northern District of California, have held that reproducing copyrighted works to train large language models (LLMs) is fair use. But the consistency of the results the courts reached in Bartz v. Anthropic and Kadrey v. Meta masks serious divergence in their analysis on three key issues: whether training an LLM is a distinct transformative purpose; whether the use of pirated copies is significant; and whether, and to what extent, the reproduction affects the market.
The upshot is that authorities looking to the courts for guidance on regulating AI’s use in training LLMs and protecting copyright will not find definitive answers in the judgments to date.
“These recent decisions give us some guidance about how to frame AI regulation, but only partial guidance,” Bergeron says.
As well, there is little consensus among academics, the industry, and politicians as to whether or how much AI sector regulation is necessary. The debate, as always, centres on striking a balance between driving innovation and ensuring equitable access to technology.
All of that is not to say that AI regulation is at a standstill.
Leading the pack is the European Union, which enacted its Artificial Intelligence Act, Regulation (EU) 2024/1689 (the EU AI Act). The Act took effect in August 2024, with its rules being phased in over three years. The legislation requires LLM providers to comply with copyright law, provide detailed summaries of the content they access for training purposes, avoid pirated data, respect paywalls and minimise infringing outputs. Creators have the right to limit access to their data.
Elsewhere, the United Kingdom (U.K.) government is consulting on proposals to change current laws, which exempt text and data mining from copyright protection only for non-commercial use; Japan allows access to datasets for training AI models; and Singapore permits unauthorised access to copyrighted content for non-commercial computational data analysis, as well as for AI training by those with lawful access.
Canada’s first attempt at regulating AI was the Artificial Intelligence and Data Act (AIDA). The legislation died in committee when then-Prime Minister Trudeau prorogued the Canadian Parliament earlier this year. Although AIDA did not specifically address copyright issues, the previous government did consult on copyright and generative AI. These consultations explored the need for rights-holder consent for AI training and liability for AI-generated output that infringes copyright.
Even in the anti-regulation-minded U.S., where the Trump administration’s “One Big Beautiful Bill” originally contained a 10-year moratorium on states’ rights to regulate AI (stripped from the final version under bipartisan pressure), there is movement to regulate LLMs’ access to copyrighted works.
In mid-July, the U.S. Senate Judiciary Committee’s Subcommittee on Crime and Counterterrorism held a hearing titled “Too Big to Prosecute?: Examining the AI Industry’s Mass Ingestion of Copyrighted Works for AI Training.” Josh Hawley, the Subcommittee’s Republican chair, called the practice “the largest IP theft in American history” and opposed the idea that the courts, as opposed to Congress, should deal with the issue.
Within a week, Hawley and Democrat Richard Blumenthal introduced the bipartisan AI Accountability and Personal Data Protection Act, which would prohibit AI companies from using copyrighted works for training purposes without the authors’ permission.
But two days later, the Trump administration released “Winning the AI Race: America’s AI Action Plan”. The plan runs directly contrary to the Biden administration’s policies for managing AI risks, instead aiming to make it easier for AI companies to operate. It also penalises states whose AI regulations the federal government considers too burdensome (instead of banning state AI regulation outright, as the original legislation proposed).
So, whether and when further regulation governing the relationship between AI and copyright will come to fruition in the U.S. or elsewhere is anyone’s guess.
“It’s hard to predict,” Bergeron says. “The debate used to be about the need for regulation and ethical considerations, but it now feels political in a way that it wasn’t previously.”