B.C. author leads 'David against Goliath' lawsuits alleging big tech used writers' works to train AI
Claims allege artificial intelligence companies are illegally mining copyrighted books of Canadian authors

A best-selling Vancouver author has launched a class-action lawsuit against Nvidia, claiming the multi-trillion-dollar tech company illegally used his and other Canadian writers' works to train artificial intelligence large language models (LLMs).
J.B. MacKinnon is named as the representative plaintiff in the claim, which says his books, The 100-Mile Diet and The Once and Future World, were part of a 196,640-book dataset that Nvidia used without paying a licensing fee or securing consent to acquire or use the works.
"This isn't a situation where some copyrighted material appears as a small fraction of the larger process. The models were entirely built on the mining of copyrighted work," MacKinnon told CBC.
"The most disturbing aspect of it is that those large language models and AIs can then compete with human writers, and are likely to displace human writers."
Besides the Nvidia case, MacKinnon is the representative plaintiff in three similar class actions filed in B.C. Supreme Court that individually name Meta/Facebook, Anthropic and Databricks Inc. as defendants. All four class actions will require court certification to move forward.
The class, or group, of plaintiffs described in each of the cases consists of all holders of Canadian copyright in works the companies used to build their LLMs.
Large language models are AI software designed to comprehend and generate language in a way that mimics a knowledgeable human.
World's largest company
According to the claim, Nvidia trained its LLMs on books it obtained in a copied dataset "...because it believed doing so would improve its model and give it an advantage over its competition."
"NVIDIA monetized the NVIDIA LLMs by using them to assist in the growth and development [of the] company's position in the AI industry, which has in turn led to NVIDIA's growth into the world's largest company by market capitalization," it says.

Nvidia has a market capitalization of $4.28 trillion.
"Collectively as Canadian writers, we're certainly the David against the Goliath in this case," said MacKinnon. "These are the most powerful and richest corporations in the world that we're up against. I don't think we have any reason to think that the fight will be an easy one."
CBC contacted Nvidia, but a company spokesperson declined to comment.
The claims also allege the four companies took steps to conceal copyright infringement by training the LLMs to respond in a "misleading way" when asked if copyrighted material was used in the LLMs' creation.
Additionally, the claims say the companies removed copyright management information before the books were fed into the LLMs "...so that the LLM did not itself learn that it was built off copyrighted material."
"If that proves to be true in court, I hope that the courts will consider that cause not only for writers to be compensated, but for the companies to be punished for bad behaviour," said MacKinnon.

A judge in San Francisco hearing a similar case brought by authors against Anthropic sided last month with the AI company, ruling that training LLMs on purchased copyrighted books qualifies as "extremely transformative" under the legal definition of fair use.
However, the judge did say that Anthropic's use of millions of books it allegedly pirated was a separate issue to be considered.
A lawyer representing MacKinnon said the problem of AI companies using the original works of authors to build highly profitable products is an issue that's gaining attention worldwide.
"The goal of the companies is not to transform the world, it's to make money," said Reidar Mogerman.
"I think you can respect both the values of the copyright system and the ability of these companies to create these models.... It's just that you can't throw out one to create the other, especially when the thing you create is going to be a competitive threat to the work the authors did."