When Sarah Silverman sued artificial-intelligence titans OpenAI and Meta Platforms on July 7, her copyright lawsuits seemed to present a relatively straightforward allegation: These companies didn't secure Silverman's and other authors' permission before using their copyrighted works, including her 2010 autobiography The Bedwetter, which isn't okay, per these suits. Silverman is joined by two other authors, novelists Christopher Golden and Richard Kadrey; their civil complaints are seeking class-action status, which, if green-lit by the court, means that many, many more writers could take action against these companies.
Indeed, OpenAI's ChatGPT and Meta's artificial-intelligence projects rely on the mass trawling of books to learn language and generate text, the suits say. Silverman's suit contends that these AI projects didn't secure her or other authors' permission to use their works before inhaling them, violating intellectual-property law. The suits also claim that these AI systems gained access to the books by spurious means, using libraries of pirated texts. Or, as the suits' co-attorney Matthew Butterick puts it to Vulture, "Creators' work has been vacuumed up by these companies without consent, without credit, without compensation, and that's not legal."
Silverman claims that the text ChatGPT and Meta's systems generate is the very receipt proving they consumed her work. If they can spit out summaries of The Bedwetter and other copyrighted works, her suit contends, then these systems must have used pilfered books to do so. The proposed class action is asking for financial damages as well as "permanent injunctive relief" to stop these AI systems from gobbling down authors' work, and then using it to create text, without permission or payment.
While these are copyright cases, they might present an opportunity to learn more about the shadowy world of AI as litigation unfolds, and tech-wary catastrophizers such as this author might wonder whether their outcome could actually impact how AI operates. If a judge or jury decides that ChatGPT can't consume copyrighted material with abandon, will that ruling potentially limit what AI can do? Put another way: Could a lawsuit over The Bedwetter thwart a Skynet-like situation? Vulture spoke with Butterick, as well as two experts on law and AI, to learn more about what this litigation can and can't do. Neither OpenAI nor Meta immediately responded to requests for comment.
What exactly is AI doing with copyrighted books that's so bad?
Butterick, who co-leads the suits with attorney Joseph Saveri, says the core issue is that AI isn't just coming up with stuff that coincidentally happens to sound like The Bedwetter or other books; it's relying entirely on people's creations. "These artificial-intelligence systems, it's kind of an ironic name, because they are built entirely and exclusively on the work of human creators. All of these generative-AI systems rely on consuming massive quantities of human creative work, whether it's text for these language models or whether it's images for these AI image generators," he says. "That's how these neural networks function: They take in the training material, and what they do is they try to emulate it. When we talk about artificial intelligence, we have to understand where it's really from: It's human intelligence, [but] it's just been divorced from the creators." The lawsuits also claim that the books are gathered from sketchy online sources that don't have the green light to have them in the first place.
"I view this as an existential issue for creators," says Jacqueline Charlesworth, who repped publishers in litigation against the Internet Archive's library-like book-lending system. A judge ruled in March that the Archive's book-sharing setup violated copyright law. "What's going on right now is AI suddenly entered popular culture, and tools were made available to everyone, basically, and it seemed like an explosion overnight. Even though we know a lot of these models were being developed over time, [they] exploded." There's also the question of whether humans, be they authors or people generally, have a right not to be used by AI. "People really should have the right to opt out of having their works, their data, used in those models," Charlesworth argues.
What, if anything, will the suits tell us about AI?
One of the major concerns about AI is the secrecy around how, exactly, platforms like ChatGPT operate these days. AI is becoming more and more enmeshed with our daily lives, which means that a lack of specifics about these systems could prevent neutral parties from spotting problems or potential dangers, experts told Fast Company in March. For example, AI is known to reflect harmful biases, such as racial prejudice, that we see among humans. Without knowing exactly how a system learns or picks up potential biases, it's hard to address them, per University of Michigan-Dearborn professor Samir Rawashdeh.
The discovery phase of Silverman's lawsuits could potentially lift the veil on how these systems work. "A discourse that's been promoted and pushed by the AI companies themselves is that these systems are essentially magic black boxes and they learn like a human, and there are all these metaphors that are thrown out to essentially dissuade people from scrutinizing how they work," Butterick says. "And by doing so, they are trying to insulate themselves from any kind of legal inquiry. That's part of what these cases are about: Let's open the black box. Let's see what's inside."
Charlesworth voices similar sentiments about AI proceedings. "We are going to learn a lot more about exactly how the models work and what the training data is," she notes. "There's not a lot of transparency there, and I think, particularly if your model is based on pirated books already, that's a huge red flag. You're copying books without permission. That's infringing."
Could Silverman win?
It's impossible to predict the outcome of any lawsuit. But there's some doubt that the case will be a slam dunk for authors, thanks to a landmark case involving Google Books nearly a decade ago. In 2016, the U.S. Supreme Court let stand a lower-court ruling that Google Books' practice of scanning texts, and showing excerpts to users, didn't violate copyright law, according to the Associated Press. Deven Desai, a professor of business law and ethics at Georgia Institute of Technology, says the law presently permits the use of books to train software. Desai notes that the Google Books case established "the ability to use books in transformative ways, including creating snippets and training software in that sense, for machine learning," meaning machines can, under current law, use books to learn.
As for a copyright case staving off an AI revolution? "It's not really about GPT systems taking over [the world], but about whether they have to pay for their training data," Desai says. If OpenAI didn't buy copies of the books, it probably just should have. Perhaps the pen won't be mightier than the sword in a robot war after all.