Sarah Silverman Hits Stumbling Block in AI Copyright Infringement Lawsuit Against Meta

From The Hollywood Reporter:

A federal judge has dismissed most of Sarah Silverman’s lawsuit against Meta over the unauthorized use of authors’ copyrighted books to train its generative artificial intelligence model, marking the second ruling from a court siding with AI firms on novel intellectual property questions presented in the legal battle.

U.S. District Judge Vince Chhabria on Monday offered a full-throated denial of one of the authors’ core theories that Meta’s AI system is itself an infringing derivative work made possible only by information extracted from copyrighted material. “This is nonsensical,” he wrote in the order. “There is no way to understand the LLaMA models themselves as a recasting or adaptation of any of the plaintiffs’ books.”

Another of Silverman’s arguments that every result produced by Meta’s AI tools constitutes copyright infringement was dismissed because she didn’t offer evidence that any of the outputs “could be understood as recasting, transforming, or adapting the plaintiffs’ books.” Chhabria gave her lawyers a chance to replead the claim, along with five others that weren’t allowed to advance.

Notably, Meta didn’t move to dismiss the allegation that the copying of books for purposes of training its AI model rises to the level of copyright infringement.

The ruling builds upon findings from another federal judge overseeing a lawsuit from artists suing AI art generators over the use of billions of images downloaded from the Internet as training data. In that case, U.S. District Judge William Orrick similarly delivered a blow to fundamental contentions in the lawsuit by questioning whether artists can substantiate copyright infringement in the absence of identical material created by the AI tools. He called the allegations “defective in numerous respects.”

Some of the issues presented in the litigation could decide whether creators are compensated for the use of their material to train human-mimicking chatbots that have the potential to undercut their labor. AI companies maintain that they don’t have to secure licenses because they’re protected by the fair use defense to copyright infringement.

According to the complaint filed in July, Meta’s AI model “copies each piece of text in the training dataset” and then “progressively adjusts its output to more closely resemble” expression extracted from the training dataset. The lawsuit revolved around the claim that the entire purpose of LLaMA is to imitate copyrighted expression and that the entire model should be considered an infringing derivative work.

But Chhabria called the argument “not viable” in the absence of allegations or evidence suggesting that LLaMA, short for Large Language Model Meta AI, has been “recast, transformed, or adapted” based on a preexisting, copyrighted work.

Another of Silverman’s main theories, shared by other creators suing AI firms, was that every output produced by AI models is an infringing derivative, with the companies benefiting from every answer initiated by third-party users allegedly constituting an act of vicarious infringement. The judge concluded that her lawyers, who also represent the artists suing StabilityAI, DeviantArt and Midjourney, are “wrong to say that” evidence of substantially similar outputs isn’t necessary simply because their books were duplicated in full as part of the LLaMA training process.

Link to the rest at The Hollywood Reporter

6 thoughts on “Sarah Silverman Hits Stumbling Block in AI Copyright Infringement Lawsuit Against Meta”

  1. Speaking of writers suing over LLMs:

    https://www.msn.com/en-us/money/other/openai-and-microsoft-sued-by-nonfiction-writers-for-alleged-rampant-theft-of-authors-works/ar-AA1kjyvr

    “A new class action lawsuit was filed against OpenAI and Microsoft on Tuesday, alleging that the companies have trained AI chatbot ChatGPT and its later versions on copyrighted materials from nonfiction authors’ works and academic journals without their consent.”

    Academic journals, no less.
    I wonder how many were documenting taxpayer-funded “research.” Or who is actually funding the lawsuit.

  2. Copyright is granted for a purpose:

    “[the United States Congress shall have power] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.”

    Copyright is not absolute:

    “Limitations and exceptions to copyright are provisions, in local copyright law or the Berne Convention, which allow for copyrighted works to be used without a license from the copyright owner.

    Limitations and exceptions to copyright relate to a number of important considerations such as market failure, freedom of speech,[1] education and equality of access (such as by the visually impaired). Some view limitations and exceptions as “user rights”—seeing user rights as providing an essential balance to the rights of the copyright owners. There is no consensus among copyright experts as to whether user rights are rights or simply limitations on copyright. The concept of user rights has been recognised by courts, including the Canadian Supreme Court,[2] which classed “fair dealing” as such a user right. These kinds of disagreements in philosophy are quite common in the philosophy of copyright, where debates about jurisprudential reasoning tend to act as proxies for more substantial disagreements about good policy.”

    https://en.m.wikipedia.org/wiki/Limitations_and_exceptions_to_copyright

    Finally:

    “Notably, Meta didn’t move to dismiss the allegation that the copying of books for purposes of training its AI model rises to the level of copyright infringement.”

    First, the plaintiffs have to prove the books actually were copied.

    By not challenging the claim, the META lawyers don’t admit copying happened.
    Smart.

    The onus is on the plaintiffs to prove *their* books were copied, that the *product* infringes their copyright (violates fair use) and *doesn’t* advance the progress of science and “useful arts” (tech, in modern parlance) or education. As I’ve said before, the gist of these claims boils down to “somebody is making money off my books and it’s not me.” That by itself does not amount to a copyright violation.

    Anybody can make any allegation and they can file a lawsuit over anything.
    But filing doesn’t mean they have a worthy legal case.

    So far the lawsuits are being received skeptically because existing precedents on adjacent claims run counter to the allegations. If anything, the judges are being kind by giving them a chance to refile. The kindness won’t last forever.

    If the suits keep on piling up some judge is going to be less than kind.

  3. The more I learn about AI, the more I believe it is wrong for it to train on original artwork or writing without permission. For example, an author who creates an original world such as in a fantasy, with its own vocabulary and societal constructs, or who produces innovative ideas for murder mysteries or historical novels, may not immediately see copying. But these highly personal and original twists and turns will be made part of a general pool of material that will respond to user prompts without credit. Not all of these references will be so famous as to be readily identifiable. The process threatens to cheapen creativity and steal from authors (and I believe this is even more pronounced in the case of artists) on a mass scale.

    I realize this may seem like an overreaction to some. But as an example, I believe that as time passes we will stumble across, say, more and more books (fictional or not) with spiders that weave “Some Pig” into their webs, until future generations mostly forget that Charlotte was a specific fictional invention. Maybe that’s fine with some people. Not with me.

    • I guess that doesn’t break copyright to me because human authors pay homage to other authors without credit all the time. That’s where culture and cliche come from: this endless reuse, repackaging, and inspiration. Some works are practically direct replies to others, noticeable to anyone familiar with both and going right over the head of anyone not. That happens with or without AI.

    • You may be right, but if that is the case we should have new laws to prevent that.

      Copyright, and the purpose copyright was created for, does not seem to fit. Creative people thank others for being their inspiration all the time.

    • I suspect the author took language from the existing languages, and social constructs from the surrounding society. I also suspect the author read many other authors, and learned what a novel was from them, learned about characterization from them, and learned about plots from them.

      And the murder mystery? How many did the author read before claiming hers was an original creation? How did the author train herself without absorbing all kinds of things, including things covered by copyright?

      What author has written a book not based on the world in which she lives?
