Generative AI vs. Copyright

From Publishers Weekly:

The balance between copyright and free speech is being challenged by generative AI (GAI), a powerful and enigmatic tool that mimics human responses to prompts entered into an internet search box. The purpose of copyright law, according to the U.S. Constitution, is “to promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” The problem is that GAI’s ability to incentivize progress and innovation threatens the entertainment industry’s dependence on copyright to protect creative works.

Copyright law strikes a balance between those who create content and the public’s interest in having wide access to that content. It does this by granting authors a limited monopoly over the dissemination of original works, giving them the exclusive right to reproduce, distribute, and create derivative works based on copyrighted material. However, the concept of exclusive rights doesn’t really apply to artificially intelligent robots and computers scraping ideas and facts from public websites.

Because copyright does not protect ideas, facts, procedures, concepts, principles, or discoveries described or embodied in works, copying alone doesn’t constitute copyright infringement. To prove copyright infringement, one must prove that the defendant had access to the copyrighted work and that the defendant’s work is substantially similar to protected aspects of the first work.

For AI output to infringe upon a book, it must have taken a substantial amount of copyrightable expression from the author’s work. When it comes to text, GAI is an artful plagiarist. It knows how to dance around copyright. The predictive model emulates, it doesn’t copy. Insofar as text generated in response to a prompt is not substantially similar—a legal term of art—to the data it is scraping, it is not an infringement. In other words, don’t overestimate the value of litigation.

The fair-use doctrine is another limitation on the exclusive rights of authors. Its purpose is to avoid the rigid application of copyright law in ways that might otherwise stifle the growth of art and science. Fair use is highly fact-specific, which is another way of saying it’s a murky and contentious area of the law.

Several cases decided before the advent of GAI suggest fair use encompasses the ingestion and processing of books by GAI. For example, in 2015, in Authors Guild v. Google, the court ruled that Google’s digitizing of books without consent to create a full-text searchable database that displayed snippets from those titles was a transformative use that served a different purpose and expression than the original books.

Fair use favors transformative uses. However, over time, the concept evolved from using a protected work as a springboard for new insights or critiquing the original to taking someone else’s photographs or other images and including them in a painting and declaring it a fair use.

In 2023, in Andy Warhol Foundation for the Visual Arts v. Goldsmith, the U.S. Supreme Court held that the claim to fairness is severely undermined “where an original work and copying use share the same or highly similar purposes, or where wide dissemination of a secondary work would otherwise run the risk of substitution for the original or licensed derivatives of it.” AI-generated works can devalue human-created content, but is that the kind of economic harm contemplated in the Supreme Court’s decision?

To sum up, on a case-by-case basis, courts must determine if substantial similarity exists and then engage in line drawing—balancing free expression and the rights of creators.

. . . .

In an age of disinformation, an author’s brand, a publisher’s imprint, and the goodwill associated with them are valuable assets. I believe the industry is less vulnerable than many think. But, to quote Nick Lowe, “Where it’s goin’ no one knows.”

Link to the rest at Publishers Weekly

PG notes that the author of the OP is an attorney, so he will cut and paste his disclaimer from the post he just published so that no one who reads only this TPV post will be misled.

PG notes that nothing you read on TPV constitutes legal advice. If you want legal advice, you need to hire a lawyer, not read a blog post.

PG will also note that the OP includes some other suggestions by the author, who is an attorney, which you may want to consider, but hire your own lawyer because, just like PG, the author of the OP is not your attorney and isn’t giving legal advice by writing an article for Publishers Weekly.

2 thoughts on “Generative AI vs. Copyright”

  1. I’m actually surprised at how many people think machine learning is theft and generative AI produces derivative works and that there is no case for fair use.

    How many people have cried foul at lexicographers for daring to produce a dictionary by going over a corpus of copyrighted works? Sure, defining an artistic element off of a limited dataset will definitely produce infringement, but defining it off of 20k+ images from different artists, mediums, styles, etc., then licensing that dictionary for others to generate from? That sure sounds like fair use and something we’ve already done in a less automated fashion.
