Getty Images is suing the creators of AI art tool Stable Diffusion for scraping its content

From The Verge:

Getty Images is suing Stability AI, creators of popular AI art tool Stable Diffusion, over alleged copyright violation.

In a press statement shared with The Verge, the stock photo company said it believes that Stability AI “unlawfully copied and processed millions of images protected by copyright” to train its software and that Getty Images has “commenced legal proceedings in the High Court of Justice in London” against the firm.

Getty Images CEO Craig Peters told The Verge in an interview that the company has issued Stability AI with a “letter before action” — a formal notification of impending litigation in the UK. (The company did not say whether legal proceedings would take place in the US, too.)

“The driver of that [letter] is Stability AI’s use of intellectual property of others — absent permission or consideration — to build a commercial offering of their own financial benefit,” said Peters. “We don’t believe this specific deployment of Stability’s commercial offering is covered by fair dealing in the UK or fair use in the US. The company made no outreach to Getty Images to utilize our or our contributors’ material so we’re taking an action to protect our and our contributors’ intellectual property rights.”

When contacted by The Verge, a press representative for Stability AI, Angela Pontarolo, said the “Stability AI team has not received information about this lawsuit, so we cannot comment.”

The lawsuit marks an escalation in the developing legal battle between AI firms and content creators for credit, profit, and the future direction of the creative industries. AI art tools like Stable Diffusion rely on human-created images for training data, which companies scrape from the web, often without their creators’ knowledge or consent. AI firms claim this practice is covered by laws like the US fair use doctrine, but many rights holders disagree and say it constitutes copyright violation. Legal experts are divided on the issue but agree that such questions will have to be decided for certain in the courts. (This past weekend, a trio of artists launched the first major lawsuit against AI firms, including Stability AI itself.)

Getty Images CEO Peters compares the current legal landscape in the generative AI scene to the early days of digital music, where companies like Napster offered popular but illegal services before new deals were struck with license holders like music labels.

“We think similarly these generative models need to address the intellectual property rights of others, that’s the crux of it,” said Peters. “And we’re taking this action to get clarity.” 

Although the creators of some AI image tools (like OpenAI) refuse to disclose the data used to create their models, Stable Diffusion’s training dataset is open source. An independent analysis of the dataset found that Getty Images and other stock image sites constitute a large portion of its contents, and evidence of Getty Images’ presence can be seen in the AI software’s tendency to recreate the company’s watermark.

Although companies like Stability AI deny any legal or ethical hazard in creating their systems, they have still begun making concessions to content creators. Stability AI says artists will be able to opt-out of the next version of Stable Diffusion, for example. In a recent tweet about the company’s training datasets, Stability AI CEO Emad Mostaque said “I believe they are ethically, morally and legally sourced and used,” before adding: “Some folks disagree so we are doing opt out and alternate datasets/models that are fully cc.”

The full details of Getty Images’ lawsuit have not yet been made public, but Peters said that charges include copyright violation and violation of the site’s terms of service (in particular, web scraping). Andres Guadamuz, an academic specializing in AI and intellectual property law at the UK’s University of Sussex, told The Verge it seemed like the case would have “more merit” than other existing AI lawsuits, but that “the devil will be in the details.”

Link to the rest at The Verge

PG’s understanding is that AI art generators don’t keep any copies of the images they use; rather, each image is quickly analyzed and a mathematical hash is created from it.

If PG’s understanding is correct, Stability AI used an open-source dataset and, perhaps, kept a copy of the original photos/artwork on its servers afterwards.

If this is the case, PG thinks it was stupidity on the part of management at Stability AI to keep a copy after creating hashes from it. While he thinks it would still qualify as fair use under US copyright law, keeping a literal copy of the photos after processing them strengthens Getty’s case.
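The “hash” PG describes can be illustrated with a minimal Python sketch (illustrative only; `image_fingerprint` is a hypothetical name, and an ordinary cryptographic digest stands in for whatever analysis a real training pipeline performs). The relevant property is that a hash is one-way: an image of any size is reduced to a short, fixed-length digest from which the original pixels cannot be recovered, so retaining the source files afterwards is a separate choice.

```python
import hashlib

def image_fingerprint(image_bytes: bytes) -> str:
    # SHA-256 produces a fixed-length, one-way digest: the original
    # image data cannot be reconstructed from the 64 hex characters.
    return hashlib.sha256(image_bytes).hexdigest()

# Stand-in for a large image file (not real PNG data).
fake_image = b"\x89PNG" + b"\x00" * 1_000_000

digest = image_fingerprint(fake_image)
print(len(digest))  # 64, regardless of input size
```

Whether a diffusion model’s trained weights are legally analogous to such a digest is, of course, exactly the sort of question this lawsuit may test.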

Getty is suing in the UK because Stability AI is located there, so PG’s comments, based on his understanding of US copyright laws, may not cover differences between UK and US laws governing this matter. He will note that both the UK and US have entered into the two major international copyright agreements – the Universal Copyright Convention (UCC), adopted in Geneva, Switzerland, in 1952, and the Berne Convention for the Protection of Literary and Artistic Works (Berne), adopted in 1886.

Getty is also aggressive in suing for improper use of photos in their collection, even if the photos are in the public domain and, thus, not protected by copyright. See here for one relatively recent example. And here for another take on the same facts.

PG is going to be following this lawsuit as it progresses and would welcome hearing from visitors to TPV if they find any interesting pieces discussing the dispute – Contact PG at the top of TPV will let you send him an email.

3 thoughts on “Getty Images is suing the creators of AI art tool Stable Diffusion for scraping its content”

  1. Stability’s “defense” will essentially be that (a) our kept-on-our-servers reference images are not any kind of substitute for or competitor of Getty’s catalog, but are merely used (b) to ensure validity in the event of algorithmic or compatibility changes (e.g., everyone suddenly jumps to 256-bit storage indexes and operating systems, which would change a hash).

    That’s a very US-oriented defense, presuming that the standard is “fair use.” It’s not; it’s “fair dealing,” a less-forgiving-of-a-reuser standard (specifically because, unlike the US fair use standard, the amount of effort expended on the original/source remains a relevant consideration; US law rejects “sweat of the brow” (Feist)). I say this because I’ve seen the exact same misguided defense put forth repeatedly in UK courts, and it is not a winner.

    And “transformative use” won’t help in the UK, either, because that’s also a theory of fair use that has little standing under a fair dealing standard.

    So much as I grumble at it, I have to say Getty has the better argument. My grumbling is because of rampant, but unrelated, misconduct by Getty regarding both copyrightability of vast swaths of its “collection” and misuse of its dominant market position in certain segments. It’s sort of like music piracy cases — my revulsion for piracy overcomes my disdain for standard industry practices.

  2. What caught my eye: “Stable Diffusion’s training dataset is open source. An independent analysis of the dataset found that Getty Images and other stock image sites constitute a large portion of its contents, and evidence of Getty Images’ presence can be seen in the AI software’s tendency to recreate the company’s watermark.”

    If true, that would smell of gunpowder.

    • Watermarks? I once read about some guys who held up a liquor store while wearing team jackets. Jackets with their names on the back.
