Meta ‘discussed buying publisher Simon & Schuster to train AI’

From The Guardian:

Staff at technology company Meta discussed buying publishing house Simon & Schuster last year in order to procure books to train the company’s artificial intelligence tools, it has been reported.

According to recordings of internal meetings shared with the New York Times, managers, lawyers and engineers at Meta met on a near-daily basis between March and April 2023 to discuss how it could get hold of more data to train AI models. From the recordings, which were shared by an employee of the Mark Zuckerberg-owned company that owns Facebook and Instagram, the New York Times found that staff had discussed buying Simon & Schuster and some had debated paying $10 per book for the licensing rights to new titles.

Simon & Schuster is one of the English-speaking world’s major book publishing houses and is part of what is referred to as the “Big Five”, along with Penguin Random House, HarperCollins, Hachette and Macmillan. Simon & Schuster’s authors include Stephen King, Colleen Hoover and Bob Woodward.

In March 2020, Paramount Global, the parent company of Simon & Schuster, announced its intention to sell the publisher. After a much-criticised planned merger with Penguin Random House was blocked by US courts, Simon & Schuster was eventually sold to private equity firm KKR in August 2023.

According to the recordings, Ahmad Al-Dahle, Meta’s vice president of generative AI, told executives that the company had used almost every book, poem and essay written in English available on the internet to train models, so was looking for new sources of training material.

Employees said they had used these text sources without permission and talked about using more, even if that would result in lawsuits. When a lawyer flagged “ethical” concerns about using intellectual property, they were met with silence.

. . . .

Maria A Pallante, president of the Association of American Publishers, does not believe that Simon & Schuster would have agreed to such a sale. “The fact that Meta sought to purchase one of the most important publishing houses in American history in order to ingest its venerable catalogue for AI profits is puzzling even for Big Tech,” she said. “Did Meta plan to trample the primary mission of Simon & Schuster, and its contractual partnerships with authors, by sheer power?”

Link to the rest at The Guardian

As the OP makes clear, Simon & Schuster is anything but an independent company. America’s third-largest trade publisher was owned by Paramount Global, which sold it to KKR in 2023 for $1.62 billion. Meta’s market value at the end of 2023 was almost $500 billion. Its revenue for the year was almost $135 billion, and its profit margin was 28.98%.

Purchasing S&S to provide content for Meta’s AI would probably have been a smart move because S&S’s value as a publisher was far less than its value as one of many content providers to prime Meta’s AI program.

As PG has opined on previous occasions, he questions the claims of more than a few traditional publishers that using their books to train an AI system somehow violates the copyrights of the authors they publish. In part, this is because of PG’s understanding that creating an AI does not create copies of the original work. Regardless of the query, an AI program won’t produce a copy of any book or other document for the person making the query.

PG is happy to receive information—likely originating with an experienced copyright attorney or long-time law school professor—that indicates his opinion regarding no copyright infringement in the creation of an AI is incorrect.

1 thought on “Meta ‘discussed buying publisher Simon & Schuster to train AI’”

  1. PG, I believe that you are absolutely correct. Writing a prompt of “write a novel about a huge white whale and a ship captain obsessed with killing it” will never produce Moby Dick.

    However, with several hundred hours of human time, and a few dozen novels’ worth of prompts, this would probably be possible.

    Faster (and certainly less odoriferous and noisy) than chaining ten thousand monkeys to ten thousand typewriters – but still just as much a waste of time and effort! (Since it is already on Gutenberg…)
