From National Public Radio:
The New York Times and OpenAI could end up in court.
Lawyers for the newspaper are exploring whether to sue OpenAI to protect the intellectual property rights associated with its reporting, according to two people with direct knowledge of the discussions.
For weeks, the Times and the maker of ChatGPT have been locked in tense negotiations over reaching a licensing deal in which OpenAI would pay the Times for incorporating its stories in the tech company’s AI tools, but the discussions have become so contentious that the paper is now considering legal action.
The individuals who confirmed the potential lawsuit requested anonymity because they were not authorized to speak publicly about the matter.
A lawsuit from the Times against OpenAI would set up what could be the most high-profile legal tussle yet over copyright protection in the age of generative AI.
A top concern for the Times is that ChatGPT is, in a sense, becoming a direct competitor with the paper by creating text that answers questions based on the original reporting and writing of the paper’s staff.
It’s a fear heightened by tech companies using generative AI tools in search engines. Microsoft, which has invested billions into OpenAI, is now powering its Bing search engine with ChatGPT.
If, when someone searches online, they are served a paragraph-long answer from an AI tool that refashions reporting from the Times, the need to visit the publisher’s website is greatly diminished, said one person involved in the talks.
So-called large language models like ChatGPT have scraped vast parts of the internet to assemble data that inform how the chatbot responds to various inquiries. The data-mining is conducted without permission. Whether hoovering up this massive repository is legal remains an open question.
If OpenAI is found to have violated any copyrights in this process, federal law allows for the infringing articles to be destroyed at the end of the case.
Link to the rest at National Public Radio
As PG has mentioned on a couple of previous occasions, he has doubts about copyright infringement claims like the one the Times is considering because, to the best of PG’s knowledge, no AI stores the original copyrighted works or is capable of reproducing them.
Instead, the contents of the Times plus a huge number of other texts are used to train the AI model, then deleted after training is complete. The AI uses what it learned from the ingested texts to form an understanding of their meanings, then uses that understanding to create new expressions of knowledge as needed to respond to the wide range of queries and commands that individual users submit.
PG doesn’t think the AI can ever recreate the words of the original Times stories. The AI uses the information it has ingested to create new responses to tasks individual users want it to perform.
The analogy PG thinks is correct is what happens when he reads a story in the Times or elsewhere, then uses that knowledge to answer questions posed by others or to create other writings that don’t replicate the original Times articles and may include ideas, facts, etc., that he has picked up during his extensive reading of a large collection of articles from a great many sources.