PG has been looking into contemporary plagiarism over the past several days and will be writing more than one post about the topic.
The problem has several elements (PG has learned about five so far, and there may be more):
1. When Amazon and others permit an author or plagiarist to self-publish books around the world in a large number of languages, how does an author even discover that plagiarism is taking place?
2. College and university professors (and some high school teachers) are increasingly likely to screen student papers using plagiarism detection software – Turnitin is one of the most popular tools. Some time ago, students learned that copying and pasting a paper or segments of various papers they found online was an easy shortcut to creating a paper to turn in by a class deadline. Sometimes the online sources even included footnotes formatted in proper academic form. Plagiarism detection software is designed to catch such activities.
3. Where there are electronic plagiarism weapons, almost inevitably, there will be electronic or other defenses that prevent detection of plagiarism – paraphrasing the plagiarized information is one tactic that has been used since well before Turnitin came into being. For further information, see, for example, How to Beat Turnitin in 2019 and Get Away with It.
4. While many of the ways of beating academically-oriented plagiarism detection are focused on manipulating a student paper, other, more sophisticated computerized tools often referred to as “Spinners” or “Article Spinners” can be used to not only fool college plagiarism checkers, but also make it difficult for the author of a book to discover plagiarism and prove copyright infringement in court.
Article Spinners were developed in the period before Google’s search engine acquired the intelligence it has today.
The goal for some search engine optimizers was to generate as many pages as possible containing key words of interest to Google and, thus, advertisers. The spinners were created to substitute various synonyms for parts of an article on a topic. Thus, “good” in the original article would be changed to “great,” “super,” “excellent,” etc. Several different words would be spin-treated. Thus, one four-paragraph article on fishing lures could be spun into a thousand articles about fishing lures, each seeming to be a different page to Google. If someone was searching for fishing lures, Google would rank the site with a thousand articles about fishing lures higher than a site with one article.
Google has become smarter, so spinning doesn’t work there any more, but spinning software is still around and has reportedly become more sophisticated. Pour the text of a romance ebook into spinning software and out comes another romance that has a similar plot but different character names, places, descriptions, etc.
PG understands that the products of current spinning software require a significant amount of editing, but, if you’re planning to sell an 80,000-word romance, it’s a lot less work to do a quick copy edit than to write a book, develop characters, etc., from scratch.
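For readers curious about the mechanics, the synonym substitution described above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular spinning product works: the `SYNONYMS` table and the `spin` function are hypothetical names made up for this example, and a real spinner would presumably rely on a far larger thesaurus.

```python
import itertools
import re

# Hypothetical synonym table, invented for this illustration.
# The first entry in each list is the original word.
SYNONYMS = {
    "good": ["good", "great", "super", "excellent"],
    "big": ["big", "large", "huge"],
}


def spin(text, synonyms=SYNONYMS):
    """Yield every variant of text made by swapping listed words for synonyms."""
    # Split into words and the separators between them so that
    # spacing and punctuation are carried over unchanged.
    tokens = re.findall(r"\w+|\W+", text)
    # Each token offers either its synonym list or just itself.
    # Matching is case-insensitive; the original capitalization is not restored.
    choices = [synonyms.get(token.lower(), [token]) for token in tokens]
    # The Cartesian product enumerates every combination of choices.
    for combination in itertools.product(*choices):
        yield "".join(combination)


variants = list(spin("A good lure catches big fish."))
print(len(variants))  # 4 choices for "good" times 3 for "big"
for variant in variants[:3]:
    print(variant)
```

The count multiplies quickly: five swappable words with four options each already yield 4 × 4 × 4 × 4 × 4 = 1,024 versions, which is how a single short article becomes a thousand. The sketch also produces clumsy results such as “A excellent lure catches huge fish,” which hints at why spun text needs editing.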
5. Artificial Intelligence software has become more and more sophisticated in the past couple of years and no one expects progress to stop. And it is currently being used to write stories. Bloomberg generates about half of its articles about public companies and their latest earnings releases using artificial intelligence.
From Forbes magazine in February, 2019:
How do you know I am really a human writing this article and not a robot? Several major publications are picking up machine learning tools for content. So, what does artificial intelligence mean for the future of journalists?
According to Matt Carlson, author of “The Robotic Reporter”, the algorithm converts data into narrative news text in real-time.
Many of these being financially focused news stories since the data is calculated and released frequently. Which is why should be no surprise that Bloomberg news is one of the first adaptors of this automated content. Their program, Cyborg, churned out thousands of articles last year that took financial reports and turned them into news stories like a business reporter.
. . . .
Forbes also uses an AI called Bertie to assist in providing reporters with first drafts and templates for news stories.
The Washington Post also has a robot reporting program called Heliograf. In its first year, it produced approximately 850 articles and earned The Post an award for its “Excellence in Use of Bots” from its work on the 2016 election coverage.
. . . .
The LA Times is using AI to report on earthquakes based on data from the U.S. geological survey and also tracks homicide information on every homicide committed in the city of Los Angeles. The site created by the machine called “Homicide Report” utilizes a robot-reporter with the ability to write drafts of stories that include that includes: the victim’s gender and race, cause of death, officer involvement, neighborhood and year of death.
. . . .
The AP estimates that AI helps to free up about 20 percent of reporters’ time spent covering financial earnings for companies and can provide better accuracy. This gives reporters more time to concentrate on the content and story-telling behind an article rather than the fact-checking and research.
Link to the rest at Forbes
Contemporary artificial intelligence is leagues beyond article spinners, and detecting that the work of another author (or several other authors) was used as the source material for an AI writing romance or other types of book-length fiction or non-fiction may already be difficult or next to impossible.
PG is interested in this issue as it relates to copyright infringement in the 21st century and will have a few more posts.