OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic

From Time Magazine:

ChatGPT was hailed as one of 2022’s most impressive technological innovations upon its release last November. The powerful artificial intelligence (AI) chatbot can generate text on almost any topic or theme, from a Shakespearean sonnet reimagined in the style of Megan Thee Stallion to complex mathematical theorems described in language a 5-year-old can understand. Within a week, it had more than a million users.

ChatGPT’s creator, OpenAI, is now reportedly in talks with investors to raise funds at a $29 billion valuation, including a potential $10 billion investment by Microsoft. That would make OpenAI, which was founded in San Francisco in 2015 with the aim of building superintelligent machines, one of the world’s most valuable AI companies.

But the success story is not one of Silicon Valley genius alone. In its quest to make ChatGPT less toxic, OpenAI used outsourced Kenyan laborers earning less than $2 per hour, a TIME investigation has found.

The work was vital for OpenAI. ChatGPT’s predecessor, GPT-3, had already shown an impressive ability to string sentences together. But it was a difficult sell, as the app was also prone to blurting out violent, sexist and racist remarks. This is because the AI had been trained on hundreds of billions of words scraped from the internet—a vast repository of human language. That huge training dataset was the reason for GPT-3’s impressive linguistic capabilities, but was also perhaps its biggest curse. Since parts of the internet are replete with toxicity and bias, there was no easy way of purging those sections of the training data. Even a team of hundreds of humans would have taken decades to trawl through the enormous dataset manually. It was only by building an additional AI-powered safety mechanism that OpenAI would be able to rein in that harm, producing a chatbot suitable for everyday use.
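
(That “decades” claim is easy to sanity-check with a back-of-envelope calculation. Every input below is an assumed round number for illustration, not a reported figure: hundreds of billions of words, an ordinary reading speed, and a few hundred full-time readers.)

    # Rough sanity check; all numbers are illustrative assumptions.
    corpus_words = 300e9            # "hundreds of billions of words"
    words_per_hour = 250 * 60       # reading at ~250 words per minute
    reviewers = 500                 # "a team of hundreds of humans"
    hours_per_year = 2000           # roughly one full-time work-year

    total_hours = corpus_words / words_per_hour          # ~20 million hours
    years = total_hours / (reviewers * hours_per_year)   # ~20 years
    print(f"about {years:.0f} years")                    # about 20 years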

To build that safety system, OpenAI took a leaf out of the playbook of social media companies like Facebook, which had already shown it was possible to build AIs that could detect toxic language like hate speech to help remove it from their platforms. The premise was simple: feed an AI labeled examples of violence, hate speech, and sexual abuse, and that tool could learn to detect those forms of toxicity in the wild. That detector would be built into ChatGPT to check whether it was echoing the toxicity of its training data, and filter it out before it ever reached the user. It could also help scrub toxic text from the training datasets of future AI models.
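
The “labeled examples in, detector out” premise is easy to illustrate in miniature. Below is a toy sketch in Python using scikit-learn; the four training strings, the labels, the threshold, and the function names are all invented for illustration, and nothing here reflects OpenAI’s actual (non-public) detector.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy labeled examples (invented): 1 = toxic, 0 = benign.
    texts = [
        "i will hurt you",
        "people like you are worthless",
        "have a wonderful day",
        "the weather is lovely today",
    ]
    labels = [1, 1, 0, 0]

    # Learn a bag-of-words detector from the labeled examples.
    detector = make_pipeline(TfidfVectorizer(), LogisticRegression())
    detector.fit(texts, labels)

    def screen_reply(candidate: str, threshold: float = 0.5) -> str:
        """Withhold a generated reply that the detector scores as toxic."""
        p_toxic = detector.predict_proba([candidate])[0][1]
        return "[withheld by safety filter]" if p_toxic >= threshold else candidate

The same detector, pointed at raw documents instead of model outputs, is what would “scrub toxic text from the training datasets,” as the article puts it; either way, its quality is bounded by the quality of the human labels it learns from.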

To get those labels, OpenAI sent tens of thousands of snippets of text to an outsourcing firm in Kenya, beginning in November 2021. Much of that text appeared to have been pulled from the darkest recesses of the internet. Some of it described, in graphic detail, situations including child sexual abuse, bestiality, murder, suicide, torture, self-harm, and incest.

OpenAI’s outsourcing partner in Kenya was Sama, a San Francisco-based firm that employs workers in Kenya, Uganda and India to label data for Silicon Valley clients like Google, Meta and Microsoft. Sama markets itself as an “ethical AI” company and claims to have helped lift more than 50,000 people out of poverty.

The data labelers employed by Sama on behalf of OpenAI were paid a take-home wage of between around $1.32 and $2 per hour depending on seniority and performance. For this story, TIME reviewed hundreds of pages of internal Sama and OpenAI documents, including workers’ payslips, and interviewed four Sama employees who worked on the project. All the employees spoke on condition of anonymity out of concern for their livelihoods.

The story of the workers who made ChatGPT possible offers a glimpse into the conditions in this little-known part of the AI industry, which nevertheless plays an essential role in the effort to make AI systems safe for public consumption. “Despite the foundational role played by these data enrichment professionals, a growing body of research reveals the precarious working conditions these workers face,” says the Partnership on AI, a coalition of AI organizations to which OpenAI belongs. “This may be the result of efforts to hide AI’s dependence on this large labor force when celebrating the efficiency gains of technology. Out of sight is also out of mind.” (OpenAI does not disclose the names of the outsourcers it partners with, and it is not clear whether OpenAI worked with other data labeling firms in addition to Sama on this project.)

. . . .

One Sama worker tasked with reading and labeling text for OpenAI told TIME he suffered from recurring visions after reading a graphic description of a man having sex with a dog in the presence of a young child. “That was torture,” he said. “You will read a number of statements like that all through the week. By the time it gets to Friday, you are disturbed from thinking through that picture.” The work’s traumatic nature eventually led Sama to cancel all its work for OpenAI in February 2022, eight months earlier than planned.

. . . .

Documents reviewed by TIME show that OpenAI signed three contracts worth about $200,000 in total with Sama in late 2021 to label textual descriptions of sexual abuse, hate speech, and violence. Around three dozen workers were split into three teams, one focusing on each subject. Three employees told TIME they were expected to read and label between 150 and 250 passages of text per nine-hour shift. Those snippets could range from around 100 words to well over 1,000. All of the four employees interviewed by TIME described being mentally scarred by the work. Although they were entitled to attend sessions with “wellness” counselors, all four said these sessions were unhelpful and rare due to high demands to be more productive at work. Two said they were only given the option to attend group sessions, and one said their requests to see counselors on a one-to-one basis instead were repeatedly denied by Sama management.

Link to the rest at Time

PG wonders if there isn’t a better way to engineer an AI product so that it identifies and avoids toxic documents when building its training database.
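
A naive version of that idea, screening documents before they ever enter the training set, is simple enough to sketch. Everything below (the blocklist, the threshold, the helper names) is a made-up illustration, not anyone’s production pipeline:

    # Crude document-level pre-filter: score each document against a
    # blocklist and drop anything over a threshold before training.
    BLOCKLIST = {"placeholder_slur", "placeholder_graphic_term"}

    def toxicity_score(document: str) -> float:
        """Fraction of whitespace-separated tokens found on the blocklist."""
        tokens = document.lower().split()
        if not tokens:
            return 0.0
        return sum(token in BLOCKLIST for token in tokens) / len(tokens)

    def clean_corpus(documents: list[str], threshold: float = 0.01) -> list[str]:
        """Keep only documents scoring below the threshold."""
        return [doc for doc in documents if toxicity_score(doc) < threshold]

The trouble is that keyword filters are blunt instruments: they miss toxicity phrased in ordinary words and over-block benign discussion of sensitive topics. That pushes engineers toward learned classifiers, and training those classifiers is exactly the human labeling work the TIME story describes; the labor does not disappear, it just moves upstream.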

He also thinks this is an example of when Sili Valley’s long-standing motto, “Move fast and break things,” should be tempered by some adult judgment on the part of someone with authority in the organization.

One of the oldest bits of business advice is, “Know your suppliers.” Evidently, the management at OpenAI all missed the class where that was discussed.

PG notes that his brothers and sisters of the bar are not immune to the “smart people doing dumb things” behavior pattern – see, for example, Why Toxic Culture Is To Blame For Women Leaving Law Firms.

7 thoughts on “OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic”

  1. More context:

    https://www.forbes.com/sites/jamesconca/2018/09/26/blood-batteries-cobalt-and-the-congo/?sh=21ee8c8acc6e

    “With support from the Pulitzer Center, Walt and Meyer focused on the lives of the poor laborers in this former Belgian colony, especially children, and how their exploitation is making our lives easier.

    Especially one child named Lukasa, who gets up at 5 AM to work a 12-hour day for less than $9, hacking at ore by hand and carrying it on his back to a trading post an hour’s trek from the mine, before starting his two-hour walk back home.”

    Walt describes how the multibillion-dollar industry, which has made some people outside Africa really, really rich, is unknown to workers like Lukasa. He just sells his haul to Chinese traders, who have seen their profits increase 400% over the last two years.

    “Unfortunately, the Democratic Republic of Congo is pervaded by conflict, poverty and corruption. The country’s economy is completely dependent on mining. Many poor families are completely dependent on their children working the mines. That $9/day is hard for a child to reject.

    “Competing jobs pay even less, and there are few of those.”

    https://m.youtube.com/watch?v=JcJ8me22NVs&embeds_euri=https%3A%2F%2Fwww.bing.com%2F&feature=emb_imp_woyt

    This kind of exploitation will end in a few years: there are new chemistries being developed to replace cobalt with more common materials like sodium. It won’t help kids in Congo, but the first-world folks will sleep easier knowing their reduced “carbon footprint” doesn’t come from exploiting kids.

    This too is life in the 21st century.

  2. Context:

    https://salariesplus.com/minimum-wage-in-kenya/#:~:text=Average%20wage%20income%20in%20Kenya%20The%20average%20wage,sector%2C%20but%20the%20informal%20sector%20is%20much%20larger.

    “The average wage income in Kenya is KSh684,097, which is roughly the equivalent of $230 per month. This figure relates to the formal sector, but the informal sector is much larger. According to the Kenya National Bureau of Statistics, approximately 39% of Kenyan workers are paid less than this figure, and two-thirds of the workforce is unemployed. However, it is important to note that this figure does not reflect the real situation, as the informal sector employs the majority of the country’s population.”

    Also:

    The overall contract with Sama was worth $200,000, and it stipulated that OpenAI would pay “an hourly rate of $12.50 to Sama for the work, which was between six and nine times the amount Sama employees on the project were taking home per hour.”

    Later, Sama began to pilot a new project for OpenAI unrelated to ChatGPT. However, instead of text this time, it was imagery, including some illegal under US law, such as child sexual abuse, bestiality, rape, sexual slavery, and death and violence. Again, workers were to view and label the content so that OpenAI’s systems could filter out such things.

    Sama, however, canceled all work with OpenAI soon after. The company says it did so out of concern about the content (and the legal issues). However, employees were told that TIME’s earlier investigation into Sama’s work for Facebook, where Kenyan employees were used as content moderators, prompted the turnaround because of the bad PR the company was receiving.

    Of course, while Sama is out of the picture for OpenAI, the work must continue, and it’s unclear who else is doing it.

    https://www.windowscentral.com/microsoft/the-human-cost-behind-chatgpt-is-worse-than-you-think

    My guess is the video work will end up in Thailand.

    Planet earth is not a nice place.
    Or, as they say, “never watch sausage being made.”
