Four times more male characters in literature than female, research suggests

From The Guardian:

Researchers using AI technologies have discovered that male characters are four times more prevalent in literature than female characters.

Mayank Kejriwal at the University of Southern California’s Viterbi School of Engineering was inspired by work on gender biases and his own work on natural language processing to carry out the experiment.

Kejriwal and fellow researcher Akarsh Nagaraj used data from 3,000 books that are part of Project Gutenberg, across genres including adventure, science fiction, mystery and romance.

The study used Named Entity Recognition (NER) to identify gender-specific characters by looking at things including female and male pronouns. The researchers also examined how many female characters were main characters.
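To make the pronoun-based idea concrete, here is a minimal sketch in Python. It is not the researchers' method: the article says they used Named Entity Recognition, while this example only tallies gendered pronouns as a crude proxy. The pronoun lists and the function names `count_gendered_pronouns` and `male_to_female_ratio` are assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical pronoun lists; the article does not give the study's actual lexicon.
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}
MALE_PRONOUNS = {"he", "him", "his", "himself"}


def count_gendered_pronouns(text):
    """Tally female and male pronouns in a piece of text."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        if token in FEMALE_PRONOUNS:
            counts["female"] += 1
        elif token in MALE_PRONOUNS:
            counts["male"] += 1
    return counts


def male_to_female_ratio(counts):
    """Return male/female pronoun ratio, or None if no female pronouns were found."""
    if counts["female"] == 0:
        return None
    return counts["male"] / counts["female"]


sample = "He told her that his brother had seen him. She nodded."
counts = count_gendered_pronouns(sample)
print(counts["male"], counts["female"], male_to_female_ratio(counts))
```

A count like this says nothing about how many distinct characters there are, and it cannot tell a singular "they" from a plural one, which is the limitation the researchers describe.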

“Gender bias is very real, and when we see females four times less in literature, it has a subliminal impact on people consuming the culture,” said Kejriwal. “We quantitatively revealed an indirect way in which bias persists in culture.”

But the researchers did face difficulties with those who didn’t fit into a gender binary. The AI was unable to figure out if “they” referred to a plural or a “non-dichotomous individual”.

Kejriwal said: “When we published the dataset paper, reviewers had this criticism that we were ignoring non-dichotomous genders. But we agreed with them, in a way. We think it’s completely suppressed, and we won’t be able to find many [transgender individuals or non-dichotomous individuals].”

As well as the statistics on male and female characters, the researchers also looked at the language associated with gender-specific characters. Nagaraj said: “Even with misattributions, the words associated with women were adjectives like ‘weak’, ‘amiable’, ‘pretty’ and sometimes ‘stupid’. For male characters, the words describing them included ‘leadership’, ‘power’, ‘strength’ and ‘politics’.”

Link to the rest at The Guardian

8 thoughts on “Four times more male characters in literature than female, research suggests”

  1. If these guys were using Project Gutenberg, then they were using works that are in the public domain, which means that this research is virtually useless for analyzing the present state of gender representation in literature.

  2. The massive bad faith and stupidity of stuff like this is stunning.

    More than that… I think that dwelling too much on this sort of insanity is bad for your health. One fears contamination. It’s as if one spent one’s time in a psycho ward for education and entertainment — hard to survive the experience with a sound mind.

    I’m only amazed that academic papers counting alphabetic letters in a corpus and looking for well-balanced diversity are not yet in evidence (at least, as far as I know…). All that prejudice against “x” while “e” wins all the best things in literary life.

  3. OK. The ratio is four to one. So what?

    Now, how about a hard-hitting study on adverbs, the Oxford comma, or any form of the verb “say”?

  4. If one were, for lack of better things to do, to take the OP seriously, one might ask if they ever thought to analyze romance novels. Methinks their ratios would run along entirely different lines.

    • Felix, the OP does say that they included works from the romance genre.

      But, as Tom so cogently noted, the study results are of extremely minor historical interest, at best, since anything included was from the 1950s and prior. (Except, maybe, for a few authors, such as H. Beam Piper, where the copyright was lost through stupidity.)

      Not that, observing the Amazon ads that I am bombarded with, there would probably be all that much difference. At least among the most promoted romances, I note that they have far more female POV, which is traditional – but the most popular appear to still be of the “manly man and womanly woman” characterizations.

      (By the way, they should have had no problem with “non-dichotomous” characters – I would make at least a small wager that there was a grand total of “zero” in their sample set.)

  5. Taking full advantage of advances in the science of gender dynamics, I decide on the genders of the characters in any book I read. I’m waiting for eBooks where I can choose the identities of the cast of characters, and all the pronouns are then properly aligned for my reading pleasure.
