Audiobooks Are Thriving, but Could AI Take Over?

From CNet:

Stomachs gurgle. That’s normal. Sometimes, if there’s a mic nearby, those burbles and gurgles get picked up.

AI audiobook narrators don’t have to worry about strange gastrointestinal noises, but Leah Allers and engineer Craig Hinkle aren’t bots. They’re human beings, recording for Nashville Audio Productions in mid-January, fretting about gurgles, discussing where to put the emphasis on the word “increase” and tending to the detailed work of giving a “real” voice to a book about how couples communicate. 

NAP’s studio is at the Rukkus Room in Nashville, Tennessee, the same place Taylor Swift recorded her seven-time platinum, self-titled debut album. The smell of coffee permeates the waiting room. Hinkle is tuned in to every word coming out of Allers’ mouth, glancing from an iPad with the book’s text to a large monitor sitting on the soundboard in the studio.

“I want to get some more emotions in these questions,” Allers tells Hinkle before restarting a section of a chapter. 

Audiobooks are booming. The market is expected to hit $33.5 billion by 2030, up from about $4.2 billion in 2021, according to Acumen Research and Consulting. Whether this is an offshoot of the rise in popularity of podcasts, a matter of listening convenience or a byproduct of the pandemic, it hasn’t escaped the attention of tech companies and the inevitable creep of artificial intelligence. 

. . . .

Tech companies including Apple and Google have been working on AI audiobook narration for a while now. In 2022, Google rolled out its services to publishers in six countries, including the US and Canada. Google’s AI narrators have names like Archie, who sounds British, and Santiago, who speaks Spanish. In early January, Apple introduced a stable of AI voices, with names like Madison and Jackson, that authors and indie publishers selling their books on Apple Books can tap to read genres from nonfiction to romance.

The increasing presence of AI in audiobook narration has human narrators like Tanya Eby in various stages of stress. 

“I don’t know if, in five years, this will be my full-time gig anymore,” said Eby, a Grand Rapids, Michigan-based narrator who’s recorded more than 1,000 books in the last 21 years.

Narrators like Eby say their humanity is exactly what helps them do their jobs. Particularly with fiction, narrators make decisions about everything from a character’s voice to how to communicate nuance and emotion in a way that mirrors the story. 

“If a character is sobbing after the death of their father, I have to convey those tears and gasps in her speech,” said Kathleen Li, an Austin, Texas-based narrator.

Narrators describe the intimacy of being a voice in a listener’s ear, and wonder if even the most lifelike AI will fall into the uncanny valley. The danger, they worry, is disrupting the experience.

AI voices can range from stilted to quite convincing. But even the most fluid can set off those uncanny valley tripwires with a delivery or pacing that sounds off. 

Link to the rest at CNet and thanks to F. for the tip.

PG claims no expertise in audiobooks although he has listened to several, generally on long trips in the car with Mrs. PG.

That said, his understanding is that an audiobook narrator doesn’t interpret the book – give a performance like a voice actor does – but rather provides a pleasant narrative that doesn’t intrude into the story being experienced by the reader/listener.

From Gravy for the Brain:

What Is Voice Over?

Voice over, also known as voice acting, is part art, part perspiration and a whole lot of practice. In this post, we are going to give you an insight into the amazing, exciting and fun world of voice acting and becoming a voice-over artist.

When we think about what voice acting is, we often hit the first problem. People don’t realise how often they hear voice acting in their everyday lives.

Voice acting is extremely varied, so let’s first of all establish: “what is voice over?”

It is commonly believed that the first voiceover was created by Walt Disney for Mickey Mouse in “Steamboat Willie.” Although this was in 1928, in reality, the first voice-over was performed in 1900! This historical first belongs to Reginald Fessenden, a Canadian inventor. He was thrilled with Alexander Graham Bell’s new device, the telephone, and set out to create a way to remotely communicate without wires. The beginning of “Wireless!”

In 1900, working for the United States Weather Bureau, Fessenden recorded the very first voice over: reporting the weather.

It is generally accepted that he was the first voice on the radio. In Boston, in 1906, during the Christmas season, he recorded an entire program of music, Bible texts, and Christmas messages to ships out at sea.

What is voice over acting then?

Well, as communications developed, voice acting became more common in radio, animated cartoons, etc. The actors behind those voices were rarely known by the public with perhaps the exception of the eponymous Mel Blanc, a radio personality and comedian. He became known as “The Man of 1000 Voices” for his versatility and is the voice on many cartoons that were made and distributed by Warner Brothers.

One of the most influential and prolific voice-over artists of all time is not commonly known by the public, but very well known in the industry. This is Don LaFontaine, who began voice acting in 1962, recording VO for a movie trailer.

He became the voice of movie trailers and the sound of the cinema for a generation of moviegoers, setting the gold standard for how they were written and voiced.

While voice-over acting has grown into a recognised career path, it remains unseen and largely unknown by most people. Most voice-over work is still done by classically trained actors who often use voice acting to fill gaps between jobs. However, voice acting is increasingly getting noticed and gaining recognition as a true performance art and a profession in its own right.

Famous actors have gained huge amounts of publicity from box-office animation successes such as those produced by Pixar and Disney. Actors like Liam Neeson have essentially played leading roles in films through their voice; he was Aslan, the lion, in the Narnia series. People now expect well-known actors to be in animated films. Of course, there are other benefits. Studios can use the names of the stars that appear in the animated films to globally promote these films.

. . . .

Voice Over Announcers can be heard introducing segments of live television or radio broadcasts such as award shows, talk shows, continuity, promos and sporting events.
Voice Over Narrators often specialise in audiobooks, documentaries, explainer videos, educational videos, business videos, medical videos and act as audio tour guides.
Voice Actors are heard performing in animated movies, TV cartoons, radio dramas, ADR, video games, puppet shows and in foreign language dubbing.
Voiceover Artists are versatile performers, able to weave interchangeably between any of the above as well as direct telephone prompts (IVR). They can also be heard welcoming visitors to a website or guiding road trips as the voice of a GPS.

Voice Talent refers to all of the above. The term was coined as an easy way to reference all types of voice-over performers and is often used by agencies or companies that hire voice overs.

. . . .

Some well-known voiceovers by type of work:

  • Movies – Star Wars: Darth Vader – James Earl Jones
  • Animated Movies – Toy Story: Woody – Tom Hanks
  • Animated TV – The Simpsons – Hank Azaria
  • X-Factor UK and 2012 Olympic Games – Peter Dickson
  • Commercials (UK) – The Meerkat Comparethemarket – Simon Greenall
  • Promos TV (USA) – Joe Cipriano
  • Reality TV (UK) – Big Brother – Marcus Bentley

Link to the rest at Gravy for the Brain

So, here’s a question from PG: Would James Earl Jones make a good audiobook narrator?

5 thoughts on “Audiobooks Are Thriving, but Could AI Take Over?”

  1. Oh, yes.

    He has nine at Audible:

    https://www.audible.com/search?searchNarrator=James+Earl+Jones

    Not sure about elsewhere but in non-fiction alone his gravitas should be priceless.

    Back to the OP:

    The CNET piece lists a company that *wisely* licensed the voice characteristics of a handful of narrators for “AI” TTS audiobooks. Note that these are narrative projects, not audio dramas with multiple voice actors like the Sandman audiobooks at Amazon, with a heroic list of narrators reading different characters:

    Neil Gaiman, Riz Ahmed, Kat Dennings, Taron Egerton, James McAvoy, Samantha Morton, Bebe Neuwirth, Andy Serkis, Michael Sheen.

    Not a cheap project.

    The third category, voice acting is an entirely different creature since it relies both on recognizable actors lending their voices to animation and gaming projects as well as industry specialists who are chameleons able to give different characters unique voices, even in the same production. The key word here is *actor*. The general public isn’t usually aware of them (though a few also operate in live action video) but in their chosen field their names are a recognizable asset, commanding significant payouts for their efforts.

    Of the three, the first is a slam dunk for software. It needs a bit of fine tuning but there is a clear business model there, especially for Indies. One particular low-end evolution that jumps out is an implementation that, instead of a licensed professional voice, mimics the author’s (or an author-provided) voice sample. This can be done and, in tech, if it can be done, it *will* be done.

    The second category is a can of worms because software already can imitate any sampled voice totally or (more contentious) in part. Human voice impersonators are rare but software voice impersonation is trivial. The limit has always been software that can properly follow the author’s narrative voice. That is, if not quite here, imminent.

    The third category is also legally dangerous but manageable, methinks, under personality rights. Not that those are cut and dried. Lots of billable hours in there.

    Again, all the handwringing over “AI” creativity (or lack thereof) completely misses the point. The software doesn’t need to be creative because the humans using it will supply the creativity.

    “AI won’t take your job but a human using AI will.”

    • This one shows the mix of live action and voice-mostly performers from the first sequel:

      https://m.youtube.com/watch?v=hj6S8yR3rvQ

      The difference is the live action performers provide recognizable voices that fit the character (and often their licensed likeness) while the career voice actors tailor their voice and performance to fit the character. In this particular franchise, the player character can be either male or female, which triggers a different established voice actor (Mark Meer or Jennifer Hale), each providing a subtly different performance to mostly the same lines.

      Software is going to have a hard time replacing the likes of Ms Hale, for one. Or Mr Meer or Tara Strong (look her up).

      But for lesser characters, the “spear carriers” it should do fine.
      Times are changing.

      Some time in the near future we will be able to have meaningful, unscripted conversations with the video game characters. A whole new level of immersion.

  2. I don’t dismiss the appeal and advantages of AI narration, but as of now, it can’t replace the pleasant experience of a reading by someone like the late actor Edward Herrmann. I’ll never forget the way he delivered the passage in David McCullough’s “John Adams” describing his daughter’s surgery.

    I’ve averaged listening to 65 audiobooks a year for 25 years on my long commutes. I also dabble in voiceover and have recorded for audiobooks. AI has its pros but certainly has some cons.

  3. While I agree that the industry standard for audiobook narration appears to be for the narrator to fade into the background – to get out of the story’s way, so to speak – that’s not universally true. For those unaware of Soundbooth Theater and the audiobook for Dungeon Crawler Carl, Jeff Hays’ voice characterizations take Matt Dinniman’s series to a whole new level. Even if the litRPG genre isn’t your cup of tea, it’s worth sampling just to get an idea of how skilled voice work can knock an already stellar work of fiction out of the ballpark.
