From Public Books:
“Is this the real life? Is this just fantasy?” raps Kanye West, in what sounds like an a capella cover of Freddie Mercury’s timeless opening lines. The performance is so convincing, you might be surprised to learn that it never really happened. Thanks to new AI, users can create vocal “deepfakes” of their favorite celebrities, the most popular example being a viral performance of Queen’s “Bohemian Rhapsody” in the disturbingly realistic “voice” of Kanye West.
It is incredibly easy to create a vocal performance using a famous rapper’s voice with the help of Uberduck, a popular AI-driven text-to-speech (TTS) synthesis engine. After logging in with a Gmail or Discord account, users can select from a drop-down menu of different categories (such as “rappers”) and then specify the individual voice within that category (such as the artist “Kanye West”). The user is then directed to either enter the text they wish to hear or select prewritten versions of sung and spoken snippets. After they instruct the system to “synthesize” all the information, the text is rendered audible, and the user has the chance to engage in further vocal processing, including changing the speed, pitch, and word length. Within minutes the performance is ready to be downloaded, overlaid on a TikTok video, and shared.
. . . .
In recent years, terms like “high-tech blackface” and “digital blackface” have become popularized, as scholars on race and media have begun to theorize how this dialectic shows up in unique ways in the technologies of the digital age, enabling non-Black people to adopt Black personhood through their avatars and across networked platforms like Facebook and Twitter. Much has been said by scholars, cultural critics, and everyday observers about the use of African American Vernacular English (AAVE) and the “blaccent” by non-Black people and companies seeking to harness the selling power of Black culture through tweets, memes, and other forms of quick content, with no investment in actual Black communities or people. Tools like Uberduck might therefore meaningfully be understood as extending these kinds of appropriative digital practices into the realm of sonic performance.
In many ways, my specific concerns about Uberduck are connected to broader developments that I have observed in regard to rap music, AI, and the veneer of techno-optimism that increasingly brings these worlds together. I am a Black feminist rapper with a PhD in science and technology studies (STS), a field that examines the social relations that coproduce scientific and technological knowledge and practices. As such, I have long been interested in exploring our dominant narratives about the technologies we make and use. So I couldn’t help but raise an eyebrow when a succession of stories at the intersection of rap performance and AI flitted across my radar last spring: first I was introduced to FN Meka, an “AI robot rapper” who, perhaps unsurprisingly, also sells NFTs. Around the same time, Google Arts & Culture announced the Hip Hop Poetry project, led by creative technologist Alex Fefegha, to answer the question of whether AI can rap. A few weeks later, I learned about the success of Uberduck imitating Kanye West. I listened only once before putting down my phone in discomfort.
I’ve since begun to think more deeply about the messaging around AI that emanates from stories like these—about whether, the creepiness and potential legal thorniness aside, we should uncritically accept the use of AI as a mode for crafting rap lyrics and performances. I worry that in our excitement to explore these new creative potentials we risk reproducing the same exploitative dynamics that currently separate Black and brown artists from the fruits of their labor, across music and countless other forms of entertainment.
Link to the rest at Public Books
Case study: Yotta
Yotta contacted Uberduck in late 2021, wanting to create a memorable end-of-year wrap-up for Yotta’s users.
In two weeks, Uberduck helped Yotta create and ship 150,000 professionally produced rap songs with lyric videos, every one customized to each individual user.
Yotta’s users loved their raps and shared them across social media, driving hundreds of new checking accounts.
“We aren’t your typical bank and wanted to stand out from the crowd with our year-end project. Yotta Rapped was just that – a fun and personal look at each user’s individual journey with Yotta over the past year. It wouldn’t have been possible without Uberduck.”
Adam Moelis – Co-Founder/CEO, Yotta
Link to the rest at Uberduck.ai
Uberduck's website includes an applet that lets anyone enter a short bit of text, choose a voice from what looks like a large catalog, and synthesize the text into an audio message.
PG synthesized the following message using a voice titled “Casey Kasem” from a category called Radio Hosts.
Here’s the text, pulled from the Uberduck website:
Yotta contacted Uberduck in late 2021, wanting to create a memorable end-of-year wrap-up for Yotta’s users. In two weeks, Uberduck helped Yotta create and ship 150,000 professionally produced rap songs with lyric videos, every one customized to each individual user.
And below is the 15-second audio clip Uberduck created.