AI-generated content is raising the value of trust

From The Economist:

It is now possible to generate fake but realistic content with little more than the click of a mouse. This can be fun: a TikTok account on which—among other things—an artificial Tom Cruise wearing a purple robe sings “Tiny Dancer” to (the real) Paris Hilton holding a toy dog has attracted 5.1m followers. It is also a profound change in societies that have long regarded images, video and audio as close to ironclad proof that something is real. Phone scammers now need just ten seconds of audio to mimic the voices of loved ones in distress; rogue AI-generated Tom Hankses and Taylor Swifts endorse dodgy products online, and fake videos of politicians are proliferating.

The fundamental problem is an old one. From the printing press to the internet, new technologies have often made it easier to spread untruths or impersonate the trustworthy. Typically, humans have used shortcuts to sniff out foul play: one too many spelling mistakes suggests an email might be a phishing attack, for example. Most recently, AI-generated images of people have often been betrayed by their strangely rendered hands; fake video and audio can sometimes be out of sync. Implausible content now immediately raises suspicion among those who know what AI is capable of doing.

The trouble is that the fakes are rapidly getting harder to spot. AI is improving all the time, as computing power and training data become more abundant. Could AI-powered fake-detection software, built into web browsers, identify computer-generated content? Sadly not. As we report this week, the arms race between generation and detection favours the forger. Eventually AI models will probably be able to produce pixel-perfect counterfeits—digital clones of what a genuine recording of an event would have looked like, had it happened. Even the best detection system would have no crack to find and no ledge to grasp. Models run by regulated companies can be forced to include a watermark, but that would not affect scammers wielding open-source models, which fraudsters can tweak and run at home on their laptops.

Yet societies will also adapt to the fakers. People will learn that images, audio or video of something do not prove that it happened, any more than a drawing of it does (the era of open-source intelligence, in which information can be reliably crowdsourced, may be short-lived). Online content will no longer verify itself, so who posted something will become as important as what was posted. Assuming trustworthy sources can continue to identify themselves securely—via URLs, email addresses and social-media platforms—reputation and provenance will become more important than ever.

Link to the rest at The Economist

1 thought on “AI-generated content is raising the value of trust”

  1. Okay, I was waiting for somebody else to bring it up, but…

    The Economist is looking in the wrong direction.
    Trust is no longer up to the “consumer” but an *obligation* of the peddler of digital content.

    The key word is: authentication.

    And it is, these days, trivial to add authentication metadata to any form of digital content. Remember the NFT fad/scam of recent vintage? That used blockchain technology.
    It is that simple to provide a unique, unremovable identifier to a photo, video, or audio file to identify its source. Similar features can and should soon show up in cameras, phones, and other content-generation tools, plus verification features in browsers and playback devices.

    If the file is legit, it’ll carry its source embedded. Not too different from bar codes or their successor, QR codes. If it doesn’t (or carries a fake one), it will be flagged as counterfeit.

    Simple, cheap, off-the-shelf tech.
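
    A minimal sketch of the idea, assuming the capture device or publisher holds an Ed25519 key pair. The function names and file path are purely illustrative, not any existing camera or browser API (provenance standards such as C2PA do this more elaborately):

    ```python
    # Sketch only: sign a media file's hash so its source can be checked later.
    # Assumes the device/publisher holds an Ed25519 private key; names and the
    # file path are illustrative, not part of any existing camera or browser API.
    import hashlib
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric import ed25519

    def sign_file(path: str, key: ed25519.Ed25519PrivateKey) -> bytes:
        """Signature over the file's SHA-256 digest, to be embedded as metadata."""
        digest = hashlib.sha256(open(path, "rb").read()).digest()
        return key.sign(digest)

    def verify_file(path: str, sig: bytes, pub: ed25519.Ed25519PublicKey) -> bool:
        """True if the file matches the signature; otherwise treat it as counterfeit."""
        digest = hashlib.sha256(open(path, "rb").read()).digest()
        try:
            pub.verify(sig, digest)
            return True
        except InvalidSignature:
            return False

    # A camera or phone would sign at capture time...
    key = ed25519.Ed25519PrivateKey.generate()
    sig = sign_file("beach_photo.jpg", key)
    # ...and a browser or player would later check against the maker's public key.
    print(verify_file("beach_photo.jpg", sig, key.public_key()))
    ```

    Anything that fails the check, or carries no signature at all, gets flagged, which is exactly the “fake unless proven otherwise” default argued for below.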

    And necessary because of the dirty secret of LLM tech that everybody in the media and political circles chooses to ignore: unregulated and *unregulatable* AI is already out in the wild.

    While everybody is hand-wringing and pearl-clutching over the commercial products from OpenAI, Microsoft, Google, and a dozen startups, there are dozens if not hundreds of university and hacker models out online. And their derivatives. LLaMA from Meta (Facebook) and Orca from Microsoft are just two.

    Worse yet, there are publicly available training datasets that can be used to train new models at will. Try this:

    https://venturebeat.com/ai/one-of-the-worlds-largest-ai-training-datasets-is-about-to-get-bigger-and-substantially-better/

    As The Verge reports, an earlier version of The Pile training dataset included Books3, a plain-text compilation of 200,000 books of dubious sourcing. That may or may not be removed from The Pile, but adding it, or any other content, is no challenge for the interested developer.

    The internet is forever and once something gets on it and is noticed, it stays out there. Somewhere.

    So combine publicly available models *and* publicly available training datasets with ordinary PCs and GPUs, and you have freely available *unconstrained* AI tools for the determined.
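
    To make that concrete, here is a hedged sketch of how little it takes to run one of those open-weights models locally with the Hugging Face transformers library. The checkpoint name is just an example of a publicly released model (some, like Meta’s, require accepting a license on the hub first):

    ```python
    # Sketch: running a publicly released open-weights model on a local PC.
    # The checkpoint name is illustrative; any downloadable model works the same
    # way, with no provider-side guardrails beyond whatever the weights contain.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="meta-llama/Llama-2-7b-chat-hf",  # example open-weights checkpoint
        device_map="auto",                      # use a local GPU if available
    )

    result = generator("Write a short news item about", max_new_tokens=100)
    print(result[0]["generated_text"])
    ```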

    While activists and politicians fret over the commercial “guardrailed” tools that can’t be used for dubious purposes, nobody is paying attention to the real problem: the unfettered tools that can be used for…anything digital.

    Just one reminder: every new technology that has ever emerged has been adapted to produce and distribute porn. In the last two centuries alone: cameras, cable TV, VCRs, DVDs, the internet. Big business at times. (Once upon a time, Time Warner was the biggest pornographer on Earth, making hundreds of millions off “adult content” via video on demand. The internet ended that by allowing tech-savvy amateurs to make and distribute their own.)

    It’s already happening. And will continue.
    “Hey, Djinn, remove the car in the background of this photo” can easily become: “Hey, Djinn, remove the bikini from this online beach photo of (any random celebrity).”
    Or worse.
    “Hey, Djinn, give me an audio file of Edward Herrmann reading this text file” can become “Hey, Djinn, give me an a cappella audio file of Taylor Swift singing Silent Night.” Or worse.

    There’s no going back.
    That Djinn is out of the bottle, and expecting people who can’t even tell when a politician is lying to them to figure out on their own whether a video of the geezer snorting cocaine is real or not is just stupid.

    The new default is *everything digital* is fake unless proven otherwise.

    Turns out NFTs were a trial run for what needs to be done. Authentication everywhere.
