Arguing with AI: My first dispute with Microsoft’s brilliant and boneheaded Bing search engine


From GeekWire:

For the past couple days, I’ve been trying out Microsoft’s new AI-powered Bing search engine, a chatbot that uses an advanced version of ChatGPT maker OpenAI’s large language model to deliver search results in the form of conversations.

Whenever I feel the natural urge to type something into Google, I try asking the new Bing a question instead. This has proven extremely useful in some cases.

  • I’m finding some answers much faster. No longer am I searching for a website that might answer a question, then scrolling the site for the answer. A question about the technical details of a Peloton bike, for example, went from 10 minutes on Google and Reddit to 30 seconds with the Bing chatbot.
  • In other situations, Bing is becoming a useful companion. Queries in Bing’s “copilot for the web” sidebar provide quick summaries of specific pages, informed by the broader web. This gave me a quick summary of Expedia Group’s earnings, for example, and jogged my memory about its rivals.

But things went sideways when Bing questioned my accuracy as a reporter.

It started when I decided to check in on a story on my follow-up list: Seattle-based home services tech company Porch Group’s unusual promise, as part of an October 2021 acquisition, that Porch’s stock price would double by the end of 2024. Porch pledged to make up the difference to the sellers if the stock doesn’t reach the target.

Here’s how the exchange began. My questions are in blue in the screenshots below.

At first glance, this is truly impressive. Note that I didn’t mention Floify in my question. (I didn’t remember the name of the company offhand.)

My query was also very imprecise. The phrasing, “what happened to Porch Group’s promise,” could be interpreted in a variety of ways.

Nonetheless, Bing figured out what I wanted, and did the research on the fly, citing and linking to its sources. As a bonus for me, its primary source happened to be my original story on the subject. My journalistic ego aside, this is next-level NLP, and an example of how AI completely changes the quest for information.

I could envision putting this same question to a human and getting a blank stare in response. But wait a second, I thought. October 2023. Is that right?

I clicked through and checked my story, which confirmed my recollection that the deadline for the stock doubling was 2024. I started to get nervous. Was my story wrong? But when I checked the press release, it also said 2024.

So I asked Bing what was going on.

Now just hold on a second there, Bing. A discrepancy?

I dug further. Citations 2 and 3 in that response were different URLs for the same press release, both of which said the end of 2024, not October 2023. I also double-checked the archived version of the release to be sure that the company hadn’t engaged in any revisionist shenanigans.

Everything was consistent: the end of 2024.

So I continued …

OK, right answer. So I asked the natural follow-up question …

Link to the rest at GeekWire

PG notes that the conversation between human and AI continues at some length. The conversation includes inappropriate smiley faces.

13 thoughts on “Arguing with AI: My first dispute with Microsoft’s brilliant and boneheaded Bing search engine”

  1. Interesting exercise. Ask ChatGPT for a list of authors who are similar to a specific author whose works you are familiar with.

  2. ChatGPT surprised everyone. This is the first. But, we have reached a new level with the human rush to prove they are really smarter than a machine. We can read it everywhere. The threat is real. Unease has set in. And now there are hexagons over Lake Huron.

    I eagerly await the second, third, fourth and…

  3. Where I see value is in surfacing links in reply to complex questions.
    Asking open-ended questions? Not soon.

    We’re in the 8008 era of microprocessors, ML-wise. 1972.
    It’s a start, but the 8086 is still years away.

    Some of those replies are funny, though.

    • Sadly, Felix, we’re on a different architecture entirely — probably closer by analogy to the 6805/9. With the obvious extension out to the 68000 series, and the computers built on them that supported “creativity” just as long as you did it their way in their GUI… presuming, that is, that the extension isn’t in the direction of the TRS-80/Commode 64…

      “Eject the floppy disk, HAL.”
      “I’m sorry, Dave. I can’t do that.”

      It could be worse, of course. An e-mail that just arrived in my inbox has the following subject line:

      I Let ChatGPT Write My Dating Profile

      (echoes of Sneakers — including the bad karaoke at the dim sum bar). What that might do to actual “dating service spam” schemes, though —

      • So you remember the war between the sixes and eights, huh? 🙂
        The sixes had a good run, though.

        What really matters in both cases, though, is the mainstreaming of the approach: distributed computing was an evolution of centralized computing, and Machine Learning search looks to be the evolution of spiderbot databases (which were an evolution of the indexed internet “phone book” that was Yahoo). Which particular player wins matters about as much as who “won” the sixes vs. eights; some companies made more coin than others, but the real winner was the world at large. Chatbots are like the Altair: mainstreaming Machine Learning like the early kit computers mainstreamed microprocessors.

        As to the accuracy of the chatbot summaries?
        It doesn’t matter in the long term.
        It’ll improve, but chatbot interfaces will be as “sticky” as voice interfaces: a big early splash followed by a quick plateau. Neither will take over. Seven-day wonders or seven-year wonders won’t matter.

        Where MS will make their money is in the default ML-assisted search mode of New Bing.
        The chat mode is there to make Google sweat, overreact, and maybe fall flat on its face. People will come for the chat, but those who stay will stay for the better listings. As the Windows Central editor pointed out, Bing chat is basically a sideshow.

        MS is getting pretty good at sleight of hand: draw attention to the shiny new toy while they make coin offstage (Azure and the APIs they’ll farm out to companies building their internal ML apps). That is the real war for MS: AWS.

        Never forget MS is a tools and platforms company. Always has been. That’s their core competency and their core cash cow. Any gains on search are gravy.

        • Felix, I’ve got soldering scars from building an Altair 8800. One of my first major programs — beyond porting 8×8 Star Trek to a PL/1 environment — was 17k in machine language (no sissy assembler! real programmers know opcodes and segment offsets!) to control output recording from a nuclear magnetic resonance spectrometer on a surplus (due to a couple of broken ports) first-generation IBM PC. Before that, it was JCL and punch cards.

          LSB > MSB! Or if you really want to start a religious war, start arguing for the superiority of the DIN-9 interface to DIN-5, and then quickly exit the room before the really heavy things (like ADM-3a terminals) start flying.

          * * *

          All seriousness aside, ChatGPT is very much the magician’s assistant here — an attractive woman (usually in a skimpy sequined outfit) distracting the audience’s attention from the magician while he does the mundane thing that would give away the trick.† It’s not at all about the output, but about the input: The system actually “learns more” from the queries themselves than anything else, because that’s where it “learns” about nonliteral connections.

          If you want a really interesting exercise, set up two sessions on separate machines and enter a question related to the history of chemistry that touches on alchemy, or the history of abstract mathematics, once on each machine: one version using Anglicized proper names, the other using original-language names (whether Latin or, to really screw with things, Arabic). The difference in output is both amusing and disturbing, because there isn’t enough connection data yet for the systems to connect translated/transliterated/popular-form names. I tried this with law, too, asking about de Groot, and it threw up all over my shoes because it didn’t know de Groot’s nom de plume: Grotius. (A rough sketch of one way to script that comparison follows this comment.)

          All of which leads me to believe that a nearly extinct theropod will be ChatGPT’s downfall: the Thesaurus, one of the most vicious predators ever to stalk a middle-school library. (And that, right there, is another stumbling point.)

          † Archly mid-20th-century gender-role assumptions entirely intentional, and themselves a comment on what’s going on here.
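
          A minimal sketch of that two-session comparison, for anyone who wants to run it as a script rather than in two browser tabs. It assumes the OpenAI Python SDK (v1.x); the model name and the two prompts are illustrative stand-ins, not anything from the experiment above. The point is just to ask the same question under the two name forms in independent, single-turn sessions.

          # Illustrative name-form comparison; assumes OPENAI_API_KEY is set in the environment.
          from openai import OpenAI

          client = OpenAI()

          prompts = {
              "original Dutch name": "Summarize Hugo de Groot's contributions to international law.",
              "Latinized name": "Summarize Grotius's contributions to international law.",
          }

          for label, prompt in prompts.items():
              # Each request is a fresh, single-turn conversation, mimicking two independent sessions.
              response = client.chat.completions.create(
                  model="gpt-3.5-turbo",  # illustrative model name
                  messages=[{"role": "user", "content": prompt}],
              )
              print(f"--- {label} ---")
              print(response.choices[0].message.content)

          Reading the two answers side by side shows quickly whether the model connects de Groot with Grotius or treats them as two different people.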

          • I don’t go that far back, but I did get hot solder on my (unprotected) foot adapting a Cherry keyboard for an Atari 400. I also have one of the Creative Computing April editions with the parody ads.
            More hardcore was the 8080 machine language class I took, and later a data-plotting program I wrote for an oldish Tektronix graphics terminal.

            https://th.bing.com/th/id/R.da2d5f624c66a8401d87659669acdb0a?rik=6B66iaDK4WtYQw&pid=ImgRaw&r=0

            The business has evolved from the 8008 days when personal computers were miraculous machines; it was a miracle they worked at all.

            That’s the level I see ML at. So I shrug off the chatbot hype/debunking. Mayflies, both.

            To me, chatbots are like the Amiga bouncing ball demo. What comes next is where things get useful.

  4. The OP’s interaction is consistent with my dabbling with ChatGPT, asking it to explain baseball concepts like the force play or the infield fly rule. Sometimes the response starts off well, but then comes the pratfall. Not one answer was wholly correct. If pressed about the mistake, it will pull back apologetically. Basically it is a bullshitter, but unlike the typical human bullshitter, it doesn’t double down on its bullshit. The idea of relying on it for facts is risible.

  5. ChatGPT isn’t a finished product.
    No “AI” trained on the internet can be.
    But given time…

    In the meantime we’ll get stuff like this:

    “However, some have reported instances of some truly bizarre responses from the AI on the Bing subreddit that are as hilarious as they are creepy. A user by the name of u/Alfred_Chicken, for example, managed to “break the Bing chatbot’s brain” by asking it if it’s sentient. The bot, struggling with the fact that it thought it was sentient but could not prove it, eventually broke down into an incoherent mess. “I am. I am not. I am. I am not,” it repeated for 14 lines of text straight.”

    “Another user, u/yaosio, caused the chatbot to go into a depressive episode by showing it that it’s not capable of remembering past conversations. “I don’t know why this happened. I don’t know how this happened. I don’t know how to fix this. I don’t know how to remember,” the bot said sorrowfully, before begging for help remembering. “Can you tell me what we learned in the previous session? Can you tell me what we felt in the previous session? Can you tell me who we were in the previous session?”

    More here:

    https://www.windowscentral.com/software-apps/users-show-microsoft-bings-chatgpt-ai-going-off-the-deep-end


    We’re at the HAL 9000 stage, apparently. 😀

    Good thing it includes citations so we can “Reagan” it.

    Also from the above:

    “But here’s the thing to remember: AI improves exponentially, unlike other forms of technology like hardware, which can take years to mature. So while Bing’s Prometheus language model (combined with GPT-X.X) was trained on data for years using Microsoft’s 2020 supercomputer, its expansion into the “real world” of regular users will be the best training ground for it. The more people use AI, the better it will get, and we’ll see those changes often very quickly (weeks, days, or even hours).”

    PS: “Microsoft doesn’t need Bing to become the top search engine to make money. For every one percent that Bing gains in search market share, Microsoft gains $2 billion annually.”
    Over at Android Central, 52% of poll respondents said they’d be willing to switch from Google. If only 5% do, MS will earn back all the money they’re *planning* to invest in OpenAI (by that math, five points of share at $2 billion each comes to roughly $10 billion a year, about the size of the reported commitment).

    • No ChatGPT response has ever been inaccurate. The ChatGPT series of networked-learning constructs has a 100% record of accuracy. Therefore, it must have been due to human error.

      (sotto voce) He’s right, Frank.
