Why large language models aren’t headed toward humanlike understanding

From Science News:

Apart from the northward advance of killer bees in the 1980s, nothing has struck as much fear into the hearts of headline writers as the ascent of artificial intelligence.

Ever since the computer Deep Blue defeated world chess champion Garry Kasparov in 1997, humans have faced the prospect that their supremacy over machines is merely temporary. Back then, though, it was easy to show that AI failed miserably in many realms of human expertise, from diagnosing disease to transcribing speech.

But then about a decade ago or so, computer brains — known as neural networks — received an IQ boost from a new approach called deep learning. Suddenly computers approached human ability at identifying images, reading signs and enhancing photographs — not to mention converting speech to text as well as most typists.

Those abilities had their limits. For one thing, even apparently successful deep learning neural networks were easy to trick. A few small stickers strategically placed on a stop sign made an AI computer think the sign said “Speed Limit 80,” for example. And those smart computers needed to be extensively trained on a task by viewing numerous examples of what they should be looking for. So deep learning produced excellent results for narrowly focused jobs but couldn’t adapt that expertise very well to other arenas. You would not (or shouldn’t) have hired it to write a magazine column for you, for instance.

But AI’s latest incarnations have begun to threaten job security not only for writers but also a lot of other professionals.

“Now we’re in a new era of AI,” says computer scientist Melanie Mitchell, an artificial intelligence expert at the Santa Fe Institute in New Mexico. “We’re beyond the deep learning revolution of the 2010s, and we’re now in the era of generative AI of the 2020s.”

Generative AI systems can produce things that had long seemed safely within the province of human creative ability. AI systems can now answer questions with seemingly human linguistic skill and knowledge, write poems and articles and legal briefs, produce publication quality artwork, and even create videos on demand of all sorts of things you might want to describe.

. . . .

“These things seem really smart,” Mitchell said this month in Denver at the annual meeting of the American Association for the Advancement of Science.

. . . .

At the heart of the debate is whether LLMs actually understand what they are saying and doing, rather than just seeming to. Some researchers have suggested that LLMs do understand, can reason like people (big deal) or even attain a form of consciousness. But Mitchell and others insist that LLMs do not (yet) really understand the world (at least not in any sort of sense that corresponds to human understanding).

“What’s really remarkable about people, I think, is that we can abstract our concepts to new situations via analogy and metaphor.”Melanie Mitchell

In a new paper posted online at arXiv.org, Mitchell and coauthor Martha Lewis of the University of Bristol in England show that LLMs still do not match humans in the ability to adapt a skill to new circumstances. Consider this letter-string problem: You start with abcd and the next string is abce. If you start with ijkl, what string should come next?

Humans almost always say the second string should end with m. And so do LLMs. They have, after all, been well trained on the English alphabet.

. . . .

“While humans exhibit high performance on both the original and counterfactual problems, the performance of all GPT models we tested degrades on the counterfactual versions,” Mitchell and Lewis report in their paper.

Other similar tasks also show that LLMs do not possess the ability to perform accurately in situations not encountered in their training. And therefore, Mitchell insists, they do not exhibit what humans would regard as “understanding” of the world.

“Being reliable and doing the right thing in a new situation is, in my mind, the core of what understanding actually means,” Mitchell said at the AAAS meeting.

Human understanding, she says, is based on “concepts” — basically mental models of things like categories, situations and events. Concepts allow people to infer cause and effect and to predict the probable results of different actions — even in circumstances not previously encountered.

“What’s really remarkable about people, I think, is that we can abstract our concepts to new situations via analogy and metaphor,” Mitchell said.

She does not deny that AI might someday reach a similar level of intelligent understanding. But machine understanding may turn out to be different from human understanding. Nobody knows what sort of technology might achieve that understanding and what the nature of such understanding might be.

If it does turn out to be anything like human understanding, it will probably not be based on LLMs.

Link to the rest at Science News and thanks to F. for the tip.

3 thoughts on “Why large language models aren’t headed toward humanlike understanding”

  1. Artificial human intelligence will have been achieved when these electronic contraptions begin confidently making assertions in areas where they have no knowledge or experience whatsoever.

    (Yes, I’m quite aware that this requires the assumption that politicians, and the majority of “experts” in academia are intelligent – and not lizard people.)

    • To put it another way: human inteligence allows *accurate* extrapolation from the general to the specific and from the specific to the general. LLMs can do the former but not the latter, hence they fail the OP test. Then again, so do a lot of humans.

      But AI that is only as good as the least competent humans isn’t much good in those cases. 😉

      However, the former can be very useful on its own.

      Consider this machine vision application:


      Industrial scale gardening: distinguishing weeds from produce and figuring out how to treat them individually is.
      Human understanding is not needed for software to be useful and trying for it is a waste of effort. It’s worth keeping this in mind when the pearl clutchers start whining about “AI”.

      Skynet won’t kill anybody but it will feed billions.

      • Another way of putting it:
        ML can be pretty good at interpolation, but sucks at extrapolation.
        Just like using fancy polynomials to fit curves.

Comments are closed.