24 votes

Researchers describe how to tell if ChatGPT is confabulating

5 comments

  1. AntsInside
    Link

    At the risk of being pile-driven, this AI news seemed more intriguing than the usual hype. This paper seems to present a relatively easy-to-understand step toward progress on the inaccuracy problem that often seems to be the fatal flaw in uses of LLMs.

    23 votes
  2. [4]
    Deely
    Link

    Figuring out when an LLM is making something up would obviously have tremendous value, given how quickly people have started relying on them for everything from college essays to job applications. Now, researchers from the University of Oxford say they've found a relatively simple way to determine when LLMs appear to be confabulating that works with all popular models and across a broad range of subjects. And, in doing so, they develop evidence that most of the alternative facts LLMs provide are a product of confabulation.

    Our method works by sampling several possible answers to each question and clustering them algorithmically into answers that have similar meanings, which we determine on the basis of whether answers in the same cluster entail each other bidirectionally. That is, if sentence A entails that sentence B is true and vice versa, then we consider them to be in the same semantic cluster.

    If a single cluster predominates, then the AI is selecting an answer from within one collection of options that share similar factual content. If there are multiple clusters, then the AI is selecting among different collections that all have different factual content—a situation that's likely to result in confabulation.

    As the researchers note, the work also implies that, buried in the statistics of answer options, LLMs seem to have all the information needed to know when they've got the right answer; it's just not being leveraged. As they put it, "The success of semantic entropy at detecting errors suggests that LLMs are even better at 'knowing what they don’t know' than was argued... they just don’t know they know what they don’t know."
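    A rough sketch of the clustering-and-entropy idea in Python, assuming a placeholder entails() check (the paper uses a natural-language inference model for bidirectional entailment and weights clusters by the model's token probabilities; the exact-match stand-in and uniform sample weighting below are simplifications just to keep the example runnable):

    ```python
    import math

    def entails(a: str, b: str) -> bool:
        # Placeholder entailment check. The paper uses a natural-language
        # inference model here; this exact-match stand-in just keeps the
        # sketch self-contained and runnable.
        return a.strip().lower() == b.strip().lower()

    def semantic_clusters(answers):
        # Cluster sampled answers by bidirectional entailment: two answers
        # share a cluster only if each entails the other.
        clusters = []
        for ans in answers:
            for cluster in clusters:
                rep = cluster[0]
                if entails(ans, rep) and entails(rep, ans):
                    cluster.append(ans)
                    break
            else:
                clusters.append([ans])
        return clusters

    def semantic_entropy(answers):
        # Entropy over the cluster distribution: near zero when one meaning
        # dominates, high when the samples scatter across many meanings
        # (the signal the researchers associate with confabulation).
        clusters = semantic_clusters(answers)
        probs = [len(c) / len(answers) for c in clusters]
        return -sum(p * math.log(p) for p in probs)

    # Sample the same prompt several times, then score the spread of meanings.
    print(semantic_entropy(["Paris", "paris", "Paris", "Paris"]))   # 0.0 -- one cluster
    print(semantic_entropy(["Paris", "Lyon", "Marseille", "Nice"])) # ~1.39 -- four clusters
    ```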

    17 votes
    1. [3]
      BashCrandiboot
      Link Parent

      I have a hard enough time convincing myself I know the things I know. I try to take the time to learn to know the things I don't know, and I obsessively search for the things I don't know I don't know, in the hopes I can one day know them. Now you're telling me, thanks to AI, there might be things I know that I don't know I know?

      7 votes
      1. [2]
        Minori
        Link Parent

        Well, your passive vocabulary is almost certainly larger than you think. People can generally understand more words than they can use. That's at least one example.

        12 votes
        1. updawg
          Link Parent

          Indubitably, your latent lexicon is veritably more prodigious than you might surmise. Denizens typically evince an inherent proclivity to fathom a plethora of polysyllabic lexemes that far transcend their quotidian parlance. This phenomenon exemplifies the dichotomy between passive and active vocabulary. Indeed, the capaciousness of one's cerebral repository of arcane verbiage is frequently undervalued.

          -ChatGPT, 2024

          12 votes