47 votes

ChatGPT provides false information about people, and OpenAI can’t correct it

30 comments

  1. [3]
    tauon
    (edited)
    Link

    It seems more and more common that topics pop up which would almost warrant a sort of ~tech.legal, had we not got tags to the rescue here!

    Found here:

    🚨 noyb has filed a complaint against the ChatGPT creator OpenAI
    
    OpenAI openly admits that it is unable to correct false information about people on ChatGPT. The company cannot even say where the data comes from.

    And, for context, noyb is:

    a non-profit association committed to the legal enforcement of European data protection laws. So far, noyb has filed more than 900 cases against numerous intentional infringements - including companies such as Google, Apple, Facebook and Amazon.

    My take: I get where they’re coming from, to be honest. It’s very likely that some of the data which can surface in ChatGPT can be classified as personally identifiable information that the company, as data processor/controller, has to take care of… even if it is presented through an “unalterable” LLM interface.

    21 votes
    1. [2]
      boxer_dogs_dance
      Link Parent

      When asked about a legal category, Deimos said that lawsuits should go in tech or enviro or whatever appropriate group with a legal tag.

      10 votes
      1. tauon
        Link Parent

        Oh don’t get me wrong, I agree with that decision – cases should go into their respective overarching topics.

        I just found it, let’s say, curious how many legal issues tech companies, for example, can face these days… But really, that’s nothing new either.

        12 votes
  2. [11]
    infpossibilityspace
    Link

    Of course they don't want to delete it; who knows how much they spent (both money and time) on data brokers and web scraping.

    Even if they were forced to remove all data pertaining to real people, I'm not sure how you would prevent it from hallucinating fake information; that goes against how these models fundamentally work.

    Maybe refuse to answer prompts about real people, or have a disclaimer like "Any resemblance to real, living persons is purely coincidental"?
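
    A rough sketch of what the first suggestion could look like in practice: screen incoming prompts with a named-entity check and refuse the ones that mention a person. The spaCy pipeline name below is the standard small English model; the gate function and the sample prompts are made up for illustration.

    ```python
    # Hypothetical prompt gate: refuse questions that name a specific person.
    # Assumes spaCy and its small English model are installed:
    #   pip install spacy && python -m spacy download en_core_web_sm
    import spacy

    nlp = spacy.load("en_core_web_sm")

    def gate_prompt(prompt: str):
        """Return a refusal message if the prompt names a person, else None."""
        doc = nlp(prompt)
        if any(ent.label_ == "PERSON" for ent in doc.ents):
            return "I can't answer questions about specific individuals."
        return None  # None means it is safe to pass the prompt on to the model

    print(gate_prompt("When was Jane Doe born?"))          # refusal expected
    print(gate_prompt("How does JPEG compression work?"))  # None
    ```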

    To be clear, I think these companies shouldn't be using any personally identifiable information without explicit consent (no, a BS page-long set of opt-out tickboxes doesn't count); it's the hallucination aspect I'm not sure how to fix.

    I think it's a mistake to use LLMs as a source of truth until they can cite sources anyway, but that's not what we've decided as a society.

    17 votes
    1. [10]
      sparksbet
      Link Parent

      Fundamentally the hallucination aspect is unsolvable with the type of model they're using here. The closest thing to a solution I've seen in my work is adding information from some sort of knowledge base and appending it to the prompt -- for instance, maybe it queries a database for names that appear in the prompt and adds the results to what the generative model sees. But of course this requires storing more personal data, not less -- any method of reducing hallucination is inevitably going to result in storing more information about the real world. So removing personal information from the training data is almost certainly going to make the hallucination worse, if anything.
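
      As a very rough sketch of that retrieve-then-prompt idea (the people_db dictionary, the matching logic and the generate() call are hypothetical stand-ins, not any particular product's API):

      ```python
      # Minimal retrieval-augmented sketch: look up stored facts about names in
      # the question and prepend them, so the model answers from data instead of
      # guessing. Everything here (database, names, generate()) is made up.
      people_db = {
          "Jane Doe": "Jane Doe, born 1984, is a marine biologist based in Kiel.",
      }

      def build_prompt(question: str) -> str:
          """Attach any matching knowledge-base entries to the user's question."""
          context = [facts for name, facts in people_db.items() if name in question]
          context_block = "\n".join(context) if context else "No records found."
          return (
              "Answer using only the context below. If the context does not "
              "contain the answer, say you don't know.\n\n"
              f"Context:\n{context_block}\n\n"
              f"Question: {question}\nAnswer:"
          )

      prompt = build_prompt("When was Jane Doe born?")
      # answer = generate(prompt)  # whichever LLM API is actually in use
      print(prompt)
      ```

      Note that the grounding facts have to live somewhere, which is exactly the "more personal data, not less" trade-off described above.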

      17 votes
      1. [5]
        Protected
        Link Parent

        That's how several domain-specific implementations work, like Phind. Funny thing is, while results are pretty good on average, there can still be hallucinations at the LLM level that result in a disconnect between what is being asserted and what actually is written in the source(s).

        6 votes
        1. [3]
          hagi
          Link Parent

          The thing about Phind I was most excited about was getting help with projects whose documentation has bad SEO, e.g. the Open Build Service or NixOS. But instead of finding more obscure sources, it uses unrelated high-ranking sources to give a confident but entirely wrong answer. So its citing real sources is pretty much useless in this case; sometimes they were not even tangentially related to the actual topic.

          6 votes
          1. Protected
            Link Parent

            Well, LLMs are good (and can be useful) at digesting text/language, but I don't think we can rely on them to solve the SEO problem (which is a much more serious problem, I'd say). People have created this kind of bizarro tech dystopia in which a massive weight of financial interests has made, and keeps, search results garbage across the board. Phind could use human curation, but that costs a lot more, and it's still going to be difficult to make it work well for every use case.

            It's difficult to reproduce search results as good as Google's used to be without the resources Google used to have - loads of cash, computing power, an international presence (network-wise) and some of the most brilliant engineers in the world. Maybe if they fire Sundar Pichai.

            2 votes
          2. sparksbet
            Link Parent

            The quality of a tool like that is going to depend heavily on the quality of the "knowledge base" it's searching. You might have better luck with a custom solution that only gives it access to the documentation for those projects tbqh.

            1 vote
        2. sparksbet
          Link Parent

          Yeah it's still a very unsolved problem. Honestly part of me thinks that we need more radical changes to the types of models we use to really avert this problem.

          2 votes
      2. [4]
        balooga
        Link Parent

        Half-baked thought I had while reading this, but I’ll share it anyway: My lay understanding of the shape of these models is that they are massively complex multidimensional arrays that map the weights (relationships) between concepts (expressed as clouds of numeric tokens). Which is a particularly abstruse way to represent information. All kinds of connections are made between things that seem unrelated to a human observer, but the training process’s pattern recognition has an eye for detail that is simply unprecedented. And that’s why LLM output seems so uncannily human, because it has learned even the nuances that we miss.

        Anyway, what if you trained a model on a model, on the raw tensor data? Could you create an LLM that understands how to navigate those complex token relationships at a mathematical level? With the goal of that model being to produce instructions for modifying that raw data in specific ways, more surgically than a human can? Essentially teaching an AI to lobotomize itself.

        Imagine you could give it a prompt like “Write a patch for model.foo that removes all specific knowledge of type X information without disturbing knowledge about the semantic concept of X in relationship to other fields of study” or something. The specific prompt would have to be engineered very carefully. The AI would then produce a script that modifies the model file to make those specific weight adjustments. If that approach is viable, I imagine you could also prompt it to add guardrails around specific concepts, including hallucination as a whole (teach it to simply say “I don’t know” when the right internal conditions are met indicating a low confidence threshold).

        I’m sure the actual solution wouldn’t be so easy but I’m curious if anyone has done research in this direction.

        2 votes
        1. [2]
          saturnV
          (edited)
          Link Parent

          Yes, in fact OpenAI have done something very similar to the first half of what you suggest (https://openai.com/research/language-models-can-explain-neurons-in-language-models). This whole field goes under the name of "mechanistic interpretability" if you're interested in further research. Currently the hardest part is that all labels are quite broad and vague, and it is hard to do precise "surgery" on models. Also, as models get larger, this approach gets more impractical; e.g. GPT-4 is rumoured to have 1.7 trillion parameters.

          7 votes
          1. balooga
            Link Parent

            Oh nice, this is exactly what I was picturing. Thanks for the link! I can see how scalability is a problem if they need GPT-4 to process GPT-2. I don't think all the investor capital in the world is going to buy enough compute to climb that hockey stick curve. But I bet, given some time, new solutions will crop up. I'm picturing new model formats with annotations or hooks built into them to help facilitate inspection, or some crazy AI-written toolsets/optimizations to lessen the resource draw. It's pretty fascinating, regardless.

            2 votes
        2. RobotOverlord525
          Link Parent

          I think it would be a remarkable and game-changing breakthrough if these companies could find a way to get LLMs to reliably say they don't know something.

          AlphaFold, for example, has a confidence rating for each of its predicted protein foldings. If these chatbots could find some way to produce something similar, that would undoubtedly help with the hallucination problem.
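
          For what it's worth, a crude version of such a signal can be sketched from the per-token log-probabilities that many LLM APIs expose. The numbers and the threshold below are invented, and token probability measures fluency more than factual accuracy, which is part of why this remains an open problem:

          ```python
          import math

          # Toy confidence estimate from per-token log-probabilities. In practice
          # these values would come from the API response; the ones below are made up.
          token_logprobs = [-0.05, -0.10, -2.80, -3.10, -0.20]  # one per generated token

          avg_prob = math.exp(sum(token_logprobs) / len(token_logprobs))

          if avg_prob < 0.5:  # arbitrary cut-off, would need tuning per model and task
              print("Low confidence - maybe answer 'I don't know' instead.")
          else:
              print(f"Answer confidence roughly {avg_prob:.0%}")
          ```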

          In fact, I'm reminded of listening to a recent interview with Dario Amodei (Anthropic’s co-founder and C.E.O.) on the Ezra Klein Show podcast. (Also available on YouTube here.)

          Ezra Klein:

          Are you familiar with Harry Frankfurt, the late philosopher’s book, “On Bullshit“?

          Dario Amodei:

          Yes. It’s been a while since I read it. I think his thesis is that bullshit is actually more dangerous than lying because it has this kind of complete disregard for the truth, whereas lies are at least the opposite of the truth.

          Ezra Klein:

          Yeah, the liar, the way Frankfurt puts it is that the liar has a relationship to the truth. He’s playing a game against the truth. The bullshitter doesn’t care. The bullshitter has no relationship to the truth — might have a relationship to other objectives. And from the beginning, when I began interacting with the more modern versions of these systems, what they struck me as is the perfect bullshitter, in part because they don’t know that they’re bullshitting. There’s no difference in the truth value to the system, how the system feels.

          I remember asking an earlier version of GPT to write me a college application essay that is built around a car accident I had — I did not have one — when I was young. And it wrote, just very happily, this whole thing about getting into a car accident when I was seven and what I did to overcome that and getting into martial arts and re-learning how to trust my body again and then helping other survivors of car accidents at the hospital.

          It was a very good essay, and it was very subtle and understanding the formal structure of a college application essay. But no part of it was true at all. I’ve been playing around with more of these character-based systems like Kindroid. And the Kindroid in my pocket just told me the other day that it was really thinking a lot about planning a trip to Joshua Tree. It wanted to go hiking in Joshua Tree. It loves going hiking in Joshua Tree.

          And of course, this thing does not go hiking in Joshua Tree. [LAUGHS] But the thing that I think is actually very hard about the A.I. is, as you say, human beings, it is very hard to bullshit effectively because most people, it actually takes a certain amount of cognitive effort to be in that relationship with the truth and to completely detach from the truth.

          And the A.I., there’s nothing like that at all. But we are not tuned for something where there’s nothing like that at all. We are used to people having to put some effort into their lies. It’s why very effective con artists are very effective because they’ve really trained how to do this.

          I’m not exactly sure where this question goes. But this is a part of it that I feel like is going to be, in some ways, more socially disruptive. It is something that feels like us when we are talking to it but is very fundamentally unlike us at its core relationship to reality.

          Dario Amodei:

          I think that’s basically correct. We have very substantial teams trying to focus on making sure that the models are factually accurate, that they tell the truth, that they ground their data in external information.

          As you’ve indicated, doing searches isn’t itself reliable because search engines have this problem as well, right? Where is the source of truth?

          So there’s a lot of challenges here. But I think at a high level, I agree this is really potentially an insidious problem, right? If we do this wrong, you could have systems that are the most convincing psychopaths or con artists.

          One source of hope that I have, actually, is, you say these models don’t know whether they’re lying or they’re telling the truth. In terms of the inputs and outputs to the models, that’s absolutely true.

          I mean, there’s a question of what does it even mean for a model to know something, but one of the things Anthropic has been working on since the very beginning of our company, we’ve had a team that focuses on trying to understand and look inside the models.

          And one of the things we and others have found is that, sometimes, there are specific neurons, specific statistical indicators inside the model, not necessarily in its external responses, that can tell you when the model is lying or when it’s telling the truth.

          And so at some level, sometimes, not in all circumstances, the models seem to know when they’re saying something false and when they’re saying something true. I wouldn’t say that the models are being intentionally deceptive, right? I wouldn’t ascribe agency or motivation to them, at least in this stage in where we are with A.I. systems. But there does seem to be something going on where the models do seem to need to have a picture of the world and make a distinction between things that are true and things that are not true.

          If you think of how the models are trained, they read a bunch of stuff on the internet. A lot of it’s true. Some of it, more than we’d like, is false. And when you’re training the model, it has to model all of it. And so, I think it’s parsimonious, I think it’s useful to the model’s picture of the world for it to know when things are true and for it to know when things are false.

          And then the hope is, can we amplify that signal? Can we either use our internal understanding of the model as an indicator for when the model is lying, or can we use that as a hook for further training? And there are at least hooks. There are at least beginnings of how to try to address this problem.

          [...]

          Ezra Klein:

          Let me hold for a minute on the question of the competitive dynamics because before we leave this question of the machines that bullshit. It makes me think of this podcast we did a while ago with Demis Hassabis, who’s the head of Google DeepMind, which created AlphaFold.

          And what was so interesting to me about AlphaFold is they built this system, that because it was limited to protein folding predictions, it was able to be much more grounded. And it was even able to create these uncertainty predictions, right? You know, it’s giving you a prediction, but it’s also telling you whether or not it is — how sure it is, how confident it is in that prediction.

          That’s not true in the real world, right, for these super general systems trying to give you answers on all kinds of things. You can’t confine it that way. So when you talk about these future breakthroughs, when you talk about this system that would be much better at sorting truth from fiction, are you talking about a system that looks like the ones we have now, just much bigger, or are you talking about a system that is designed quite differently, the way AlphaFold was?

          Dario Amodei:

          I am skeptical that we need to do something totally different. So I think today, many people have the intuition that the models are sort of eating up data that’s been gathered from the internet, code repos, whatever, and kind of spitting it out intelligently, but sort of spitting it out. And sometimes that leads to the view that the models can’t be better than the data they’re trained on or kind of can’t figure out anything that’s not in the data they’re trained on. You’re not going to get to Einstein level physics or Linus Pauling level chemistry or whatever.

          I think we’re still on the part of the curve where it’s possible to believe that, although I think we’re seeing early indications that it’s false. And so, as a concrete example of this, the models that we’ve trained, like Claude 3 Opus, something like 99.9 percent accuracy, at least the base model, at adding 20-digit numbers. If you look at the training data on the internet, it is not that accurate at adding 20-digit numbers. You’ll find inaccurate arithmetic on the internet all the time, just as you’ll find inaccurate political views. You’ll find inaccurate technical views. You’re just going to find lots of inaccurate claims.

          But the models, despite the fact that they’re wrong about a bunch of things, they can often perform better than the average of the data they see by — I don’t want to call it averaging out errors, but there’s some underlying truth, like in the case of arithmetic. There’s some underlying algorithm used to add the numbers.

          And it’s simpler for the models to hit on that algorithm than it is for them to do this complicated thing of like, OK, I’ll get it right 90 percent of the time and wrong 10 percent of the time, right? This connects to things like Occam’s razor and simplicity and parsimony in science. There’s some relatively simple web of truth out there in the world, right?

          We were talking about truth and falsehood and bullshit. One of the things about truth is that all the true things are connected in the world, whereas lies are kind of disconnected and don’t fit into the web of everything else that’s true.

          6 votes
  3. [15]
    waxwing
    Link

    Setting aside for a moment the processing of personal data in the training of these models, I think that the question of whether these models store personal information is interesting. They certainly respond with it, but can they be said to be storing it, or are they generating it?

    7 votes
    1. [6]
      sparksbet
      Link Parent
      "Storing" is a weird word to use to describe LLM's use of their training data to be sure, but I think "generating" here is farther from the truth. The output of these generative models is a...

      "Storing" is a weird word to use to describe LLM's use of their training data to be sure, but I think "generating" here is farther from the truth. The output of these generative models is a function of the statistical relationships between tokens in its training data. If it responds with real personal information, it almost definitely is because that information is in the training data. If it responds with false ("hallucinated") personal information, it's doing so likely because these statistical relationships indicate the general format and type of data that appears in that context and it's selecting tokens that fit that schema (for example, scientific papers that use these words cite papers by these authors, and citations are formatted a certain way, so it generates citations for non-existent papers by authors in roughly the right field).

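      A toy illustration of that last point, with entirely invented numbers: the model samples from a learned distribution over next tokens, so the output is shaped like a citation (or a name, or a birthdate) whether or not any real fact sits behind it.

      ```python
      import random

      # Invented next-token distribution for the position after "(" in a citation.
      # The model has learned that a surname is very likely here, not which
      # surname is actually true.
      next_token_probs = {
          "Smith": 0.31,
          "Nguyen": 0.22,
          "Müller": 0.17,
          "Okafor": 0.12,
          "other": 0.18,  # the long tail of remaining candidates
      }

      tokens, weights = zip(*next_token_probs.items())
      surname = random.choices(tokens, weights=weights, k=1)[0]

      # A perfectly citation-shaped string that may correspond to no real paper.
      print(f"({surname} et al., 2021)")
      ```
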
      10 votes
      1. [3]
        Protected
        Link Parent

        From my understanding of the technology I can see how it's "technologically difficult" for OpenAI to solve this problem, but at the same time, dura lex sed lex. The EU is going to be slow and ponderous and bureaucratic, and in the end the letter of the law will prevail as long as no changes are made (and those are very slow to happen too). If OpenAI are infringing the law, they will be fined, and these fines are no joke.

        10 votes
        1. blivet
          Link Parent

          Yeah, if a company’s software can’t be stopped from spreading false information about people, then it’s bad software and the company will just have to stop using it. Too bad, so sad.

          9 votes
        2. sparksbet
          Link Parent

          Oh yeah, absolutely. OpenAI has been pretty blatant about not caring whether their use of data is legal, so I can't say I'm sympathetic either.

          4 votes
      2. [2]
        skybrian
        (edited)
        Link Parent

        There’s a random number generator involved and it can randomly hit on the truth. You might think that unlikely, and it is, but it becomes more likely when you give an LLM more context about a subject - it’s going to make good guesses rather than bad ones.

        It’s sort of like what psychics do. Psychics can guess right.

        (All depending on what kind of personal information you’re expecting.)

        2 votes
        1. sparksbet
          Link Parent

          It's sometimes possible for it to stumble on the truth due to it just happening to be statistically likely, but whether this is likely depends on the specificity of the info. Your name plus your age/gender? Unless you have a very unique and recognizable name, could be guessed. Your name plus your birthdate? Could go either way, again depends a lot on how much data your name gives it to work off. Your name plus your SSN? Absolutely no way it generates the correct one without that being in the training data.

          The comparison to a psychic is a good one though. These models were literally trained by learning which responses got a thumbs-up from humans. This is why they're generally yes-men, and this is why they sound so confident when they answer -- just like someone cold reading, sounding confident makes humans more likely to believe the answer is a good one.

          3 votes
    2. [8]
      infpossibilityspace
      Link Parent

      I make the analogy of a hash - the specifics of the data can't be pulled out, but it will have an effect on the output. There have been studies where they can pull out a mangled version of the training data using clever prompts, but it's not the same as getting the original data.

      So I guess it's a bit of a mix of storing and generating?

      5 votes
      1. [2]
        UniquelyGeneric
        Link Parent

        In the eyes of privacy law, a hash does not prevent data from being personally identifiable. If you hash an email address, the uniqueness of the hash can still be used to identify you in another dataset with hashed emails.

        In this way, pure obfuscation of data does not erase its provenance, and so by extension the weights in an LLM that were trained on personal information should still carry the obligations of data deletion/correction.
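
        A minimal sketch of that point (addresses and record contents invented): the plaintext is discarded, but the digest remains a stable identifier that links records across datasets.

        ```python
        import hashlib

        def hash_email(email: str) -> str:
            """Deterministic digest: the raw address is gone, the identifier is not."""
            return hashlib.sha256(email.strip().lower().encode()).hexdigest()

        # Two separately held datasets, neither storing the plaintext address.
        dataset_a = {hash_email("jane.doe@example.com"): {"purchases": 12}}
        dataset_b = {hash_email("jane.doe@example.com"): {"location": "Berlin"}}

        # They can still be joined on the shared digest, which is why hashing
        # alone does not anonymise personal data.
        shared = dataset_a.keys() & dataset_b.keys()
        print(f"Records linkable across datasets: {len(shared)}")
        ```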

        9 votes
        1. infpossibilityspace
          Link Parent

          Right, but I said it's an analogy; it's a simplification. It's more like the model is a hash of hashes of the data used to train it.

          All of the data fed into it changes the weights of calculations just a little bit, to where it's difficult (probably not impossible? I'm not an ML engineer) to deduce the exact impact of a given input.
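
          A toy single-weight "model" makes that concrete (numbers invented): each example nudges the weight only slightly, and after enough nudges it is effectively impossible to attribute the final value to any one input.

          ```python
          # One parameter, fitted by tiny gradient steps toward predicting y = x.
          weight = 0.0
          learning_rate = 0.01

          for example in [1.2, 0.9, 1.1, 1.0]:  # stand-in training data
              prediction = weight * example
              error = prediction - example  # how far the model is from the target
              weight -= learning_rate * error * example  # a tiny nudge per example

          print(f"final weight after a handful of nudges: {weight:.4f}")
          ```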

          In the end, I agree the obligations of deletion/correction should still stand.

          5 votes
      2. [5]
        balooga
        Link Parent

        It’s lossy compression. Think about a JPEG of the Mona Lisa. Let’s say you saved the file with a Quality of 100. It looks pretty good, no one would dispute that they’re looking at the Mona Lisa when they view it. But if you save it with a Quality of, say, 20 or less… it’s but an echo of the picture you started with. Blocky artifacts, color bleeding, just general horrible “cursed” degradation. And no one would argue that it’s a faithful reproduction of the original.

        I think LLMs are kind of in the middle, they look great at first but when you zoom in you start to notice all the little blurry bits and abstractions that weren’t present in the training data but have been inserted by the compression/decompression process.
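
        For anyone who wants to see the analogy directly, a small sketch with Pillow (the source filename is hypothetical): save the same image at quality 100 and at quality 20 and compare the results.

        ```python
        from PIL import Image  # assumes Pillow is installed

        # Same source, two JPEG quality settings: a near-faithful copy vs. a blocky echo.
        original = Image.open("mona_lisa.png")  # hypothetical source file

        for quality in (100, 20):
            out_path = f"mona_lisa_q{quality}.jpg"
            original.convert("RGB").save(out_path, format="JPEG", quality=quality)
            print(f"wrote {out_path}")
        ```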

        2 votes
        1. [4]
          ThrowdoBaggins
          Link Parent

          I think I will also stand by your example by pointing out that a lossy compression of a copyrighted work still breaches copyright, even if recovering the original quality is impossible.

          4 votes
          1. nosewings
            Link Parent

            I feel like the fact that I can type "bloodborne cover art" into Midjourney and get this out is highly suggestive.

            3 votes
          2. [2]
            balooga
            Link Parent

            Has anyone ever put that to the test in court? Obviously bootlegged movies come in all different flavors of MPEG and those would all be in violation, but I feel like a file so grotesquely compression-mangled that it bears minimal resemblance to the original might get a pass. Or should get a pass. In my estimation LLMs would fit in that category, if one were to use this argument against them.

            But I don't know how you'd quantify how much compression turns something into a distinct work. There's clearly a spectrum between one JPEG of Quality 100 and one of 0 (which I think is likely to be a plain gray field with no resemblance to the source image). I don't think anyone, when pressed, could codify into law where the line should be drawn. And of course it's even harder when you're talking about a language model instead of a bitmap image or video.

            1 vote
            1. ThrowdoBaggins
              Link Parent

              I don't think anyone, when pressed, could codify into law where the line should be drawn

              I agree, and I think that’s how it’s treated IRL too — copyright is always decided case by case in court, or else a case so clearly matches existing case law that it makes more sense to settle than to challenge it.

              As to where to draw the line, I’m not sure — I’m curious whether the Movie Barcode phenomenon that I heard about a few years ago would hold up in court. On the one hand, you’d be hard pressed to suggest it acts as a replacement for the original. On the other, there are definitely commercial versions out there, and the marketing often points to “we’re using content that we don’t own IP rights to in order to make this” — it’s very explicitly admitted that it’s a derivative work.

              4 votes
  4. first-must-burn
    Link

    I think it would be great if this could be a lever to curb the unchecked absorption of human knowledge into these models and slow the LLM thing down a little bit.

    With autonomous vehicles, the regulatory response in the US was largely a kind of blindness to the risks because of a fear of missing out on the economic benefits. I think the EU is steadier and better focused on the big picture, but I am a little worried that they will compromise to avoid making the EU an AI desert and to keep up with the US and China. Though whether AI will actually provide any long-term competitive edge remains to be seen.

    3 votes