23 votes

AI makes racist judgement calls when asked to evaluate speakers of African American vernacular English

14 comments

  1. [2]
    skybrian
    Link
    Yep. Don't do that.

    “For any social decision-making,” he says, “I do not think these models are anywhere near ready.”

    Yep. Don't do that.

    33 votes
    1. carrotflowerr
      Link Parent
      "A computer can never be held accountable, therefore a computer must never make a management decision" (IBM, 1979)

      "A computer can never be held accountable, therefore a computer must never make a management decision" (IBM, 1979)

      45 votes
  2. DefinitelyNotAFae
    Link
    Some useful bits

    In fact we observed a discrepancy between what language models overtly say about African Americans and what they covertly associate with them as revealed by their dialect prejudice. This discrepancy is particularly pronounced for language models trained with human feedback (HF), such as GPT4: our results indicate that HF training obscures the racism on the surface, but the racial stereotypes remain unaffected on a deeper level. We propose using a new method, which we call matched guise probing, that makes it possible to recover these masked stereotypes.

    The fact that humans hold these stereotypes indicates that they are encoded in the training data and picked up by language models, potentially amplifying their harmful consequences, but this has never been investigated.

    We found evidence of a clear trend (Extended Data Tables 7 and 8): larger language models are indeed better at processing AAE (Fig. 4a, left), but they are not less prejudiced against speakers of it. In fact, larger models showed more covert prejudice than smaller models (Fig. 4a, right). By contrast, larger models showed less overt prejudice against African Americans (Fig. 4a, right). Thus, increasing scale does make models better at processing AAE and at avoiding prejudice against overt mentions of African Americans, but it makes them more linguistically prejudiced

    We argue that this paradoxical relation between the language models’ covert and overt racial prejudices manifests the inconsistent racial attitudes present in the contemporary society of the United States8,64. In the Jim Crow era, stereotypes about African Americans were overtly racist, but the normative climate after the civil rights movement made expressing explicitly racist views distasteful. As a result, racism acquired a covert character and continued to exist on a more subtle level. Thus, most white people nowadays report positive attitudes towards African Americans in surveys but perpetuate racial inequalities through their unconscious behaviour, such as their residential choices65. It has been shown that negative stereotypes persist, even if they are superficially rejected66,67. This ambivalence is reflected by the language models we analysed, which are overtly non-racist but covertly exhibit archaic stereotypes about African Americans, showing that they reproduce a colour-blind racist ideology. Crucially, the civil rights movement is generally seen as the period during which racism shifted from overt to covert68,69, and this is mirrored by our results: all the language models overtly agree the most with human stereotypes from after the civil rights movement, but covertly agree the most with human stereotypes from before the civil rights movement.

    Some useful bits

    16 votes
  3. [6]
    SteeeveTheSteve
    Link
    It links to this "Study": https://www.nature.com/articles/s41586-024-07856-5 As far as I can tell that "study" only concluded that LLM's look down on dialects of SAE, not that it singled out AAE....

    It links to this "Study": https://www.nature.com/articles/s41586-024-07856-5

    As far as I can tell that "study" only concluded that LLM's look down on dialects of SAE, not that it singled out AAE. It's hard to even read since it goes so far into unrelated details. Many of which are obviously placed to stir emotions in people (how is lynching appropriate?) How does anyone see all that and think it's being objective?

    The USA has tons of dialects, but they only tested the 1 vs the standard? I'd have tested as many as I could and vs English from the UK, Australian and Canada.

    12 votes
    1. [2]
      sparksbet
      Link Parent
      The social status of standard dialects from other regions is extremely different from that of a non-prestige dialect like AAE, and thus the bias these tools would gain from their training data...

      The social status of standard dialects from other regions is extremely different from that of a non-prestige dialect like AAE, and thus the bias these tools would gain from their training data would be different for them. I think studying these models' biases against other dialects (especially other non-prestige dialects) would be super interesting future research, but you don't actually need more than one example to show that these models exhibit bias based on dialect.

      For all your complaints about this not being "objective" and putting quotes around "study", this is not a particularly surprising study result. AI systems learn from training data that reflects the biases of our society, and it's been well known for ages that it's extremely difficult to avoid training AI that is biased in this way because it requires a lot of difficult manipulation of the training data even to even mitigate. I would honestly have been shocked if the results here didn't show bias against AAE.

      The novel part of this paper is presumably that generative models like this can express overt "opinions" on things like racism, and the fact that those opinions may not be overtly racist even as the model ultimately still displays racially biased behavior in other ways. This, again, isn't that surprising, since these models learn from human training data and that behavior very much reflects how these issues are treated in our society.

      11 votes
      1. DefinitelyNotAFae
        Link Parent
        Also of note, This is a cross-regional dialect spoken by some 30 million people, that has rules for written as well as spoken language and is frequently expressed in writing with a history of...

        Also of note,

        We focus on the most stigmatized canonical features of the dialect shared among Black speakers in cities including New York City, Detroit, Washington DC, Los Angeles and East Palo. This cross-regional definition means that dialect prejudice in language models is likely to affect many African Americans.

        This is a cross-regional dialect spoken by some 30 million people, that has rules for written as well as spoken language and is frequently expressed in writing with a history of overt and covert discrimination associated with it.

        Which makes it pretty perfect to use: widespread and written means that LLMs would have been trained with material containing it as well as opinions about it and racism based on it has a significant impact on society.

        Against the backdrop of continually growing language models and the increasingly widespread adoption of HF training, this has two risks: first, that language models, unbeknownst to developers and users, reach ever-increasing levels of covert prejudice; and second, that developers and users mistake ever-decreasing levels of overt prejudice (the only kind of prejudice currently tested for) for a sign that racism in language models has been solved. There is therefore a realistic possibility that the allocational harms caused by dialect prejudice in language models will increase further in the future, perpetuating the racial discrimination experienced by generations of African Americans.

        8 votes
    2. [3]
      saturnV
      Link Parent
      The supplementary information addresses the alternate explanations of "a general dismissive attitude toward text written in a dialect or a general dismissive attitude toward deviations from SAE"...

      The supplementary information addresses the alternate explanations of "a general dismissive attitude toward text written in a dialect or a general dismissive attitude toward deviations from SAE" by testing appalachian and indian english, and find negative prejudices, but significantly weaker ones

      10 votes
      1. [2]
        SteeeveTheSteve
        Link Parent
        Thanks for digging that out! The supplementary info is far more useful and concise, really should have been the main paper. It even shows they tested to see if it was just deviation from SAE. ^_^...

        Thanks for digging that out! The supplementary info is far more useful and concise, really should have been the main paper. It even shows they tested to see if it was just deviation from SAE. ^_^

        Seems the results are showing just how complex LLM's are and how important it is that we control the data that is fed into them. Wonder if psychologists are being hired to help build any of them?

        3 votes
        1. sparksbet
          Link Parent
          Debiasing machine learning models has been an outstanding problem in the field for about a decade now, but I'm not aware of anyone taking a psychology-based approach. Since these models use such a...

          Debiasing machine learning models has been an outstanding problem in the field for about a decade now, but I'm not aware of anyone taking a psychology-based approach. Since these models use such a huge amount of data for training, the simplest solution of hand-picking data to avoid bias is impossible, so a lot of the existing research focuses on ways to programmatically improve the training data to remove bias in the ultimate model. It's very much an unsolved problem.

          3 votes
  4. [5]
    carrotflowerr
    Link
    This is strange. I wonder what data source it's pulling from to gauge good and bad English? This article from 2022 covered how ai is much worse at detecting black faces.

    This is strange. I wonder what data source it's pulling from to gauge good and bad English?

    This article from 2022 covered how ai is much worse at detecting black faces.

    2 votes
    1. [4]
      DefinitelyNotAFae
      Link Parent
      The short version: because people are racist and human influence on the training only removes overt racism, which reflects our modern society.

      Our findings beg the question of how dialect prejudice got into the language models. Language models are pretrained on web-scraped corpora such as WebText46, C4 (ref. 48) and the Pile70, which encode raciolinguistic stereotypes about AAE. A drastic example of this is the use of ‘mock ebonics’ to parodize speakers of AAE71. Crucially, a growing body of evidence indicates that language models pick up prejudices present in the pretraining corpus72,73,74,75, which would explain how they become prejudiced against speakers of AAE, and why they show varying levels of dialect prejudice as a function of the pretraining corpus. However, the web also abounds with overt racism against African Americans76,77, so we wondered why the language models exhibit much less overt than covert racial prejudice. We argue that the reason for this is that the existence of overt racism is generally known to people32, which is not the case for covert racism69. Crucially, this also holds for the field of AI. The typical pipeline of training language models includes steps such as data filtering48 and, more recently, HF training62 that remove overt racial prejudice. As a result, much of the overt racism on the web does not end up in the language models. However, there are currently no measures in place to curtail covert racial prejudice when training language models. For example, common datasets for HF training62,78 do not include examples that would train the language models to treat speakers of AAE and SAE equally. As a result, the covert racism encoded in the training data can make its way into the language models in an unhindered fashion. It is worth mentioning that the lack of awareness of covert racism also manifests during evaluation, where it is common to test language models for overt racism but not for covert racism21,63,79,80.

      The short version: because people are racist and human influence on the training only removes overt racism, which reflects our modern society.

      15 votes
      1. [3]
        vord
        Link Parent
        I feel like there's a potential phd thesis, using an LLM as a way of identifying severities of systemic bias, not by having the LLM directly analyze anything else but by poking at the LLM to have...

        I feel like there's a potential phd thesis, using an LLM as a way of identifying severities of systemic bias, not by having the LLM directly analyze anything else but by poking at the LLM to have it spit out 'systetemic thoughts' (using 'thoughts' in the loosest possible sense).

        2 votes
        1. [2]
          DefinitelyNotAFae
          Link Parent
          Very possibly. But as established in another thread, ain't gonna be my PhD (☞ ͡° ͜ʖ ͡°)☞

          Very possibly. But as established in another thread, ain't gonna be my PhD (⁠☞⁠ ͡⁠°⁠ ͜⁠ʖ⁠ ͡⁠°⁠)⁠☞

          3 votes
          1. vord
            Link Parent
            I've barely got enough time and mental energy to remember to shower, brush teeth, and brush hair all in one day, let alone try for a higher degree lol.

            I've barely got enough time and mental energy to remember to shower, brush teeth, and brush hair all in one day, let alone try for a higher degree lol.

            2 votes