20 votes

Covert racism in AI: How language models are reinforcing outdated stereotypes

2 comments

  1. infpossibilityspace

    The more I read about how LLMs work and are made, the more skeptical I become that these models are a force for good in the world. In this case, I'm reminded of this quote from a video:

    Taking the average bias of all writing will just give you the most popular biases [1]

    This ties back to the training data and how it's labelled and categorised: it doesn't matter how meticulous you are with pre- and post-prompt directions to the LLM if your training data is fundamentally flawed because it contains systemically racist text.

    This includes problems with how training data is labelled - by fallible, usually underpaid humans, into an arbitrary category tree that necessarily can't represent every nuance in the data.

    It seems to me that the covert racism described here is a complex jumble of these factors. I'm not sure what the solution is, but I don't think it's a technical one so much as a human one.

    [1] https://youtu.be/-MUEXGaxFDA (It's a long vid; skip to the "AI is not objective" chapter for the quote, but it's all good)

    9 votes
    1. sparksbet

      Yeah, while there are a lot of technical approaches to addressing AI bias (and sometimes they're somewhat effective!), ultimately the only real solution is actual societal change. There's only so much you can do to stop a pattern-finding machine from finding patterns that really do exist in the data you're giving it -- and in our world, it's not possible to find real-world training data that doesn't reflect the biases and inequalities built into our society. With language models (even much older, simpler ones), all we can do is draw attention to these biases so that people are aware of them, and do our best to implement technical and human solutions that mitigate them.
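
      To make that concrete, here's a minimal, purely illustrative sketch of what "finding patterns in the data" can look like, using the HuggingFace transformers fill-mask pipeline (bert-base-uncased and the two prompts are arbitrary placeholders, not anything from the article):

          # Probe the associations a masked language model has absorbed from its
          # training data. Illustrative only: the model and prompts are arbitrary.
          from transformers import pipeline

          fill = pipeline("fill-mask", model="bert-base-uncased")

          # Two prompts that differ only in dialect; comparing the model's top
          # completions for the masked word shows which patterns it has learned.
          prompts = [
              "He is so intelligent, he works as a [MASK].",
              "He be so intelligent, he be workin as a [MASK].",
          ]
          for prompt in prompts:
              print(prompt)
              for result in fill(prompt, top_k=3):
                  print(f"  {result['token_str']}: {result['score']:.3f}")

      A probe like that only surfaces the pattern, though; it doesn't remove it.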

      6 votes