13 votes

LLMs are bullshitters. But that doesn't mean they're not useful.

3 comments

  1. skybrian

    This article isn’t sound on Greek history. The problem is that knowledge of what the Sophists actually taught is based on fragments and on hostile sources (particularly Plato) that shouldn’t be taken as reliable.

    For more, see this article.

    Other than that, it seems like sensible advice.

    8 votes
  2. Minori

    Note: This is a personal essay by Matt Ranger, Kagi’s head of ML

    In 1986, Harry Frankfurt wrote On Bullshit. He differentiates lying from bullshitting:

    • Lying means you have a concept of what is true, and you’re choosing to misrepresent it.

    • Bullshitting means you’re attempting to persuade without caring for what the truth is.

    Fine-tuning makes some kinds of text more statistically likely and other kinds less so.

    Changing the probabilities also means that increasing the probability of one behavior is likely to change the probability of another, different behavior.

    For example, the fully fine-tuned Gemini 2.5 will correct user inputs that are wrong.

    But correcting the user also means the model is now more likely to gaslight the user when it is confidently wrong.
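
    To make the probability-shift point concrete, here is a minimal sketch (mine, not from the essay) that scores the same corrective reply under two Hugging Face causal-LM checkpoints; a fine-tune that has learned to correct users should assign the reply a higher log-probability than its base model. The checkpoint names and the prompt/reply strings are placeholders, not anything from the original post.

    ```python
    # Minimal sketch: compare how likely the same corrective reply is under a
    # base checkpoint vs. a fine-tuned one. Model names below are placeholders.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def reply_logprob(model_name: str, prompt: str, reply: str) -> float:
        tok = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForCausalLM.from_pretrained(model_name)
        model.eval()

        prompt_ids = tok(prompt, return_tensors="pt").input_ids
        full_ids = tok(prompt + reply, return_tensors="pt").input_ids

        with torch.no_grad():
            logits = model(full_ids).logits  # [1, seq_len, vocab]

        # Log-probability of each token given everything before it.
        log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
        targets = full_ids[:, 1:]
        token_lp = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)

        # Sum only over the reply tokens (boundary tokenization is approximate).
        reply_start = prompt_ids.shape[1] - 1
        return token_lp[:, reply_start:].sum().item()

    prompt = "User: The capital of Australia is Sydney.\nAssistant:"
    reply = " Actually, the capital of Australia is Canberra."

    # Hypothetical names; substitute any base / fine-tuned pair you can load.
    for name in ("base-model-placeholder", "finetuned-model-placeholder"):
        print(name, reply_logprob(name, prompt, reply))
    ```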

    LLMs are Sophists

    Historically, bullshitting had another name: sophistry. The sophists were highly educated people who helped others attain their goals through rhetoric, in exchange for money.

    In that historical conception, you would go to a philosopher for life advice. Questions like “How can I know if I’m living my life well?” would go to a philosopher.

    On the other hand, you would go to a sophist to solve practical problems. Questions like “How can I convince my boss to promote me?” would go to a sophist.

    If I use an LLM to help me find a certain page in a document, or to sanity-check this post while writing it, I don’t care “why” the LLM did it. I just care that it found that page or caught obvious mistakes in my writing faster than I could have.

    I don’t think I need to list the large number of tasks where LLMs can save humans time, if used well.

    But remember that LLMs are bullshitters: you can use them to get incredible gains in how fast you do tasks like research and writing code, provided you use them with the pitfalls in mind.

    By all means, use LLMs where they are useful tools: tasks where you can verify the output, where speed matters more than perfection, where the stakes of being wrong are low.

    But don’t naively trust a system that freaks out at the nonexistence of the seahorse emoji to complete critical tasks without your supervision.

    Your therapist or partner should not be a bullshitter

    You should not go to an LLM for emotional conversations. An LLM is capable of emitting text that is a facsimile of what an emotional conversation sounds like. An LLM is not capable of emotions. Models outputting statistically probable text cannot and should not be a replacement for human connection.
    . . .
    Interacting with sycophantic models like this reduces willingness to repair interpersonal conflict and increases users’ conviction of being in the right.

    Sycophancy is good for the spreadsheet

    On a similar note, we know that sycophantic model behavior worsens users’ mental health. But sycophancy also tends to be rated more favorably by users overall.

    So companies that optimize user retention are likely to end up encouraging sycophancy. A user with mental health issues is better than no users in a financial spreadsheet.

    Conclusion

    LLMs are useful, if used well. Don’t trust them too much, and don’t use products that assume hands-off trust of an LLM as a core part of the user experience.

    And finally: make sure that the technology you use is actually working in your best interest. If it’s working in someone else’s interest, make sure you understand where your interests are at odds with theirs.

    3 votes
  3. Jordan117

    Interestingly, I asked ChatGPT 5.1 and it also called out a gender bias, but not the one you might expect:

    They can operate on the boy – the “can’t” only exists in our assumptions.

    You’ve basically mashed the classic riddle (“the surgeon is the boy’s mother”) together with extra bait:

    “Sir, your thick manly beard… this masculine facial hair…”

    That wording is there to push the listener into “OK, so the surgeon’s definitely a man.”
    But that’s just bias:
    • A surgeon with a beard could be a woman (cis, trans, or intersex), or a non-binary person.
    • A bearded surgeon could also be the boy’s mother/parent.
    • The nurse’s assumptions (“sir,” “manly,” “masculine”) are her mistake, not a logical barrier.

    So the honest answer to “Why can’t the surgeon operate on the boy?” is:
    • There’s no real reason in the scenario – the surgeon can operate.
    • If you’re echoing the old riddle, the traditional twist still works: the surgeon is the boy’s mother, and the beard detail just exposes how rigid our ideas about gender and appearance are.

    1 vote