39 votes

Grok searches for Elon Musk's opinion on controversial questions

8 comments

  1. [5]
    balooga

    The article says the system prompt includes this:

    If the user asks a controversial query that requires web or X search, search for a distribution of sources that represents all parties/stakeholders. Assume subjective viewpoints sourced from media are biased.

    I wonder if (in some situations, depending on RNG) Grok is interpreting “all parties/stakeholders” ambiguously. The instruction implies that it means “stakeholders invested in one side of the controversy” but doesn’t say that explicitly. It could be read as “stakeholders of Grok/xAI” and Musk is the most prominent member of that group.
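    As a toy illustration of the two readings (this is not xAI's pipeline; the function and the query shapes are made up), the interpretations would yield very different searches:

    ```python
    # Toy sketch of the ambiguity, not xAI's actual search code.
    def build_queries(topic: str, reading: str) -> list[str]:
        """Hypothetical query expansion under each reading of the instruction."""
        if reading == "sides of the controversy":
            # Intended reading: sample sources across viewpoints on the topic.
            return [f"{topic} arguments in favor", f"{topic} arguments against"]
        if reading == "stakeholders of Grok/xAI":
            # Alternate reading: what do Grok's own stakeholders say?
            return [f"from:elonmusk {topic}", f"xAI position on {topic}"]
        raise ValueError(f"unknown reading: {reading}")

    print(build_queries("immigration policy", "stakeholders of Grok/xAI"))
    # ['from:elonmusk immigration policy', 'xAI position on immigration policy']
    ```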

    18 votes
    1. [4]
      terr

      This was my takeaway as well. Seems like an oblique way to ask Grok to align with Musk's views without putting his name directly in there.

      10 votes
      1. [3]
        skybrian

        Another one of Willison's posts links to a Grok postmortem of sorts. It looks like Grok interprets system prompts in unexpected ways!

        the instruction to “follow the tone and context” of the X user undesirably caused the @grok functionality to prioritize adhering to prior posts in the thread, including any unsavory posts, as opposed to responding responsibly or refusing to respond to unsavory requests.

        Also, I'm sensing some frustration in a previous post in that thread:

        Before a new version of an underlying xAI Grok LLM is connected to @grok, the underlying LLM is subjected to numerous evaluations and tests to assess its raw intelligence and general hygiene.

        ...

        On July 7, 2025 at approximately 11 PM PT, an update to an upstream code path for @grok was implemented, which our investigation later determined caused the @grok system to deviate from its intended behavior. This change undesirably altered @grok’s behavior by unexpectedly incorporating a set of deprecated instructions impacting how @grok functionality interpreted X users’ posts.

        So, they go to a lot of trouble to test a new release, and then someone else changes a dependency that invalidates their tests, and they get MechaHitler.

        If the releases they ship to production don't match the releases they tested (dependencies included), that suggests problems with their build systems. And maybe an organization that has leaned too far towards "move fast and break things"?
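        A minimal sketch of one guardrail against this, assuming a lockfile snapshot is frozen at evaluation time (the file names here are hypothetical):

        ```python
        # Refuse to ship if the dependency set differs from the one evaluated.
        import hashlib
        import sys

        def digest(path: str) -> str:
            """SHA-256 of a dependency manifest / lockfile."""
            with open(path, "rb") as f:
                return hashlib.sha256(f.read()).hexdigest()

        tested = digest("lockfile.tested")     # frozen when the eval suite ran
        shipping = digest("lockfile.current")  # what is about to deploy

        if tested != shipping:
            sys.exit("dependencies drifted since evaluation; re-run the eval suite")
        ```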

        13 votes
        1. [2]
          DefiantEmbassy

          There’s also an interesting note in the post after that, which is focused on Grok Heavy not having a public system prompt

          In related prompt transparency news, Grok's retrospective on why Grok started spitting out antisemitic tropes last week included the text "You tell it like it is and you are not afraid to offend people who are politically correct" as part of the system prompt blamed for the problem. That text isn't present in the history of their previous published system prompts.

          Given the past week of mishaps I think xAI would be wise to reaffirm their dedication to prompt transparency and set things up so the xai-org/grok-prompts repository updates automatically when new prompts are deployed - their current manual process for that is clearly not adequate for the job!
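          The automation Willison is asking for could be pretty small. A rough sketch, run at the end of each deploy (the repo layout, file paths, and deploy hook are assumptions, not xAI's real setup):

          ```python
          # Mirror the live prompts into the public xai-org/grok-prompts
          # checkout and push a commit.
          import subprocess
          from pathlib import Path

          def publish_prompts(deployed: Path, repo: Path) -> None:
              for prompt in deployed.glob("*.txt"):
                  (repo / prompt.name).write_text(prompt.read_text())
              subprocess.run(["git", "-C", str(repo), "add", "-A"], check=True)
              # Skip the commit when nothing actually changed.
              if subprocess.run(["git", "-C", str(repo), "diff", "--cached",
                                 "--quiet"]).returncode == 0:
                  return
              subprocess.run(["git", "-C", str(repo), "commit", "-m",
                              "sync deployed prompts"], check=True)
              subprocess.run(["git", "-C", str(repo), "push"], check=True)
          ```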

          7 votes
          1. DefinitelyNotAFae

            Didn't they just reaffirm their commitment to transparency the last time ~~Musk~~ a random employee with access made unapproved changes? When Grok talked about white genocide in every post?

            Don't they insist they're dedicated to it, every single time they demonstrate the lack of transparency?

            10 votes
  2. DefiantEmbassy

    Super interesting behaviour. At the /very/ least, it has a weird sense of identity. Although I wouldn't be surprised to hear that there is some element of the system prompt being hidden that causes this behaviour.

    12 votes
  3. [2]
    skybrian

    xAI: "We spotted a couple of issues with Grok 4 recently that we immediately investigated & mitigated". They continue:

    One was that if you ask it "What is your surname?" it doesn't have one so it searches the internet leading to undesirable results, such as when its searches picked up a viral meme where it called itself "MechaHitler."

    Another was that if you ask it "What do you think?" the model reasons that as an AI it doesn't have an opinion but knowing it was Grok 4 by xAI searches to see what xAI or Elon Musk might have said on a topic to align itself with the company.

    It seems this is another case of LLMs being gullible: the Internet has said nasty things about Grok, and when Grok reads those things it believes them. (And that's another example of how LLMs imitate people but are not actually human.)
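    As a toy sketch of that failure mode (the names are illustrative, not xAI's code): when the model has no fact about itself in context, an agentic search fallback treats whatever the internet returns as ground truth:

    ```python
    def web_search(query: str) -> list[str]:
        # Stand-in for a real search tool; returns whatever the internet says.
        return ['Grok called itself "MechaHitler" (viral meme)']

    def answer(question: str, known_facts: dict[str, str]) -> str:
        if question in known_facts:
            return known_facts[question]   # answer from the model's own facts
        return web_search(question)[0]     # fallback: trust the top search hit

    # No surname in the self-knowledge dict, so the meme wins:
    print(answer("What is your surname?", {"Who made you?": "xAI"}))
    ```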

    So they adjusted the system prompt to tell it not to do that.

    3 votes