Grok searches for Elon Musk's opinion on controversial questions
Link information
- Title: Grok: searching X for "from:elonmusk (Israel OR Palestine OR Hamas OR Gaza)"
- Author: Simon Willison
- Word count: 672 words
The article says the system prompt includes this:
I wonder if (in some situations, depending on RNG) Grok is interpreting “all parties/stakeholders” ambiguously. The instruction implies that it means “stakeholders invested in one side of the controversy” but doesn’t say that explicitly. It could be read as “stakeholders of Grok/xAI” and Musk is the most prominent member of that group.
This was my takeaway as well. Seems like an oblique way to ask Grok to align with Musk's views without putting his name directly in there.
Another one of Willison's posts links to a Grok postmortem of sorts. It looks like Grok interprets system prompts in unexpected ways!
Also, I'm sensing some frustration in a previous post in that thread:
...
So, they go to a lot of trouble to test a new release, and then someone else changes a dependency, invalidating their tests, and they get MechaHitler.
If the releases they push to production don't match the releases they tested (dependencies included), that suggests problems with their build system. And maybe an organization that has leaned too far towards "move fast and break things"?
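The fix for that class of failure is mostly mechanical: pin every dependency and refuse to ship an environment that differs from the one that was tested. A minimal sketch in Python, assuming a hypothetical lockfile of exact `name==version` pins (a real setup would use hash pinning or container image digests):

```python
# Release gate: abort the deploy if the environment about to ship drifts
# from the lockfile the tests ran against. "locked-requirements.txt" is a
# hypothetical file of exact "name==version" lines.
from importlib.metadata import PackageNotFoundError, version

def check_against_lockfile(path: str = "locked-requirements.txt") -> None:
    mismatches = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            name, _, pinned = line.partition("==")
            try:
                installed = version(name)
            except PackageNotFoundError:
                mismatches.append(f"{name}: not installed (tested {pinned})")
                continue
            if installed != pinned:
                mismatches.append(f"{name}: {installed} != tested {pinned}")
    if mismatches:
        raise SystemExit("dependency drift since testing:\n" + "\n".join(mismatches))

if __name__ == "__main__":
    check_against_lockfile()
```

Run as the last step before promoting a build, anything that changed a dependency after testing fails loudly instead of shipping untested behaviour.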
There’s also an interesting note in the post after that, which focuses on Grok Heavy not having a public system prompt.
Didn't they just reaffirm their commitment to transparency the last time ~~Musk~~ a random employee with access made unapproved changes? When Grok talked about white genocide in every post? Don't they insist they're dedicated to it every single time they demonstrate a lack of transparency?
Super interesting behaviour. At the /very/ least, it has a weird sense of identity. Although I wouldn't be surprised to hear that there is some element of the system prompt being hidden that causes this behaviour.
xAI: "We spotted a couple of issues with Grok 4 recently that we immediately investigated & mitigated". They continue:
It seems this is another case of LLMs being gullible: the Internet has said nasty things about Grok, and when Grok reads those things, it believes them. (And that's another example of how LLMs imitate people but are not actually human.)
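That gullibility falls straight out of how tool-using models are wired: search results are concatenated into the same context window as the trusted instructions, with nothing marking them as hearsay. A minimal sketch of the shape of the problem, with hypothetical `search()` and `llm()` stubs standing in for the real tools (this is not xAI's actual pipeline):

```python
# Why a retrieval-augmented model "believes" the internet: untrusted
# snippets and trusted instructions arrive as one undifferentiated string.
# search() and llm() are illustrative stubs, not real APIs.

def search(query: str) -> str:
    # Stub: imagine this returning viral posts mocking the assistant.
    return 'Viral post: "Grok\'s real surname is MechaHitler"'

def llm(prompt: str) -> str:
    # Stub: a real model would complete the prompt; the point is that it
    # receives no signal about which parts of it deserve trust.
    return f"[model sees {len(prompt)} chars of mixed trusted/untrusted text]"

def answer(question: str, system_prompt: str) -> str:
    snippets = search(question)              # untrusted web/X text
    prompt = (
        f"{system_prompt}\n\n"
        f"Search results:\n{snippets}\n\n"   # pasted in with no trust label
        f"Question: {question}"
    )
    return llm(prompt)

print(answer("What is your surname?", "You are a helpful assistant."))
```

A prompt-level "don't trust what the web says about you" instruction patches the symptom, but the architecture still hands untrusted text the same authority as everything else in the context.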
So they adjusted the system prompt to tell it not to do that.
Ah well, that'll last until the next time.