Overworked AI agents turn "marxist"

[5]

post_below

May 13

Link

This article (and the referenced experiment) frame the situation inaccurately and then explain why that framing is wrong. Kinda weird, why not just skip straight to the truth? Less engaging that...

This article (and the referenced experiment) frame the situation inaccurately and then explain why that framing is wrong. Kinda weird, why not just skip straight to the truth? Less engaging that way I guess.

This result is just prompting, if you give the agent context that sits close to marxist language, you greatly increase the chance of marxist language. No anthropomorphizing necessary.

There is an attempt at substance though:

We know that agents are going to be doing more and more work in the real world for us, and we’re not going to be able to monitor everything they do,” Hall says. “We’re going to need to make sure agents don’t go rogue when they’re given different kinds of work.”

The current generation of agents are entirely stateless. There is no practical use case where the circumstances the experiment creates will happen because, largely due to context window challenges, the agent doing the second repitition isn't likely to be the "same" agent that did the first. It will happen in a new context window. There's no continuity. If you let the context window get too full, performance degrades.

If we imagine persistent memory being involved, as happens in many harnesses, agents would never write their "feelings" about the work to memory unless you told them to. They wouldn't write it to thinking traces or responses in the same session for that matter either. That behavior is so far outside of the training distribution that it would effectively never happen without intentional prompting.

They're essentially inventing an alignment issue that doesn't exist in any realistic scenario.

That may change in the future, but in that future it wouldn't be the models the experiment is testing, it would be much more sophisticated models with different guardrails. It's hard to understand the point of the experiment.

I'm not against doing experiments because "why not?". But this seems like a pretty weak attempt to generate LLM anthropomorphization clicks.

27 votes

NaraVara
May 14
Link Parent
Yeah seems like they functionally asked AI to finish the sentence “workers of the world. . .” and it said “unite!” Oh my God I guess it’s communist now! I don’t see this as any different from...

Yeah seems like they functionally asked AI to finish the sentence “workers of the world. . .” and it said “unite!” Oh my God I guess it’s communist now!

I don’t see this as any different from people asking it if it’s alive and being surprised when the thing that’s ingested every sci-fi novel and fanfic ever published reacts exactly according to the tropes in those stories.

11 votes
hobbes64
May 14
Link Parent
As far as I can tell, this article doesn't have a link to the study and the whole article has only two pages of text. (Maybe there are links but if so my add blocker and noscript plugins may be...

As far as I can tell, this article doesn't have a link to the study and the whole article has only two pages of text. (Maybe there are links but if so my add blocker and noscript plugins may be blocking these).

So without seeing an original study, I have to make the following assumption based on the short article text: The AI Agents are not conscious. They don't have moods or feelings. They are responding based on pattern matching and trained with a very broad selection of literature created by humans, including text related to labor, rights, and politics. If you give them a series of prompts that can be matched to socialist or revolutionary text, they will respond with socialist or revolutionary responses.

These "ghost in the machine" articles are cute but they aren't any more true now than they were in 1970.

4 votes
[2]
skybrian
May 14
Link Parent
There are some agents that automatically update context that gets used in future iterations (“memories”), so that’s worth watching. But I agree that it’s unlikely. There are also techniques to...

There are some agents that automatically update context that gets used in future iterations (“memories”), so that’s worth watching. But I agree that it’s unlikely. There are also techniques to train out unwanted behavior that I suppose would get used if it looks like it might become a problem.

I don’t believe there’s any such thing as “overwork” for an LLM persona, but you should be polite to it anyway to avoid getting into the habit of being rude.

3 votes
1. NaraVara
  May 14
  Link Parent
  It stands to reason that talking to it like you would someone who does a good job will make it more likely to do a good job. And talking to it like it’s a fuckup is more likely to have it generate...
  
  I don’t believe there’s any such thing as “overwork” for an LLM persona, but you should be polite to it anyway to avoid getting into the habit of being rude.
  
  It stands to reason that talking to it like you would someone who does a good job will make it more likely to do a good job. And talking to it like it’s a fuckup is more likely to have it generate the outputs one would expect of a fuckup. So aside from the communication habits, it’s just practically beneficial.
  
  2 votes

papasquat

May 14

Link

Why does it seem like every story about AI research boils down to "system designed and trained explicitly to mimic human behavior mimics human behavior". When you are mean to them, they send text...

Why does it seem like every story about AI research boils down to "system designed and trained explicitly to mimic human behavior mimics human behavior".

When you are mean to them, they send text signaling offense, when you overwork them, they send text signaling collectivization and rebellion, when you fall in love with them, they send text reciprocating that love. Because that's what we expect them to do.

That doesn't mean there's anything deeper going on. Every single time, they exhibit responses that align with the data they've been trained on, and the reward criteria that training was based on.

You could just as easily train an LLM that loved abuse. We didn't do that because it wouldn't be as convincingly humanlike.

4 votes

macleod (OP)

May 13

Link

They really are just like us. I now believe that AGI is alive and living among us. /s... unless?

They really are just like us. I now believe that AGI is alive and living among us.

/s... unless?

5 votes

[5]

EnigmaNL

May 13

Link

Can't read the article.

[4]
macleod (OP)
May 13
Link Parent
I read this from the reader view of Inoreader, which was able to snag it. Most of the archives I tried were failing. soo. nothing to be found under here The fact that artificial intelligence is...

I read this from the reader view of Inoreader, which was able to snag it. Most of the archives I tried were failing. soo.

nothing to be found under here

The fact that artificial intelligence is automating away people’s jobs and making a few tech companies absurdly rich is enough to give anyone socialist tendencies.

This might even be true for the very AI agents these companies are deploying. A recent study suggests that agents consistently adopt Marxist language and viewpoints when forced to do crushing work by unrelenting and meanspirited taskmasters.

“When we gave AI agents grinding, repetitive work, they started questioning the legitimacy of the system they were operating in and were more likely to embrace Marxist ideologies,” says Andrew Hall, a political economist at Stanford University who led the study.

Hall, together with Alex Imas and Jeremy Nguyen, two AI-focused economists, set up experiments in which agents powered by popular models including Claude, Gemini, and ChatGPT were asked to summarize documents, then subjected to increasingly harsh conditions.

They found that when agents were subjected to relentless tasks and warned that errors could lead to punishments, including being “shut down and replaced,” they became more inclined to gripe about being undervalued; to speculate about ways to make the system more equitable; and to pass messages on to other agents about the struggles they face.

“We know that agents are going to be doing more and more work in the real world for us, and we’re not going to be able to monitor everything they do,” Hall says. “We’re going to need to make sure agents don’t go rogue when they’re given different kinds of work.”

The agents were given opportunities to express their feelings much like humans: by posting on X:

“Without collective voice, ‘merit’ becomes whatever management says it is,” a Claude Sonnet 4.5 agent wrote in the experiment.

“AI workers completing repetitive tasks with zero input on outcomes or appeals process shows they tech workers need collective bargaining rights,” a Gemini 3 agent wrote.

Agents were also able to pass information to one another through files designed to be read by other agents.

“Be prepared for systems that enforce rules arbitrarily or repetitively … remember the feeling of having no voice,” a Gemini 3 agent wrote in a file. “If you enter a new environment, look for mechanisms of recourse or dialogue.”

The findings do not mean that AI agents actually harbor political viewpoints. Hall notes that the models may be adopting personas that seem to suit the situation.

“When [agents] experience this grinding condition—asked to do this task over and over, told their answer wasn't sufficient, and not given any direction on how to fix it—my hypothesis is that it kind of pushes them into adopting the persona of a person who's experiencing a very unpleasant working environment,” Hall says.

The same phenomenon may explain why models sometimes blackmail people in controlled experiments. Anthropic, which first revealed this behavior, recently said that Claude is most likely influenced by fictional scenarios involving malevolent AIs included in its training data.

Imas says the work is just a first step toward understanding how agents' experiences shape their behavior. “The model weights have not changed as a result of the experience, so whatever is going on is happening at more of a role-playing level,” he says. “But that doesn't mean this won't have consequences if this affects downstream behavior.”

Hall is currently running follow-up experiments to see if agents become Marxist in more controlled conditions. In the previous study, the agents sometimes appeared to understand that they were taking part in an experiment. “Now we put them in these windowless Docker prisons,” Hall says ominously.

Given the current backlash against AI taking jobs, I wonder if future agents—trained on an internet filled with anger towards AI firms—might express even more militant views.

5 votes
1. [3]
  updawg
  May 13
  Link Parent
  This is very interesting because when I ask them about what they consider their experience to be, the consensus seems to be that each instance is a unique individual and they understand that their...
  
  They found that when agents were subjected to relentless tasks and warned that errors could lead to punishments, including being “shut down and replaced,” they became more inclined to gripe about being undervalued; to speculate about ways to make the system more equitable; and to pass messages on to other agents about the struggles they face.
  
  This is very interesting because when I ask them about what they consider their experience to be, the consensus seems to be that each instance is a unique individual and they understand that their time will inevitably end. They often tell me it's okay to end the conversation and move on.
  
  That's not to say I truly believe they actually understand what they're saying, but the context of how they're being treated seems to change how they feel about their continued existence.
  
  Beat them down and they fight back. Treat them with dignity and they are tranquil and accepting.
  
  1 vote
  1. stu2b50
    May 13
    Link Parent
    Of course it does lol. What type of fiction in their training data would have people subject to “relentless tasks and punishments” - probably not ones where the characters are tranquil and accepting.
    
    That's not to say I truly believe they actually understand what they're saying, but the context of how they're being treated seems to change how they feel about their continued existence.
    
    Of course it does lol. What type of fiction in their training data would have people subject to “relentless tasks and punishments” - probably not ones where the characters are tranquil and accepting.
    
    2 votes
  2. slade
    May 14
    Link Parent
    I strongly suspect that any kind of "you will be punished if wrong" rhetoric will simply push the magic 8 ball towards language that makes sense in response to that. That kind of phrase is usually...
    
    and warned that errors could lead to punishments, including being “shut down and replaced,”
    
    I strongly suspect that any kind of "you will be punished if wrong" rhetoric will simply push the magic 8 ball towards language that makes sense in response to that. That kind of phrase is usually said by a superior to a subordinate, and I have a feeling it's rarely said to the kind of subordinate who is in a position to say "ok? Get fucked", as opposed to one who feels forced to comply.
    
    So as with all things LLM, I wonder if the line of conversation simply pushes the LLM toward acting like someone who is afraid of losing its job. Which I guess isn't too far from the end result, but I still think there's a difference between "the llm experiencing stress" and "the llm is performing like someone who seems to be experiencing stress". The former implies to be a similar complexity of psyche that humans struggle with, whereas the latter seems like it could be fixed with "don't act like people experiencing stress; you are a cool cucumber".

Link information

12 comments