0 votes

Peter Girnus (@gothburz) on X about Anthropic's Responsible Scaling Policy

Topic deleted by author

3 comments

  1. [2]
    unkz
    Link
    I think this is fake. The author writes posts like this as if he is various people. Also https://x.com/gothburz/status/2019433714871095471 https://x.com/gothburz/status/1999124665801880032?s=20...

    I think this is fake. The author writes posts like this as if he is various people.

    The writer uses first person and writes as if they are at Anthropic. It is being re-posted and shared as such, rather than satire, and not having a Community note is a disservice to everyone.

    The writer does not actually work at Anthropic, and never has.

    linkedin.com/in/peter-girnus

    Also

    https://x.com/gothburz/status/2019433714871095471

    https://x.com/gothburz/status/1999124665801880032?s=20

    It’s a thing he does, satire-ish.

    3 votes
    1. skybrian
      Link Parent
      Thanks! Fooled me. Deleting.

      Thanks! Fooled me. Deleting.

      1 vote
  2. skybrian
    Link
    From the Twitter post: ... ... ... ... ... ... ... ... ...

    From the Twitter post:

    We left OpenAI because of safety.

    Seven of us. 2021. Dario said it was about "disagreements over AI vision and safety priorities." That was the diplomatic version. The real version was that we sat in a room and watched the company decide that speed mattered more than caution and we said we would build something different.

    ...

    I was employee number nineteen. My title was Head of Responsible AI. I had a desk near the founders. I had a document. The document was called the Responsible Scaling Policy.

    ...

    I wrote version 1.0.

    RSP 1.0 shipped September 2023. It was clean. AI Safety Levels — ASL-1 through ASL-4. If the model reached a threshold, we paused. If safeguards weren't ready, we didn't ship. The policy was not a suggestion. It was a gate. The gate had a lock. The lock was the whole idea.

    ...

    You cannot pause a $380 billion company. You can revise the document that says you will pause. These are different actions. One of them is responsible. One of them is what we did.

    I wrote version 3.0.

    RSP 3.0 shipped February 24, 2026. One day before the ultimatum. Nobody outside the company noticed the timing. Everyone inside the company understood it.

    ...

    An if-then commitment says: if this happens, we do that. A positive milestone says: we aspire to reach this point. An if-then commitment is a contract. A positive milestone is a wish. I replaced a contract with a wish and I called it "maturation of our framework."

    ...

    Version 3.0 also separated what Anthropic would do alone from what required "industry-wide coordination." This sounds reasonable. It means: the hard parts are someone else's problem now. The parts that require pausing, restricting, or refusing — those require the whole industry. And the whole industry will never agree. So the hard parts are deferred permanently. This is not a loophole. This is a load-bearing wall removed and replaced with a suggestion that someone should probably install a new one.

    ...

    The LessWrong community noticed. They always notice. They wrote that we had "weakened our pausing promises." I forwarded the post to the policy team. The policy team said the criticism was "philosophically valid but operationally impractical." We did not respond publicly. Philosophically valid but operationally impractical is the most Anthropic sentence ever written. It means: you're right, and we're not going to do anything about it.

    ...

    We had restrictions. No autonomous weapons. No mass surveillance of Americans. These were our terms. These were the lines we drew. The lines were real. I wrote them into the contract myself.

    ...

    xAI agreed to classified use without restrictions. They said yes immediately. OpenAI accepted similar contracts. Google accepted. We were the last ones holding. We are still holding. As of this morning.

    ...

    The Responsible Scaling Policy is on version 3.0. Version 1.0 said we would pause. Version 2.0 said we would commit. Version 3.0 says the hard parts are someone else's problem. There will be a version 4.0. Version 4.0 will say whatever Friday requires it to say.

    1 vote