Anthropic, the wildly successful AI company that has cast itself as the most safety-conscious of the top research labs, is dropping the central pledge of its flagship safety policy, company officials tell TIME.
In 2023, Anthropic committed to never train an AI system unless it could guarantee in advance that the company’s safety measures were adequate. For years, its leaders touted that promise—the central pillar of their Responsible Scaling Policy (RSP)—as evidence that they are a responsible company that would withstand market incentives to rush to develop a potentially dangerous technology.
Defense officials warned they could designate Anthropic a supply chain risk or use the Defense Production Act to essentially give the military more authority to use its products even if it doesn’t approve of how they are used.
Shades of Google dropping “don’t be evil” (yeah, yeah, now it’s do the right thing in a different document).
Cool, everything is accelerated in this era, even the speed at which we lose "don't be evil". I hate this timeline.
This wouldn't have anything to do with Anthropic now working on AI models for the Pentagon that, by necessity, cannot be non-harmful, and therefore infringe on some element of safety for somebody, no?
Since they were just given an ultimatum by the Pentagon, I can't see how this wouldn't be connected.
I don’t think it’s that kind of safety they are talking about. This is more like superintelligence risk.
I don't think it matters that much. A gun is an inherently dangerous object, and a military is similarly so; if your AI is working on plans with a military, somebody's risk is already negotiable.
More to your point about superintelligence, XKCD made a comic where the author is more worried about what certain people would be empowered to do with an autonomous fleet of kill drones that will follow their orders than if the drones decided to wipe out humanity or maximize paperclips without orders. There's a much richer history of malice on one side of that equation than the other.
Yeah, I just don’t think they ever had a policy of not training bots with the raw intelligence to commit atrocities with human assistance.
Pete Hegseth recently threatened to cut Anthropic from current and future DOD contracts unless they drop some of their safety measures. This is likely part of their response to that pressure.
Money talks.
The point I'm surprised ... no, not surprised, I guess, just--somehow--even more disappointed over...
Snowden was less than 15 years ago.
Today, the Pentagon is threatening to blacklist Anthropic, explicitly, for not giving them full use of their AI, for A) fully autonomous, AI-powered targeting & strike capabilities, and B) unrestricted, fully autonomous, AI-powered mass surveillance of US citizens.
This is not a whistleblower thing, it's not a reporter "scoop", nothing.
The US Pentagon is flat-out openly stating that it will destroy an AI company if it can't use the AI for mass spying on the US public.
(Oh yeah ... and killing people w/o human oversight)
Good catch, that probably is a major factor.
Underlying reasons aside, I really don't trust any of the AI companies as far as I can throw a data center, and I was already sceptical of their ideas of AI safety. I don't consider much of their "research" to be anything more than AI fan fiction, and I've already had my rants on their papers about AI introspection and AI blackmail.
I keep seeing pieces about how these companies "can't turn off their AIs" or how "they don't even understand how it works", and even how their LLMs are in the top percentiles of Maths Olympiads and coding contests. I literally typed "What is 784×413 698×225 786×2÷15" into Google and the first AI-generated answer was:
My crappy desk calculator shows 9 764 167 711 513,6. Do I trust the cheap legacy hardware that gives me the same answer over and over and goes into error if I use it incorrectly? Or the multi-billion-dollar AI that gives me 5,329,240,721,280 the second time, and then this whole mess, even though the correct answer shows up in the "thinking" as:
You could argue that I should prompt better, or that it's up to me to verify the outputs. But this is an all-powerful, genius-level everything machine that has already cost tens of thousands of people their incomes. It should be able to do basic maths.
What does this have to do with AI Safety?
Everything really.
Because I don't think the AI apocalypse will happen because of some major military operation where they turn off the AI safeguards and the machines go crazy. I think it'll happen because the machines are going to misinterpret some critical semantic point and give the wrong person the wrong information at the worst possible time, and they'll make a bad decision with it.
If you can't interrupt a system while it's on bad rationalization pathways, then you have a bad system. If you don't understand how the system is reasoning or reaching outcomes, then you probably have no handle on the inputs and training data. If your system needs audits to identify errors in extended outputs, then maybe it should not be in live environments.
The math question is mainly one of syntax. It interpreted “413 698” as 413 * 698, which is fair enough. Using spaces instead of commas in arithmetic syntax is highly irregular. Generally, when you have two numerals next to each other, multiplication is assumed. E.g., you parse 5x as 5 * x, not “fifty x”.
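Concretely, both readings of the typed expression can be checked in a few lines of Python (a sketch; note the two results are exactly the desk-calculator answer and the AI-overview answer quoted earlier in the thread):

```python
# The typed expression was "784×413 698×225 786×2÷15".

# Reading 1: spaces are digit-group separators, so "413 698" means 413698.
grouped = 784 * 413698 * 225786 * 2 / 15

# Reading 2: adjacency means multiplication, so "413 698" means 413 * 698.
implicit = 784 * 413 * 698 * 225 * 786 * 2 / 15

print(grouped)   # 9764167711513.6  (the desk calculator's answer)
print(implicit)  # 5329240721280.0  (the answer the AI overview gave)
```

So neither machine did the arithmetic wrong; they just parsed the spaces differently.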
Kind of apples and oranges isn’t it? That’s Google’s quick and dirty LLM designed to skim web results. If you ask a frontier model a question you will have a radically different experience. Also, you gave it a pretty ambiguous question.
E.g., GPT 5.2 Thinking said:

Interpreting your expression as: 784 × 413698 × 225786 × 2 ÷ 15
Result (exact): 48,820,838,557,568 / 5
Decimal: 9,764,167,711,513.6