The boy that cried Mythos
Link information
This data is scraped automatically and may be incorrect.
- Title
- The Boy That Cried Mythos: Verification is Collapsing Trust in Anthropic
- Authors
- Davi Ottenheimer
- Published
- Apr 13 2026
- Word count
- 6416 words
The people who want to be dismissive are going to find reasons, but the security experts I follow are taking the AI-based security threat very seriously.
Some guy wants to make people hate Anthropic and I don't really see the point of writing a long takedown about it. I don't think we need to take a position on how much better Mythos is, because it doesn't really matter. There are more high-quality security bugs being found through a variety of AI-enabled means. The people who maintain important systems have a lot more work to do lately.
If Mythos could do what's claimed, why do none of Anthropic's numbers demonstrate that capability? Why did they need to lie so plainly? If bugs are being found through AI-enabled means, why isn't Anthropic presenting that data instead?
Anthropic's false and distorted numbers are worth discussing whether people are just being haters or not. Many of their claims, and the expectations of security experts, are built solely on those numbers, which are horrifically misleading at best.
These supposed proofs that they're lying look to me more like complaints that Anthropic didn't publish detailed information. The "system card" is not a scientific paper and doesn't include the data you'd need to prove things to a skeptic. It's unfortunate that we're taking their word on some things, but we'll just have to wait.
Yes, he is.
That's just dumb, fuzzing is not the same kind of test. They're different tools.
"Just" is doing a lot of work there. Seems like a model that can find an exploit is more dangerous to release to the public than one that can find a bug?
I could go on, but this guy obviously has an axe to grind. He's going to find what he's looking for whether it's warranted or not.
I'm one of those people maintaining important systems hit by those AI "analysis", a server virtualisation platform.
That's not what I have seen. There are a lot of purported security bugs, but they are definitely not "high quality". Actual security bugs are in a very small minority of those reported, and even fewer of those are highly critical, or exploitable in practice. This means that maintainers are spending an inordinate amount of time going through AI slop just on the chance it's a real vulnerability.
Yes, but with little to show for it.
This sounds just like the experience people like Daniel Stenberg of curl or kernel maintainer Greg Kroah-Hartman had up until approximately the start of this year.
But ever since that point, both say that things have started to change. Sure, still many, many bogus reports, but increasingly also high-quality findings, and even if the issue is not a security-related finding, some can still "just" be regular, good, hard-to-spot bugs.
These are obviously among the two biggest projects in existence in terms of attention by public eyes, so maybe it only takes longer until the higher-quality findings reach software that isn't among the literal biggest (in terms of users) or oldest open-source projects?
Since it is extremely relevant (and more recent than your link) to the conversation: Daniel is also disappointed in the marketing hype around Mythos, though still impressed with the pre-existing models.
Yes, I was considering bringing up that post as well, but ultimately decided against it, since curl: knowing how big “corporate”/PR style projects go, they probably wanted to cover at least some ground with as many “prestige” repos to show for as possible rather than going in-depth on one in particular, maybe even going so far as to actively move on to the next after the first confirmed finding (“look, we found at least one issue in <large number of> huge software projects” sounds better than “we dug up 15 issues in curl, which may or may not have happened anyways”)
100 percent, I am feeling like we will be entering a state soon where source code is transient and is constantly being rewritten and deployed every few seconds/minutes for security or other purposes. It may be that endpoints exist in a state of flux where attackers are constantly rewriting code to break in while the AI security is constantly rewriting and maintaining functionality and defense at the same time.
I've watched regular Claude Code at work hacking things successfully, and there is no reason for determined people not to throw compute at breaching systems all day long. Even if Claude or whatever LLM isn't as effective as the most elaborate hackers, it can process so much more than we can and also work 24x7, so there is no reason that people aren't going to let it churn on security (or targets).
In a way I think people aren't appreciating that the text web as we know it is basically gone. User submitted content (well, it probably already was) is so compromised we can't sort fact from fiction anymore. The only thing left will be old school web rings of tightly controlled and audited content to seek any kind of truth or reality. Yeah, it's not there YET, but give it another few years and reality vs AI fiction will be hard to tell apart. Video content is honestly just about going to be gone soon too.
I don't like to feel so pessimistic about it, I think LLMs and other AI generation is extremely interesting and useful, I just can see the negative uses of it stretching out in front of me and don't see anything preventing it.
I don't see how this could ever be economical or performant. One of the advantages of code is that you write it once, test it once, and get value out of it for a very long time without significant additional cost or labor.
I have parts of applications that I work on that haven't changed in over 8 years now? They've got unit test coverage, they've had independent security audits. I'm not sure why we'd have something rewriting that and deploying it in near realtime constantly.
I do think the pace at which we upgrade dependencies, frameworks, and language versions is probably going to increase - just not to a scale of minutes.
Possibly if some higher level abstraction becomes reliable enough. Similar to how binaries are disposable and rewritten constantly.
Not that I think AI programming is at this level. But it’s not an unprecedented change for something we did by hand to become automatically generated.
I'm somewhat hopeful that, eventually, we will have some core software components that simply don't have any bugs that anyone can find. In the meantime, there's going to be a lot of software updates.
I know a Google employee working in security who was given access to Mythos. He says it’s the real deal.
Could they have found most of the bugs using a cheaper existing model? Who cares?
Is the $100 million worth of free tokens pure marketing genius? Who cares?
Based on what? Their own very-limited experience with Mythos so far? Or the misleading numbers and marketing that Anthropic put out?
Nobody who was willing to commit to that publicly. The only one who somewhat has, Mozilla, is somewhat dispelled in this article as the numbers were inflated and none represented actionable real-world exploits as claimed.
We should all care about being blatantly, openly lied to.
This security researcher seems very bent out of shape that he doesn't have access to Mythos. Glasswing has been around long enough for a number of large tech companies to log a significant number of security bugs. If you work for one of these companies, you know it's more than hype.
Your statement is false, and not backed up by the original author. Let's take a look at the original article. Because to me it says the complete opposite.
OK. So we have two bugs. That caused most of the exploits. Yes. This is the way security bugs work. What security expert doesn't know this? 80% of the problems are caused by 20% of the bugs. 66% of the attack surface area is caused by 5% of the bugs. 50% of the exploits are caused by 1% of the bugs.
Yes. The two most critical bugs have been patched. This is the way major security vulnerabilities are treated. The author admits that real world exploits were identified and patched. The author is disputing the novelty, magnitude of the bugs and overall governance.
OK, this is the most interesting aspect of this article. There has been plenty of discussion about how other models can find these bugs, but not as easily or as cost effectively. Which is the definition of powerful. But it begs the question why companies haven't used these models to look for security bugs before this.
Marketing hype exists. Get used to it. But there is plenty of evidence that states this is more than simple hype.
This is forcing major tech companies to scan automatically for and fix real world vulnerabilities.
Even if your stance is this is pure marketing hype, even if your stance is that all these bugs could have been uncovered on existing models, your question should be, why did it take these tech companies so long to find and fix these issues?
But it's not about that, it's about the security researcher being angry that he doesn't have access to Mythos. Let's look at what else he says....
This is false. Or misleading at best. Anthropic has granted access to over 40 additional entities that build or maintain critical software infrastructure (including major open‑source projects and security‑focused outfits), but their identities are not publicly disclosed. And it implies Congress has no power of oversight. Congress can investigate anything they deem worthy.
Your author is now admitting that access to Mythos does allow companies to patch security vulnerabilities before those without access. He undercuts your entire premise that this is marketing hype.
I would have thought a security researcher would be a little more pleased with the attention now being given to these security practices. Because they were not followed. The fear being hyped up by Mythos is causing a lot of companies to seriously revisit their practices, and discovering they are very far from best practices described above.
This is just idiotic. The author spends most of his time calling Mythos a nothing burger, and now Glasswing is an abusive cartel because Mythos hasn't been released to the public? The contradiction in that is absurd. It will be released. The purpose of not releasing it is to allow companies to fix bugs. Companies are fixing bugs. Anthropic is highly motivated to release this.
There is a valid concern about models not being released to the public. But this isn't it.
This is a citation-heavy teardown of basically every claim Anthropic made about Mythos. The key takeaway for me was that Mythos is not any sort of generational improvement. The numbers have been heavily fudged and their methodology obfuscated to cover the fact that even Sonnet models can go toe-to-toe with it when you aren't counting single issues multiple times, with those single issues being in highly contrived unrealistic environments (again) contrary to what was claimed.
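To make the double-counting point concrete, here is a minimal sketch of how a headline figure changes depending on whether you count successful exploit runs or distinct underlying bugs. Everything in it is hypothetical: the run IDs, the bug IDs, and the `summarize` helper are invented for illustration and are not taken from Anthropic's data or from the article.

```python
from collections import Counter

# Hypothetical records: (exploit_run_id, underlying_bug_id).
# All values are invented for illustration only.
runs = [
    ("run-01", "bug-A"),
    ("run-02", "bug-A"),
    ("run-03", "bug-B"),
    ("run-04", "bug-A"),
    ("run-05", "bug-B"),
    ("run-06", "bug-C"),
]


def summarize(records):
    """Return (total successful runs, number of distinct bugs, runs per bug)."""
    per_bug = Counter(bug for _, bug in records)
    return len(records), len(per_bug), per_bug


total_runs, unique_bugs, per_bug = summarize(runs)
print(f"successful runs: {total_runs}")
print(f"distinct bugs:   {unique_bugs}")
for bug, n in per_bug.most_common():
    # Share of all successful runs that trace back to this one bug.
    print(f"  {bug}: {n} runs ({n / total_runs:.0%})")
```

With these made-up records, "6 successful exploits" and "3 bugs" describe the same results, which is why it matters which of the two a report is quoting.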
It probably isn't surprising, but since 2019's GPT-2 the "too dangerous to publicly release" narrative still falls short of the marketing.
From OpenAI's 2019 announcement about GPT-2:
Seems to me that holds up well?
Selling access prevented none of that, and GPT-2 wasn't the inflection point for that. Even in 2026, current spam and propaganda on the internet still very often gets along just fine with non-AI bots with standard templates and character substitutions vs human-run social accounts spewing set talking points, occasionally with an AI-generated image or comic for extra punch. The viral Facebook BS, SEO spam sites targeting every niche, and LinkedIn post economies have been revolutionized, though.
GPT-2 was ultimately near enough trivial in compute and dataset. Maybe worse than existing methods of harm they identified.
They were testing staged releases and delays to collect more usage data, IMO. As it turns out they were already studying RLHF.
And they began selling access to much more powerful models.
Just felt like a big joke a year or two later when I understood it better.
The irony of this piece is that you can play count the LLM writing tropes in it. If you make it a drinking game you'll pass out before the end.
My takeaway: what's the point? Mythos will be released publicly at some point and everyone can check for themselves. Until then, outsider speculation isn't adding anything useful to the conversation.
But if we're speculating... Based on how the technology works, and the history of model releases, most likely it'll be a Sonnet to Opus level of improvement rather than a game changer.
But the enhanced ability to turn bugs into working exploits likely really does justify caution. Which is not to say marketing wasn't a primary consideration.
Extraordinary claims require extraordinary evidence.
The point is to figure out how badly a company which is injecting itself into every facet of every industry is blatantly lying? I'm at the point where I need to determine the AI exposure for my team, and yeah, knowing just to what level Anthropic is full of shit would be nice.
Granted YES they're all making wild bullshit claims, but this one should be verifiable. I think the discussion around it absolutely should be had, even if yes, a lot of that discussion is noise for various reasons (personal agendas, flawed methodology, lacking data/resources, etc)
What specifically are the extraordinary claims? The previous LLM's were already quite good at finding security issues and they're claiming the new one is better. They also have a track record of releasing increasingly better models.
It doesn't mean you can't be a little skeptical about new releases, but it doesn't add up to "wild bullshit" and "blatantly lying." Maybe you should substantiate those claims?
"This model is so dangerous we can't release it"
Claiming your model is too dangerous to release without backing up that claim is 100% "wild bullshit" and yes "blatantly lying". It crosses into the same territory Musk lives in quite quickly
I've been following OpenAI since Dota, and they were lying then too. You can get semantic about it and go into "well they didn't lie they just represented the data in a misleading way!" or whatever, but I don't really care. Anthropic is better but they have their own pile of press releases and claims that are wildly out of sync with reality.
I recognize some of this is, in part, because that is the thing to do in our current overhype/grifter economy. It's why I'm not going to bother digging up articles I've already read because honestly if you can look at the state of corporate business in the world right now and come to any other conclusion you obviously have a very different take on what's happening.
To meet you partway, I highly recommend ANYONE dig through https://www.anthropic.com/news/ . Some fun ones imo:
https://www.anthropic.com/news/the-anthropic-institute
Which is pretty wild given the linked article (https://www.darioamodei.com/essay/machines-of-loving-grace) is 100% a "hey this is more thought experiment than anything and assumes everything goes right", but hey they're not "lying" just letting you draw the wrong conclusions
I'll also throw https://www.anthropic.com/research/natural-language-autoencoders on the pile. I can see how someone would read this as "this is someone taking AI development very seriously". Personally, it reads like they're reading a child a fairytale because the language is designed to anthropomorphize something so very far from being worthy of that, and I think that's again, yes, "blatantly lying" to people about what your product actually IS and IS NOT capable of. "We confirmed our robot dog will not maul children by leaving a child alone with it in a cage" is roughly the same level of misleading nonsense.
That's not a hard fact, it's a judgement call. They think it's dangerous, so they're not releasing it (yet). Obviously, they could, but they chose not to, and it's their decision to make.
Other people might have made different judgement calls, but how can they be wrong about that? Do you think it's possible for anyone on the outside to prove that Mythos is safe? Seems like the people making claims it's somehow obviously safe so they're obviously lying aren't backing it up.
Also, Anthropic is not OpenAI. Sam Altman and Musk aren't writing Anthropic's press releases. Their lies aren't relevant when judging some other company's claims.
It's sort of like saying, because there are a lot of people on the Internet who lie, and you're a stranger on the Internet, you must also be lying. There's good reason to be skeptical of strangers in general, but going beyond that is guilt by association.
People can also reasonably disagree on what's going to happen with AI in the next few years. It's all speculation. But we do see fairly big improvements in AI every year. I think those articles you linked to are speculative but not nonsense, given the rather strange situation we are in.
Or, they have insufficient capacity to meet the potential demand increase imposed by Mythos. Anthropic barely has enough compute to keep up with demand, so why would they release something that'd worsen all their capacity problems? They're still growing.
I believe some Anthropic employees think Mythos is too dangerous, but there are plenty of other reasons they wouldn't release it to the general public.
Yes.
For example, if they released the EXACT same version as the previous and claimed it was dangerous, that would be a flat out lie, as they clearly didn't think the previous model was dangerous and aren't saying "oh shit we never noticed how dangerous these were"
Claims have already been made that Mythos is not substantially different than its previous versions or other AI models on the market. At least not anywhere near enough to claim it's too dangerous to release. How true that is will depend on research, but this is not some Ship of Theseus problem mired in philosophical definitions. Bullshit marketing is not new, and I don't get why you're acting like these are unreasonable standards and going to the wall for them.
I think it's fine to say-
"These claims against anthropic are flawed".
I don't think you're right to say there are no extraordinary claims. Every other AI tool ever made has been safe for release. It is extraordinary to claim otherwise. Doubly so when they have people emergency flying out to the White House.
I have no special knowledge about Mythos and I'm not going to make any specific claims about it. Instead, I'm making a burden-of-proof argument:
Imagine a car manufacturer decided to delay the release of a new car due to safety concerns. How well would these arguments hold up?
If a manufacturer claims that their new product has safety issues, usually we take their word for it. We assume they've tested their product better than us and give them the benefit of the doubt. We don't demand extraordinary evidence.
If other people have tested the LLM and they didn't find anything, sure, those are useful observations too and we can infer that the problems aren't immediately obvious. But it doesn't invalidate Anthropic's observations, because there are lots of different ways to use an LLM, such as different prompts, just to start. Maybe they tested something different? It's not enough to prove a lie.
Also, I've been putting it in binary terms, but safety is a continuum. Cars aren't all equally safe, are they? As we're seeing, neither are LLM's. It's looking like it's too late to keep attackers from discovering lots of security bugs using already-released LLM's, so maybe the already-released LLM's weren't safe, for some definitions of "safe?" But I still appreciate that Anthropic makes some effort at testing each new LLM for safety and that they're willing to delay a release if they have concerns.
Not when it's used as a marketing release and claiming the car is dangerous because it can be used to fire nuclear missiles or crash the economy.
People have speculated about such things, but that's not why Anthropic is holding Mythos back.
Missing the point. It would be unreasonable to assume a car is capable of these things. I do see how that's confusing because yes people are, at the edges, claiming AI can do this stuff. However, again, people were flown out to the White House for an emergency meeting so it's coincidentally not totally off.
People have speculated a lot of things, because Anthropic's official stance is wildly vague, unless you're aware of some official statement beyond the well discussed Preview
The White House meeting was apparently to discuss cybersecurity risks for banks, which seems reasonable. Nuclear weapons aren't mentioned on that page.
The second paragraph from the system card seems like the clearest justification for not releasing Mythos:
Is it ironic? Can you not use AI or see value in it for certain things, while also being skeptical of it in other unrelated instances? Using an LLM means you're not allowed to be critical of the companies that sell them or hype them up? To be clear, I have no idea if the author used AI to help write this (it doesn't come off that way to me personally, but people have told me I write like an AI too so maybe my ability to detect it just sucks), but this kind of ad hominem really rubs me the wrong way.
In my opinion writing blog posts with an LLM is disrespectful to readers for all sorts of reasons. This particular piece is full of confabulation that relies on the reader not fully understanding what the terms mean. It's written as though it's a slam dunk when really it's mostly just noise, much of it recycled from posts published shortly after the Mythos release.
The post doesn't exist to share knowledge, it's engagement bait slop, there's insufficient human care and thought put into it.
No... That's a remarkable leap for someone who's rubbed wrong by logical fallacies.
I note that this piece was published April 13, a week after the release of the things it is criticizing, and is now a month old. I don't see any obvious "Updated:" block. I wonder if anything's changed since then (such as publication of some of the missing things the author was looking for). I did a (very) little bit of research, and there doesn't seem to be anything made public so far.
Saw on Reddit the other day that Firefox released a graph of security-related fixes, and that thing positively went to the moon. Think 10x increase in the "Mythos month".
I don't know where I saw it, but it's a lead if someone wants to chase it down.
Likely this: https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/
I forgot that I read this Mozilla post. I'm still piecing together whether that counts towards what the OP article is complaining about.
Recently, from Cloudflare:
https://tildes.net/~comp/1u94/project_glasswing_what_mythos_showed_us
For context, I have an OSCP which is a penetration testing cert. I assist our security team and occasionally consult on security issues.
It's usually a pretty laborious process to verify a code base follows good security practices. Even with tools, fuzzers, and other items, it takes a lot of time to find and identify vulnerabilities, and it's hard to reason about a lot of them.
I've been using Claude (using Sonnet or Opus) for vulnerability analysis on my personal coding projects and I've been impressed that it can usually find most of the security 'smells' in my code pretty reliably. There are usually some false positives or items that have mitigating circumstances, but it's a pretty good check of my work. I've run it on other code bases and had pretty good results thus far.
If Mythos is a significant upgrade over the existing models for identifying security vulnerabilities, then everyone in the security industry is going to have to accelerate their work in being able to rectify, fix, and patch vulnerabilities. The reason is that the crackers are going to use the models to find bugs faster and cheaper than they are today so that they can get to payouts faster.
Anyone in the industry who is putting their head in the sand is taking an awful gamble I wouldn't want to take.
Could you share examples of prompts you use? I'm curious about what it would find on my hobby projects.
This guy got me working: https://www.youtube.com/watch?v=1sd26pWhfmg
OpenAI, and more recently Anthropic have both embraced fearmongering as a large part of their marketing strategy.
The ends do not justify the means.
It’s certainly a rather odd situation that these companies put out so many different kinds of warnings about their products, but dismissing it as “marketing” is rather cynical.
In a way it’s the opposite of marketing because it’s poisoning the well. Anti-AI sentiment among the general public is on the rise and the companies themselves are feeding it. It’s as if tobacco companies in the 1950’s were warning that their product causes cancer, or early car companies were predicting traffic jams and smog and urban sprawl. (To be clear, this is an imperfect analogy - there are ways to use AI that aren’t particularly harmful.)
It’s also a double bind where the cynics get to be cynical either way. If a company warned about safety then it’s fear-mongering and if they didn’t warn about it then it would be seen as covering up safety issues they knew about, like what the tobacco companies actually were doing in the 1950’s. Or more recently, Facebook tends to downplay safety concerns about social media, and look where that ended up.
But more fundamentally, the issue isn’t whether they warn or don’t warn, it’s that the product is being used in harmful ways and the public knows it. A company putting out a harmful product is going to be unpopular and warnings don’t help much. It’s like, if it’s so bad why are you selling it then? Warning people is good, but it doesn’t change the product.
I think it’s sensible to encourage companies to delay or cancel product releases if they seem dangerous or they have concerns about how it will be used. The idea that you have to prove the product is harmful to justify not releasing it is backwards. Instead you have to be sure it’s safe, or safe enough. So for example, I’m happy that Waymo has taken their time scaling up deployment of driverless cars. If the AI companies started being more cautious, that would probably be for the best. Unfortunately it’s effectively an arms race so it’s hard to slow it down.
It’s kind of weird to see a company being attacked for being too cautious.
Completely tangential: Whenever I hear "ends justify the means," it conjures memories of "Tell Em The Truth" from Reefer Madness: The Musical.
Can't dig up a good video at the moment. But worth embedding in your brain.
As an aside, AI model naming is, I feel, really silly. Like you cannot go from Haiku and Sonnet to "Mythos" and not feel like you're maybe overselling things a bit.
Anthropic: Limerick gains capability to impair human respiratory system, too dangerous to release
Increasingly desperate to distract from the circular financing and keep the hype flowing.
If the hype train slows down, so does the financing.
I'm sure the next one will be called Lazarus or something.
Hey, Mythos is a Greek beer too, maybe it would be a good name for a model that was worse with more hallucinations, and blame it on the booze. Next up, “Guinness” and “Absinthe” models- the most ‘fun’ models yet.
"Blue WKD has crashed the stock market. Millions will suffer."
MadDog2020 has made all traffic lights green all the time, and reversed flow on all the sewers.
Noise/pun galore
I prefer Fix (ΦΙΞ) in terms of Greek beer brands, maybe they could contribute some of those to the open-source community!
I figured the next step in the line would have been Epic, though maybe there's too much baggage around the term.
No no, maybe we let them fight it out. Surely that can't come back to haunt us