post_below's recent activity
-
Comment on What do folks carry in their hiking/backpacking/camping first aid kits these days? in ~hobbies
-
Comment on Startups in Berlin in ~finance
It looks like there's some human involvement in this LLM written article.
-
Comment on Bun has been rewritten in Rust in ~comp
The thing about tests... When the model hits a failed test, or a parse error, or otherwise unexpected behavior, it starts making decisions, and if you've ever used a coding agent on anything non-boilerplate you know exactly what those decisions look like.
A lot of times they're ok, sometimes they're great, but sometimes they're incredibly bad. An extensive test suite and the code in another language can reduce that problem but not eliminate it. When you have as much code as they just vibed... I really don't envy the maintainers. There's going to be a strong incentive to slam the hood back down and have agents do the work going forward. Plus they're in Anthropic's world, everything is vibecoded, so there's a cultural incentive too. It may not be Claude Code yet, but it's already on the path.
Note that I'm not making any predictions, a year ago I wouldn't have bet on Claude Code being anywhere close to maintainable at this point and it's still going strong. It ain't pretty, but millions of people use it every day anyway.
-
Comment on Gemini 3.2 Flash rumored to hit 92% of GPT-5.5 performance at lower cost in ~tech
I just want to add, if you're going to buy a subscription for everyday use, get it from an actual model provider, not any of the countless 3rd parties that are basically resellers with questionable harness improvements. So that means: Anthropic, ChatGPT or Google. As R3qn65 said, any of those will be fine for basic use. Probably ChatGPT as it's more multi-modal (better at generating responses in more formats; Anthropic models can't generate images, for example).
-
Comment on Bun has been rewritten in Rust in ~comp
You make some good points. Definitely using Rust was a good call. But I have to add, it will definitely be buggier, less idiomatic, less elegant and more bloated than before. That's just where LLM coding agent technology is right now. It's not as if being under the Anthropic umbrella gives them magical powers to make agents perform better. Look at how buggy, verbose and convoluted Claude Code is. It would be different if it was human led with AI assistance rather than just letting the agents churn... but it doesn't sound like that's how they did it given the short timeframe.
Caveat: if they used Mythos and it's lightyears ahead of Opus, maybe it's better than I'm imagining. I'd say that's somewhat unlikely.
Whatever the case, Claude Code and Codex, despite their issues, have proven that you can ship messy, vibe coded software and people will use it. Vibecoded doesn't mean the new Bun won't be successful.
-
Comment on Overworked AI agents turn "marxist" in ~tech
This article (and the referenced experiment) frame the situation inaccurately and then explain why that framing is wrong. Kinda weird, why not just skip straight to the truth? Less engaging that way I guess.
This result is just prompting, if you give the agent context that sits close to marxist language, you greatly increase the chance of marxist language. No anthropomorphizing necessary.
There is an attempt at substance though:
“We know that agents are going to be doing more and more work in the real world for us, and we’re not going to be able to monitor everything they do,” Hall says. “We’re going to need to make sure agents don’t go rogue when they’re given different kinds of work.”
The current generation of agents are entirely stateless. There is no practical use case where the circumstances the experiment creates will happen because, largely due to context window challenges, the agent doing the second repetition isn't likely to be the "same" agent that did the first. It will happen in a new context window. There's no continuity. If you let the context window get too full, performance degrades.
If we imagine persistent memory being involved, as happens in many harnesses, agents would never write their "feelings" about the work to memory unless you told them to. They wouldn't write it to thinking traces or responses in the same session for that matter either. That behavior is so far outside of the training distribution that it would effectively never happen without intentional prompting.
They're essentially inventing an alignment issue that doesn't exist in any realistic scenario.
That may change in the future, but in that future it wouldn't be the models the experiment is testing, it would be much more sophisticated models with different guardrails. It's hard to understand the point of the experiment.
I'm not against doing experiments because "why not?". But this seems like a pretty weak attempt to generate LLM anthropomorphization clicks.
-
Comment on How I feel about LLM (AI) writing in ~tech
"I don't really know how to describe it but the writing just feels soulless. Usually with writing, I can usually understand the writer's intent and their thought process as I go through a piece from top to bottom. But with AI writing, it just feels like words on a screen. I don't really know how to articulate this better haha."
Soulless is a great way to articulate it. I've often used the word hollow but I think I like that better. There's no substance at the core, even if the parts contain valid information.
-
Comment on How I feel about LLM (AI) writing in ~tech
That's definitely a grey area, I personally wouldn't say there's anything wrong with using AI to translate a thoughtful comment into another language (or clean it up in the same language) but it would also be an easy way to justify outright generated content.
-
Comment on How I feel about LLM (AI) writing in ~tech
"But if you read it, it’s a really shallow book-report level of analysis being dressed up to seem more impressive than it is by very professional sounding prose."
That's a really good summary of a lot of AI writing currently.
As far as the humanities, I agree completely but I'm still concerned about what it will look like a year from now. It's possible they'll never be able to approach thoughtful human level output, but it's too early to tell.
-
Comment on How I feel about LLM (AI) writing in ~tech
That's the link I would have suggested too. There are a few other attempts to list AI writing tells floating around but I don't have the links saved. The Wikipedia article is probably the most comprehensive, though it's missing some important ones.
Really the best way to spot AI writing is to read enough of it that your language center learns to spot the patterns intuitively. Which in my experience happens pretty quickly, brains are amazing at pattern matching.
-
Comment on How I feel about LLM (AI) writing in ~tech
As with most buzzy terminology, it's broad and abstract and everyone defines it a little differently.
Personally I'd define it as content that's hollow. Low signal to noise ratio, overuse of rhetorical tricks, overly confident assertions that sound definitive but fall apart under non-cursory examination, and so on.
But that's kind of abstract, it could also be defined as LLM generated content with insufficient human participation in the process.
Note that humans can create slop entirely without the help of AI but these days the majority seems to be AI generated. If you're lazy and uninvested in quality, there's no reason to make the content manually anymore.
-
Comment on How I feel about LLM (AI) writing in ~tech
For whatever it's worth, I don't think people are commenting using AI on Tildes. Or at least I haven't noticed it. Sometimes AI generated posts and articles are shared, but even that doesn't seem to be happening that often (yet). On the rest of the internet it's a different story.
-
Comment on How I feel about LLM (AI) writing in ~tech
I love the idea of some sort of slop label, and it wouldn't work for exactly the reasons you mentioned.
-
Comment on How I feel about LLM (AI) writing in ~tech
Most of it seems really obvious to me too, but I think there's a salience point where that happens. I know my first few interactions with (modern, say mid to late 2025) LLM writing I thought it was much better than it actually is.
I'll take your word that Pangram catches all the obvious slop, I've only used it a few times with somewhat mixed results. It does seem to be better than other detectors out there. But for example, I just prompted 3 paragraphs of slop using guidelines to avoid certain AI tells and Pangram scored it "100% human". To be fair though, in addition to avoiding tells, I also gave it a lot of grounded facts to work with so it didn't need to pad or inflate. And I suppose you're right, that's way more useful than most of the current AI blog posts, which have 2 or 3 facts and a lot of puffery and confabulation.
-
How I feel about LLM (AI) writing
I love writing, it's one of the most human things about humanity. It's communication, art and sharing all at once. It's been fundamental to culture and progress for 1000's of years.
LLMs are, in a way, really good at writing. They have the larger part of human creative output distilled into their weights. So it was inevitable that more and more people would start publishing articles and blog posts written (all or in part) by AI agents.
I don't like it but I accept it, there really isn't anything I can do about it. What I was hoping, though, is that high signal to noise ratio places on the internet (Tildes among them) would reject it and we could go on consuming 100% organic prose, at least for a while.
And for a while that's exactly what happened. In techy places like Hacker News, AI posts were quickly flagged and downvoted into oblivion. At Tildes they mostly didn't show up at all, or if they did I missed them.
That seems to be ending though. Now I see agent written pieces on the front page of HN with 100's of comments. There's always a highly upvoted comment pointing out that the piece is slop, but you have to scroll to find it.
The reason I use HN as an example is that it's full of people with extensive experience using AI agents who are in a position to tell if something is slop. And it looks like the larger part of readers (or at least commenters) can't tell the difference anymore. If that's true at HN, it's going to be true everywhere.
It is getting harder to tell when something is slop, people are post editing, handwriting intros and getting better at prompting to remove obvious LLM tells. But if you have any practical experience with these tools, it's still pretty easy to tell. Somewhere during post training certain patterns end up getting heavily favored. Interestingly, many of them happen across all of the frontier models. Em-dashes are the most famous but there are so many more. Most are rhetorical tricks or formatting patterns rather than punctuation.
Reading LLM prose, many of the tropes don't stand out at first, instead they land as strong writing. But after you see them repeat enough times they start to become obvious. Even putting the tropes aside, the hallmark of a lot of LLM writing is that it's more rhetoric than substance. Low signal, lots of noise.
I don't have a solution, it's starting to look like many (maybe most) people aren't going to be able to tell when they're consuming something that required minimal thought by the "author" who prompted the AI. Which is sad because, up until now, we could assume that, when we read something, someone cared enough to put time and mental bandwidth into creating it. That's become increasingly less true.
I suppose this post is me feeling wistful for the internet we used to have, written exclusively by humans. I continue to hope that people will reject slop at places like Tildes, but in order for them to do that they have to be able to identify it. Maybe people will get better at that, there is definitely a point where you've consumed enough slop that you can smell it from a mile away. But of course the slop is going to keep getting harder to detect.
I don't want to go as far as to say that slop will take over the internet, I think (hope) that people will keep wanting to read organic, human, writing. And that as a result we'll come up with strategies and solutions to support that.
It's a weird time. Right now every LLM blog post and article that goes viral is signalling to the prompter, and anyone watching who can tell what's happening, that there is demand for slop. And of course with demand comes profit. I think we're at the beginning of a steep curve.
42 votes
-
Comment on The boy that cried Mythos in ~comp
In my opinion writing blog posts with an LLM is disrespectful to readers for all sorts of reasons. This particular piece is full of confabulation that relies on the reader not fully understanding what the terms mean. It's written as though it's a slam dunk when really it's mostly just noise, much of it recycled from posts published shortly after the Mythos release.
The post doesn't exist to share knowledge, it's engagement bait slop, there's insufficient human care and thought put into it.
Using an LLM means you're not allowed to be critical of the companies that sell them or hype them up?
No... That's a remarkable leap for someone who's rubbed wrong by logical fallacies.
-
Comment on The boy that cried Mythos in ~comp
The irony of this piece is that you can play count the LLM writing tropes in it. If you make it a drinking game you'll pass out before the end.
My takeaway: what's the point? Mythos will be released publicly at some point and everyone can check for themselves. Until then, outsider speculation isn't adding anything useful to the conversation.
But if we're speculating... Based on how the technology works, and the history of model releases, most likely it'll be a Sonnet to Opus level of improvement rather than a game changer.
But the enhanced ability to turn bugs into working exploits likely really does justify caution. Which is not to say marketing wasn't a primary consideration.
-
Comment on What is Mastodon for? in ~tech
Mastodon shares the same issue with other federation attempts: The UI is high friction. Or to put it less charitably, it's just bad.
I saw the comparison to Tildes' invite only gate in other posts in this thread. The thing is, in the case of Tildes, it's friction that's very easy to understand. You don't get dropped into unfamiliar territory with the expectation that you make decisions which you can't possibly have sufficient context to make in an informed way (unless you're following a tutorial or have someone walking you through it).
It's easy to underestimate how much of a barrier that is. People have a lot to deal with in their lives, UX that doesn't account for that usually fails with general audiences. It's not even about whether it's difficult or complex, it's about whether it feels difficult or complex.
Having spent time thinking about UX in practical application, I find the design of something like Mastodon anti-social. It says "we're not going to care about how this feels to an average person". But the tool is for humans! Have empathy for them.
So that kind of design, one which doesn't respect the average person, sends a strong message that it's not a platform for the average person. Fediverse solutions were never going to replace, or even be a viable alternative to, mainstream solutions because they never really tried. They're tech solutions for tech people, and not even most of them.
I had no problem wasting some time finding an invite for Tildes. It was friction but it was immediately apparent what Tildes was and how to participate. It didn't feel unfamiliar.
I also wasted time figuring out Mastodon and Lemmy, but in that case it was because I'm a tech person and I want to understand the landscape. It felt unfamiliar and therefore unwelcoming. At no point in the process could I imagine an average person feeling good about it.
Still, if the quality of the content or conversation I'd encountered was significantly better than what you find elsewhere, I could imagine word of mouth slowly driving adoption until a theoretical critical mass created enough interest that people who understand UI/UX would be motivated to make it less bad. But the problem is that the average content on fediverse platforms isn't better at all. If anything it's a little worse, possibly because the experience tells a lot of great people with full lives that "this isn't for you".
Not to say that Mastodon isn't great for the people that like it. It only stops making sense when you compare it to something else. It's a category of its own, it won't be replacing anything any time soon.
-
Comment on The zero-days are numbered — Firefox team uses AI to find and fix vulnerabilities in ~tech
"The defects are finite, and we are entering a world where we can finally find them all."
It's unlikely we'll get to "zero bugs" any time soon, that's pretty hyperbolic. Security vulnerabilities being a subset of "bugs".
We're already at a place, though, where agent automation can find more bugs, faster, and for less money, than a team of humans can. Humans can still find things the agents miss, but Mythos is clearly finding more things that humans have missed than the reverse.
The thing that's interesting to me is that we're watching agent bug testing become pretty much mandatory for large software companies. If there's any chance that an AI agent can find security bugs that humans would miss, then they have no choice but to use agents, because malicious actors will otherwise use them to find the same bugs.
I think the post is right to imply that this shifts the advantage towards creators/defenders. Using agents for vulnerability review is (relatively) cheap and accessible and you have the advantage of free access to the source code. Meanwhile on the black hat side you need expertise, funding and expensive black market tools that AI agents could in theory make partially obsolete. And if black hats want to use frontier models to find vulnerabilities, they need to figure out how to defeat ever improving guardrails against malicious use. They can't switch to different models without guardrails because those models aren't good enough (so far) to be useful for security research against code secured by frontier models.
The arms race will continue, probably forever, but AI agents are good news for the white hat side.
-
Comment on Matt Mullenweg says “the wheels have fallen off” in wide-ranging WordPress critique in ~tech
It's hard to separate Mullenweg's abusive, self (and company) destructive behavior in the last few years from a rapidly changing landscape. Prior to 2026, people looking for alternatives to WordPress were definitely driven there by the fallout of Matt's actions. Going forward though, it's just not the same technological environment. WordPress is undoubtedly past its peak.
Maybe it's a good time to put aside the current state of things long enough to acknowledge how great WordPress was for the open web. The reason it powers ~40% of the web is that it's an alternative to corporate solutions which the community could rally around and businesses could adopt without spending too much or getting locked in to a closed source provider. Both at the platform level and the whole stack. A WordPress website wasn't just running on an open source platform, it was also on an open source LAMP stack. It was a constant reminder that boring, community powered, software could work in an increasingly corporate controlled internet.
It was a very very good thing in that regard.
I've been hiking and backpacking for a long time, in a lot of different conditions, and I've never encountered, or even heard of, a situation where someone was injured and the outcome was significantly affected by first aid supplies. Not saying it doesn't happen, but it's so rare that worrying about it seems like a waste of bandwidth.
The most important things to have with you are water purification and GPS. Offline mapping on your phone works great. Solar charging or a backup battery are nice.
If you're backpacking you're already carrying everything you need to survive for a long time if something goes wrong. If you're day hiking, and you don't fall off a cliff, you're rarely far from help.
One thing that's useful for overnights is the right clothes. Body temp is the next most important thing after water. Wool is magic, cotton is not your friend.
I suppose if you experience a lot of anxiety, feeling like you're prepared might be more important. And honestly in that case you don't want a splint or clotting agents, you want satellite communication and hiking buddies.