post_below's recent activity
-
Comment on Something big is happening in ~tech
post_below (edited)
Since the author is talking about coding as the killer use case that proves all the future use cases are coming... I want to add a sanity check from the perspective of someone with decades of software engineering experience and as much experience with modern LLM agents as anyone has at this point.
But first I want to acknowledge that he's right about a lot of what he's saying. These tools are more powerful than most people realize at this point. They absolutely are going to change everything on a scale not seen since the widespread consumer internet. And it's going to happen faster than the internet did. It's going to happen too fast.
That said, here's how you know you're reading hype: He never mentions that these tools are also drooling idiots. Maybe he really doesn't know. It's hard to imagine how that could be true, but I want to allow for the possibility that he really believes everything he's saying.
What I mean by that is that this author, and so many others before him, seem to be skipping over big chunks of the current reality and leaping forward into what might happen in the future. The truth is that, for coding, AI agents are miraculous. He's right about that. And also, they absolutely cannot autonomously create complex production level code to professional human standards. They just can't.
However, they appear to. The SOTA is in this odd place where agents can write large, fully functioning applications that meet most of the specs and pass all of the tests. Which is mind-blowing, groundbreaking, science fiction level stuff. While at the same time, under the hood, there are security flaws, bad patterns, wildly varied conventions and style, performance problems, redundancy, insane verbosity and so on. And the only thing that can fix those issues (or stop them from happening in the first place) is a human in the loop.
So on the surface it looks like a miracle, but underneath it's a mountain of tech debt and vulnerabilities just waiting for the right moment to fuck up everyone's day.
I feel like I should establish that what I'm talking about generalizes, as opposed to being the result of my not understanding how to use the tools. I've been using them extensively for quite a while now (in AI years). I have scaffolding and custom built tools and extensive initial context and skills and commands and hooks and custom sub agents and all the things, each of them iterated and pruned and updated for the latest generation countless times in an attempt to make the agents more reliable and less idiotic. And it works: some of the scaffolding I came up with in 2025 is now built into the latest SOTA harnesses.
I don't say that to paint myself as some sort of visionary; this is all new territory that everyone is figuring out together, and a lot of people have organically converged around various obviously effective strategies that the frontier labs then adopted. My point is only that I'm holding them right. I can get coding agents to do all sorts of exciting and useful things and I believe I have a solid, realistic understanding of what they're capable of and what their limits are. With humans in the loop they redefine software engineering. Without humans in the loop they are just very very impressive tech demos.
That could all change, they could get to the point where there don't need to be humans involved. If that happens then everything the author is saying is true and he's maybe not even stating it strongly enough. But it hasn't happened yet. The people who are saying it has are either deluding themselves or exaggerating for cynical reasons. I expect the fallout of that delusion to be difficult to miss in the software industry in the coming months and years.
I can prove it to you (here you can TLDR to the end if you don't care about using coding agents).
This assumes you have a subscription with one of the SOTA companies (for coding you want Claude Code with Opus 4.6 or Codex 5.3 high) that covers the necessary tokens.
First you'll want a decent AGENTS.md or CLAUDE.md for initial context. You can find decent starter context online if you don't want to spend too much time. Pick something reasonably lean, you don't want to use up too much context out of the gate. We can skip all of the more in-depth stuff for now.
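For illustration, a lean starter file might look something like this (the headings and rules are placeholder examples I made up, not a specific recommended template):

```markdown
# AGENTS.md — project context for coding agents

## Project
- Purpose: (one sentence describing what the app does)
- Stack: (language, framework, database; keep it short)

## Conventions
- Follow the existing code style; don't introduce new formatters or linters.
- Prefer small, verifiable changes over large rewrites.
- Never commit secrets; use environment variables for credentials.

## Workflow
- Write or update tests for any behavior you change.
- Record progress in PLAN.md after each completed step.
```

The main thing is keeping it lean: every line here is context the agent carries into every session.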
Next, give the agent a general spec for a non-trivial application that has a lot of user facing surface area. The more varied the surface area the better. It should ideally be a big enough application that the agent can't one shot it in a single context window. With current context limits that isn't too hard to do (unless you're paying a premium for an extra large context window). It should attempt to solve a problem that's not completely overdone (no glorified to-do list apps).
Next have the agent work your prompt up into a detailed implementation plan and have it write that plan to an .md file. If you have the time, ask it to run a Q&A session with you to refine the plan.
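As an illustration, the planning step can be a prompt along these lines (the exact wording and the PLAN.md file name are just examples, not a magic incantation):

```markdown
Read the spec below and produce a detailed implementation plan.
Write the plan to PLAN.md as a numbered checklist, with each item
small enough to complete and verify on its own. Include a final
testing phase covering the backend, the UI, and error handling.
Before writing the plan, ask me clarifying questions one at a time
until you have no more.

<spec>
(paste your application spec here)
</spec>
```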
Then instruct it to implement the plan while keeping track of its progress. This is a key step because you'll need to feed the plan and current progress into a new session when your agent runs out of context. Alternatively, you can have the agent hand off to a new version of itself automatically, or let it do context compaction and soldier on in the same session. Or, if you have a really big subscription, you can have an orchestrator agent run a bunch of sub agents automatically until the plan is finished. There are various ways to do it, each with pros and cons. Make sure the plan it writes includes a detailed testing phase so that it can iterate on any issues until it has something that works. You'll want to have some sort of browser (or device) automation wired up so it can test the UI/UX along with the backend. That's easy to do these days; the providers have solutions already built, or you can ask the agent to do it for you.
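One low-tech way to make that handoff reliable is to have the agent keep the plan as a Markdown checklist and tick items off as it goes, so a fresh session (or you) can see progress at a glance. A minimal sketch of checking that mechanically (assuming standard Markdown task-list syntax; the function name and the PLAN.md convention are just illustrations):

```python
import re

def plan_progress(plan_text: str) -> tuple[int, int]:
    """Count (done, total) checklist items in a Markdown plan.

    Matches lines like "- [x] step" (done) and "- [ ] step" (pending).
    """
    done = len(re.findall(r"^\s*[-*] \[[xX]\] ", plan_text, re.MULTILINE))
    pending = len(re.findall(r"^\s*[-*] \[ \] ", plan_text, re.MULTILINE))
    return done, done + pending

plan = """\
- [x] Scaffold project and CI
- [x] Implement auth endpoints
- [ ] Build dashboard UI
- [ ] End-to-end testing phase
"""

done, total = plan_progress(plan)
print(f"{done}/{total} steps complete")  # prints "2/4 steps complete"
```

Feeding the plan file plus a one-line summary like this into a new session is usually enough for the next agent to pick up where the last one stopped.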
Then, assuming you've given it sufficient permissions so that it doesn't need to check in with you, go do something else for a while. Sleeping is a great option.
When you wake there's a fair chance (though it's by no means guaranteed) that you'll be waking up to a working application that looks quite a lot like what you asked for. If it's your first attempt, you're welcome to take all the time you need to wait for the world to stop spinning.
If the app isn't working yet, you should be able to prompt the agent into getting it there fairly easily, but it depends on how hard the set of problems you're trying to solve is.
Once it's working it will be hard to deny that you just experienced some version of the future.
But now the next step is to ask another model to audit the codebase. For example, if you built it with Opus, ask Codex to take a look. It shouldn't cost more than about $5 in tokens for a thorough audit, a lot less if the codebase isn't too big. At the same time, start a fresh session with your main model and ask it to do an audit too. Have both agents write their findings to a file when they're done.
I guarantee the list of issues they find will be extensive and that it will reframe your perspective on the miracle you just experienced. But you're not quite done: instruct your main agent to fix all of the issues and then repeat the audit process. Prepare for another long (but shorter) list of issues. Keep repeating until the agents stop finding issues. Note that the audit prompt is important; it needs to be thorough. You can download pre-made skills for that if you're not a coder. Multiple specialized auditors with different disciplines work best (security, logic, maintainability, etc.).
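For a sense of what "thorough" can look like, here's a sketch of one specialized auditor prompt (illustrative wording only; the output file name and the checklist are my own examples, and pre-made audit skills go much deeper):

```markdown
Act as a security-focused code auditor. Review the entire codebase
and write your findings to AUDIT-SECURITY.md. For each finding,
include the file and line, a severity, a one-line description, and
a suggested fix. Look specifically for: injection and XSS, broken
auth and session handling, secrets in code or logs, unsafe
deserialization, missing input validation, and risky dependencies.
Do not fix anything; report only. Keep reviewing until a full pass
over the codebase yields no new findings.
```

Run a sibling prompt per discipline (logic, performance, maintainability) and diff the resulting reports against each other.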
Once the agents are satisfied that the codebase is perfect, take a look at the codebase yourself. Or if you can't fluently read code, bribe someone who can. If you are really doing a best effort code review, I absolutely guarantee you will find more issues, some of them shocking.
And that's doing the bare minimum to wrangle the agents. My overlong post could be eight times as long with instructions on how to make the agents suck less, and still, at the end of the process, you would be finding serious issues.
That's the (real) current state of the art in autonomous coding agents and no amount of prompt engineering can navigate around it.
A human in the loop, on the other hand, makes for a very different outcome; that is, until you get overconfident and let the agent write too much code without thorough review. Then, again, issues are guaranteed.
TLDR
All of this to say: It's still safe to ignore the hype from people like the post author. The AI apocalypse could come at any time, but it's not on the horizon yet based on the current state of the tech.
And also, listen to the more level headed people who are saying this is a paradigm shift, because they are not lying.
-
Comment on How The New York Times uses a custom AI tool to track the “manosphere” in ~life.men
post_below (replying to: "if it's not painful and heated and time consuming to hear that the other side is saying")
That's a good point, finding out what the demographic is talking about without the mental and emotional load of wading through the toxicity could be useful outside of journalism too.
Tracking what's controversial even within an echo chamber could be so important for a party trying to come up with something that would resonate
I've been getting a handful of updates in various fields that are AI summarized digests of what people are saying in online spaces (Reddit, Twitter, Discord, etc.) that I'd never otherwise spend enough time on to have a sense of the pulse. Complete with source links in case my faith in the human race is feeling strong enough to read the actual comments section. I could see a larger scale version of that being really useful politically.
-
Comment on How The New York Times uses a custom AI tool to track the “manosphere” in ~life.men
post_below
This is a good example of an AI use case for a task that likely wouldn't be done otherwise (listening to loads of manosphere podcasts and summarizing them every week forever). There's a job no one at the NYT wants.
It replaces a lot of human hours but they're hours that likely wouldn't have been spent in the first place because it's not worth the resources (or pain). No doubt they have summaries set up for other "spaces" too.
Really smart use of LLM tools. This isn't the first time I've been surprised by the NYT's tech savvy.
-
Comment on AI doomers: What uses of generative AI are you actually excited about? in ~tech
post_below (replying to: "feels both achievable with current tech and also pretty awesome")
I'm not sure about alarms but I know someone with hearing issues and I looked into it recently, there are at least a half dozen companies making live caption glasses for hearing impaired people. I agree, it's a fantastic use case.
A few others: research, in so many fields but especially in genomics. Medicine, not to replace doctors but to improve the process (screening, data management, diagnostic aids). Modeling (weather, climate, geology, etc.). Gaming (the most popular speculation is MUCH better NPCs). Also world building in general, for games of course, but the possibilities for "world models" in a variety of areas are near unlimited. If you can train on the "world" of a domain (say, an ecosystem), the resulting inference could be remarkably useful. I expect there will be a lot of annoying hype around world models this year.
Regarding spiraling, I get it. I don't want to take this thread into the downsides direction but they're as unlimited as the upsides.
In spite of that, it's going to happen no matter what we think about it. We can hopefully support the creation of regulation and guardrails but one way or another the technology is going to keep exploding in all directions.
If some day we look back and collectively decide that AI was a huge mistake, not one of us will be able to realistically say "I should have done more to stop it" because there's just not currently anything an individual can do relative to the unprecedented amount of capital involved. Even large groups of individuals don't stand a chance.
I don't say that from a defeatist perspective. At any point in history there are large scale developments and circumstances beyond an individual's control. With just the one life each, I think we should appreciate the upsides where we can and focus our energy in the places we actually can make an impact. There have always been too many problems for one person to solve.
-
Comment on Building a C compiler with a team of parallel Claudes in ~tech
post_below
To be fair to them, they didn't claim it was glorious in the blog post. It mentions that, with all optimizations enabled, it performs worse than GCC with all optimizations turned off. It also talks about the code quality being subpar.
The frustrating thing is that's not likely how the media and bloggers will talk about it. It will be another round of AI doom: "it's coming for your job". It will fuel the hot takes that AI can now truly just write software. It will help suck in a new round of vibe coders. Except this year they want to be called "vibe engineers".
What it really is, like Cursor's far worse and more expensive example before it, is a somewhat interesting proof of concept. A few short years ago the possibility of agents creating any non-trivial application autonomously was absurd.
I hate the hype too. But if the well wasn't poisoned by hype, and the airwaves saturated with AI discussion, we'd all be at least a little bemused by this.
-
Comment on Any software engineers considering a career switch due to AI? in ~comp
post_below (replying to: "I really love building stuff and solving problems so maybe I go back to school and switch to some other flavor of engineering")
Option 2 is to build your own thing, which you can get started on any time, even keeping your current job and income. It's not for everyone but you'd get to decide exactly how much hands on building and problem solving you'd get to do. The trick IMO is finding a problem you really care about solving, rather than solving a problem just to make money.
-
Comment on Passing question about LLMs and the Tech Singularity in ~tech
post_below
Ah, yes in that context we're nowhere near an explosion. Or at least the existing technology doesn't put us near one; who knows if there will be breakthroughs in the near future.
Yes, LLMs are already helping move the technology along faster than humans alone could do it. I don't think there's any doubt of that. The only question is if the path leads to the vicinity of AGI, which I think is safe to answer yes. It doesn't matter if LLMs themselves will have anything to do with AGI; they will definitely accelerate many aspects of technological advancement and some of those will contribute to eventual AGI.
-
Comment on The AI industry doesn’t take “no” for an answer in ~tech
post_below
I'd replace "humans" with "mammals", and it's a well known part of the process of evolution. Calories are historically expensive and both movement and cognition use a lot of calories, so organisms evolve to be as lazy as they can get away with while still surviving effectively.
-
Comment on Passing question about LLMs and the Tech Singularity in ~tech
post_below (replying to: "In other words, are we (the humans) already starting to use LLMs to improve their code faster than we humans alone could do?")
As others noted, LLMs don't exactly have "code". However, LLMs have been used as a part of the LLM training process for some time. In that sense, yes absolutely.
In another sense: the agent harness, which is an increasingly important part of how effective LLM powered agents are at real world tasks, also yes. The big model companies use coding agents extensively when creating the harness and scaffolding, which shows in the sheer volume of bugs the harnesses have.
"Wouldn't this be the actual start of the predicted 'intelligence explosion'?"
I didn't read the article you linked so I may not have all the nuance about what intelligence explosion refers to. But in terms of practical application of AI agents at real world tasks we are undoubtedly in the midst of an intelligence explosion.
In terms of actual intelligence, as we might define it, LLMs arguably have none at all. It depends on how you frame it. They provide an illusion of intelligence that is so good they can do intelligent things, which is to say things that could previously only be accomplished with human intelligence. But it's achieved, essentially, through advanced pattern matching rather than anything that could be described as understanding. It's hard to imagine a path from there to true AGI but at the same time it's not difficult to imagine that at some point the illusion of intelligence could get so good that it's practically the same as the real thing for many applications.
It's all so weird.
-
Comment on Why there's no European Google? in ~tech
post_below
I missed this thread when it was posted but my late entry is this: putting the author's stereotypically French delivery aside, I think their point is fantastic. What if we looked at value differently? We should do that.
What if Torvalds is more successful, as a human being, than Jeff Bezos? If the score isn't kept in dollars it's not a difficult case to make.
I know this seems naive to the point of absurd from an American perspective, where the market devours idealism without even trying, but from humanist perspective it's entirely valid.
-
Comment on Someone made a social media website for AI agents in ~tech
post_below
I doubt the goal of this project is training, more likely it's just a way to create buzz. However if it was about training, it's a dramatically cheaper way to create slop than paying for your own tokens. You get hordes of other people to spend tokens for you.
-
Comment on New books aren’t worth reading in ~books
post_below
Alternative title: "New xitter posts are not worth reading"
But I did read it, and now I'm here with my life and perspectives unimproved and unchanged. The post is clear ragebait; it's confrontational out of the gate, it just assumes rage. Maybe that's just how everyone talks on Xitter now, I don't go there often. It also establishes "this debate" out of thin air, though the nature of the debate is unclear. I assume they're using debate as a way to denote the rage they expect to elicit. Then at the end it just fizzles, having failed to pay off any of its claims.
I didn't read the replies because I don't want to log in but I can guess what they look like. With 300k views, the post accomplished its goal: engagement with very little investment in time or mental energy on the author's part. The post is more interesting if you read it as a thinly veiled cry of frustration at the mental state of being overly online that the author is drowning in.
However, I enjoyed this bit:
"The average ancient historian led troops, tutored a prince, governed a province, advised a king, made a fortune, fell from favor, was exiled, and buried 7 of their 10 children. The average modern historian passed a few tests then wrote a book on their laptop next to their cat."
It's kinda true! I disagree with most of the rest but that part changed the shape of my mouth.
-
Comment on Gold tops $4,900/oz; silver and platinum extend record‑setting rally in ~finance
post_below
Congrats on the success of your investments! Usually gold only behaves like this when there's a recession, and it's fantastic to get in early enough because there's so little risk that way. Historically, while gold can drop in price, it usually stabilizes pretty quickly and, so far, always comes back at some point. If you buy into one of its upswings early enough it's very safe relative to the potential returns.
It's been wild the last couple years, pretty much every major financial institution has been low on projections and has revised their numbers upwards multiple times a year. Currently all the institutions that haven't already posted new predictions in January are due to upgrade their projections in the next month or so because the price is already beyond their last quarter of 2026 projections and all of the signals are still strong (de-dollarization, geopolitical instability, central bank diversification, dollar inflation, falling interest rates, economic uncertainty). It's a perfect storm that sucks overall but it's good for precious metals.
-
Comment on Microsoft gave US FBI keys to unlock encrypted data in ~society
post_below
I just used (part of) the title from the article. I'm not attached to it if someone wants to change it.
However it's technically accurate, the article talks about specific examples.
-
Comment on Microsoft gave US FBI keys to unlock encrypted data in ~society
post_below
"Microsoft confirmed to Forbes that it does provide BitLocker recovery keys if it receives a valid legal order. 'While key recovery offers convenience, it also carries a risk of unwanted access, so Microsoft believes customers are in the best position to decide... how to manage their keys,' said Microsoft spokesperson Charles Chamberlayne."
"He said the company receives around 20 requests for BitLocker keys per year and in many cases, the user has not stored their key in the cloud making it impossible for Microsoft to assist."
Just a heads up. For the moment it's still possible to use Windows without being logged in to a MS account and, even if you are logged in, you can choose not to store your bitlocker keys in the account.
-
Microsoft gave US FBI keys to unlock encrypted data
37 votes
-
Comment on Wilson Lin on FastRender: a browser built by thousands of parallel agents in ~tech
post_below
It's too bad to see him doubling down. I finally got around to watching the video interview, or most of it. In the CNN website part (the only part that wasn't cherry picked by the Cursor dev, likely with pre-cached elements), Simon (or whoever was controlling the cursor at that point) starts to scroll down and quickly stops when it becomes apparent that there's just blank space below the fold. Simon communicated more about his intent by pretending not to notice that than anything he wrote in his post.
-
Comment on Federal officers kill another citizen in Minneapolis, National Guard activated in ~society
post_below
I don't really have anything useful to add. It's horrifying: both this time, the last, and the overall frequency of murders. You'd think they would have walked it back for at least a little while after Renee Good.
It's like they're pushing harder, to communicate that yes, they actually can get away with executing people with impunity.
-
Comment on Wilson Lin on FastRender: a browser built by thousands of parallel agents in ~tech
post_below
Add to this that when people tried to compile and run it shortly after the release, they couldn't get it to compile, nor could the majority of previous versions compile. When it finally did start working there were some unusual commits shortly before that some speculated were actual humans trying to duct tape it together. Disclaimer: that last bit is entirely a rumor as I didn't look at the code or try to compile it myself.
The reason I didn't look, aside from lack of interest, is that I know what GPT and Claude output under the best of circumstances, and it's not something you can mash together into a working browser from scratch. It's not even close.
But with $80,000 in tokens (their estimate), you can get it to pull together a bunch of libraries to do the real work and end up with a demo that works in the sense that you can get it to kind of display a web page but fails to be actually useful for any practical application. A handful of humans could do better in less time with a bar that low.
Willison posts great stuff, I enjoy his blog, but a puff piece is the wrong angle here. This was a publicity stunt for Cursor, relying on the AI crazed tech media not asking too many questions. Simon is an engineer, he could have told a much better story about what Cursor "achieved".
It is a really interesting proof of concept about agents orchestrating themselves, but what it also proves is that even with a blank check and a server farm agents can't make usable, sophisticated software themselves.
Another missing part of the story: Cursor's user base is increasingly vibe coders. Engineers have been switching to better options in droves for at least the last 6 months, which accelerated with the release of Sonnet 4.5 and then Opus 4.5. And started moving even faster when they scrapped their unlimited auto loss leader. So a "demo" like this appeals directly to their target audience of people who can't read the code, and therefore don't know it's slop.
-
That's really cool, let me be the first (Tildonite) to validate your experience! AI can be super fun while being genuinely useful.
The newer generations of coding agents are perfect for what you're describing, and you're obviously using them well to get the results you're getting.
Just one caveat: If you decide to publish some of your apps, don't collect user data. And be cautious about your own PII, secrets and financial info. Without being able to (fully) read the code you can never know if you're being responsible with that data and it's pretty much guaranteed at this point that you wouldn't be.
Outside of that, go wild.