I think this article has a very fair take on the entire AI hype that is going on. Not just shutting it down, but actually exploring the claims and adding a bunch of nuance. They mention the Ludicity blog, and the article in question has been posted on Tildes before. Some people don't vibe well with the tone of the Ludicity blog; I think Colton's article touches on many of the same points in a much more measured way. So for those who didn't really like the Ludicity blog, you might find this one easier to get through.
Anyway, some things that stood out to me.
And it was... Fine. Despite claims that AI today is improving at a fever pitch, it felt largely the same as before. It's good at writing boilerplate, especially in Javascript, and particularly in React. It's not good at keeping up with the standards and utilities of your codebase.
This matches my experience as well. For one-off JavaScript scripts this is very neat; for anything more complex you will quickly run into various limits. One thing I have noticed across models (GPTs, Gemini, Claude) is that they also tend to switch up code practices, even within one conversation chain. With Node.js, for example, they will use the older require syntax and then suddenly switch to import. I have to wonder how many vibecoders have been bamboozled by that one. This makes most use cases I get out of these models neat but underwhelming.
I think a lot of the more genuine 10x AI hype is coming from people who are simply in the honeymoon phase or haven't sat down to actually consider what 10x improvement means mathematically. I wouldn't be surprised to learn AI helps many engineers do certain tasks 20-50% faster, but the nature of software bottlenecks mean this doesn't translate to a 20% productivity increase and certainly not a 10x increase.
This is also the impression I get from many people.
Look, I'm not an AI startup hater. If you want to plug OpenAI's API into your healthcare startup I might raise an eyebrow of concern over the risks, but I'd do the same for any startup desiring to move fast and break things in the medical field. My goal here isn't to say AI startup founders or investors are evil or even dishonest. My point is to say in the droll voice of your high school Econ 101 professor, "Incentives Matter".
I know quite a few people on Tildes also tend to visit Hacker News quite frequently. Something to keep in mind is that Hacker News is not representative of the entire tech space, as it is heavily aimed at startups and all the incentives that come with chasing VC money. What many people also don't realize is that Hacker News actively moderates against a lot of critical posts.
One thing I've noticed about all these characters in AI coding hype pieces is there is almost always a degree of separation from the writer to the actual productivity benefits. The poster is a founder, or a manager, or an investor, making grandiose claims about someone else's productivity. There's nothing wrong with secondary sources but if you can't find a primary source, you might start questioning the reliability of the information.
Something I have also noticed. Not only that: if the majority of AI work is supposed to be done by agents capable of handling the entire process, including making PRs, then why isn't there an explosion of such PRs across a large number of open source projects? Even more so, why am I not seeing these PRs on AI-related open source projects? To target it even more directly, why am I not seeing hints of this being applied on code agent repositories?
Call me naïve, but you'd think that these companies specifically would want to demonstrate how well their product works, making an effort to distinguish PRs that are largely the work of their own agents. Yet I am not seeing that; mostly I see these "secondary" sources and a lot of "trust me, it is there and it is amazing", which simply isn't backed by reality, as the author also mentions.
Presentations from actual engineers demonstrating how they achieve more productivity with AI are much more varied and much more muted in their praise. These demos show largely AI as the same technology you and I were familiar with before we got so anxious: a neat text generator that sometimes does magic but often requires you to take the wheel.
AI usage on open source projects, where the productive process can be publicly witnessed, has famously been a hilarious failure. I have learned things about how to use AI better from a few youtube videos. Here's a good one referenced in that Ludicity article above. I'll spoil it for you though, this engineer has not found the fountain of coding productivity.
With Node.js, for example, they will use the older require syntax and then suddenly switch to import. I have to wonder how many vibecoders have been bamboozled by that one.
They won't even notice. I still have to regularly explain why var is bad practice. Code style is the least of many people's concerns.
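For anyone wondering, a minimal sketch of the var problem (my own example, not from the thread): var is function-scoped rather than block-scoped, so it leaks out of the block it was declared in.

```javascript
// `var` is hoisted to function scope, so it escapes its block:
function varLeaks() {
  if (true) {
    var leaked = 'visible outside the block';
  }
  return leaked; // still accessible here -- surprising scoping
}

// `let` is block-scoped; referencing it outside the block is an error:
function letContained() {
  if (true) {
    let contained = 'only visible inside the block';
  }
  try {
    return contained; // throws: `contained` does not exist here
  } catch (e) {
    return e.name;
  }
}

console.log(varLeaks());     // "visible outside the block"
console.log(letContained()); // "ReferenceError"
```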
In this case it isn't just code style. Using require in a project where you have set "type": "module" in your package.json will cause errors, and similarly if you use import when you haven't set that.
Though I guess most models will be quick enough to fix it once you paste in the error with a "You are absolutely right that it doesn't work, I need to use <the other thing>. Here is the fixed version". So, most bamboozlement will be short lived.
I frequently notice the thing you mention about how copilot will output multiple incompatible JavaScript styles in the same code block (require and import). Then if I tell it that, it will output “ok yeah you’re right, here’s some completely different code”. It does this when it supposedly has access to the whole project, and should know from the package.json and file extensions if I want old js. So the only thing I can take from this is that at best it gives a half-assed reply without checking all the known parameters.
And this is just the most noticeable stuff. It constantly makes rookie mistakes but sometimes they’re hard to catch.
Another thing I notice is that sometimes it is typing out a long answer and then deletes it all, then says it can’t answer that. I think there is some separate module that is trying to prevent some answer that they want to block for legal reasons.
See, you are doing it wrong, all the cool kids use Claude Code, which actually does all the magic! (Sarcasm, just to be clear.) But yeah, context is something that very much remains an issue. Even with models that supposedly have a bigger context window (like Gemini), they tend to lose the plot beyond a certain number of tokens.
According to the crowd, it means you aren't subdividing your tasks and asks properly. To me, it indicates something I already knew: these are neat tools for isolated questions about isolated units of your software. Which is how I tend to use them, and honestly I don't really need agentic workflows and chains for that.
Another thing I notice is that sometimes it is typing out a long answer and then deletes it all, then says it can’t answer that.
I had this happen with Gemini the other day, where on one specific question it kept throwing technical errors.
Something I have also noticed. Not only that: if the majority of AI work is supposed to be done by agents capable of handling the entire process, including making PRs, then why isn't there an explosion of such PRs across a large number of open source projects? Even more so, why am I not seeing these PRs on AI-related open source projects? To target it even more directly, why am I not seeing hints of this being applied on code agent repositories?
Yeah, using Claude Code to contribute to open source projects is definitely how I first entered into "omg claude code is amazing, I can just have it grab a GitHub ticket and it just does all the work and even commits and pushes it". But what I quickly found was the opposite. Because it's hard to start from a spec-driven originating point, it's hard to draw up a proper architecture for Claude to go off of. Add to that that code style is 100x more important in open source repos than in enterprise, and that all documentation needs to be thorough and accurate. Even with large context models, it just can't quite get there yet.
I am going to need you to expand on that before I can genuinely reply to it, because my initial response is: the hell it isn't.
Hah. Definitely, let me dive deeper. And my opinion is definitely superseded by your own, as you have more experience with OSS. Well, I feel like maintainability is the name of the game with OSS. That's the key factor. So one of the ways for maintainability to remain high is for code style to be clean and predictable; it allows contributors and maintainers to more easily approach the codebase. But again, I am happy to consider alternatives I haven't thought of, because I have certainly not thought about this in depth.
Well, I feel like maintainability is the name of the game with OSS.
It's the same in many enterprise environments, at least the ones I am familiar with, where code standards exist and are enforced. But you are also contradicting yourself here. You initially said that LLMs will have more trouble with OSS software because code standards are higher. That is debatable in itself, but if we go with that assumption, and the code is also better documented, then LLMs should have less trouble with OSS code, as it is more structured, which would also make it much easier for these models to work with and comprehend.
Would it be wrong for me to do "normal" coding if a higher output path is available?
No. It's okay to sacrifice some productivity to make work enjoyable. More than okay, it's essential in our field. If you force yourself to work in a way you hate, you're just going to burn out. Only so much of coding is writing code, the rest is solving problems, doing system design, reasoning about abstractions, and interfacing with other humans. You are better at all those things when you feel good. It's okay to feel pride in your work and appreciate the craft. Over the long term your codebase will benefit from it.
This feels like the crux of it for me. So much of software engineering is people, and LLMs won't fix communication issues or point you to the subject matter expert. Faster coding is a marginal improvement at best.
Just to continue the AI discourse from yesterday, thought I'd bring this article here. I saw this posted on HN today and found it interesting and convincing
The AI will only generate boilerplate if that's all that you're telling it to do. I have greatly increased productivity using pointed, well written prompts that instruct the AI effectively.
Basically, I think it's a little bit of user error. But it definitely doesn't make everyone a super engineer.
I marked this as noise. AI summaries are often enough wrong. Even if they aren't, they remove all nuance and encourage people to just respond to the summary. Which is slightly less bad than people responding to titles, but still not something I feel should be encouraged on Tildes.
I'll only be worried about my job when an LLM can solve a mess like this: https://youtu.be/y8OnoxKotPQ