Most people, even highly technical people, don't understand anything about AI
This is always weighing on my mind, and it comes on the heels of this comment I wrote.
The tech sector, especially the hyper-online portion of it, is full of devs who were doing some random shit before and shifted to AI over the past few years. Don't get me wrong, I'm one of those: in much the same way, very shortly after the release of ChatGPT, I completely changed my own business as well (and now lead an AI R&D lab). Sure, I had plenty of ML/AI experience before, but the sector was completely different, and aside from some fundamentals, that experience has practically no bearing today.
The thing is, LLMs are all in all very new, few people have an active interest in "how it all works", and most of the sector's interest is in the prompting and chaining layers. Imagine network engineering and website design being lumped into the same category of "Internet Worker". Not really useful.
Some reflections on the state of the business world right now...
In most SMEs: complete ignorance of what is possible beyond a budding interest in AI. Of course, they use ChatGPT and see that their social media posts are easier to write, so they fire some marketing consultants. Some find some of the more involved tools that automate this-and-that, and it usually stops there.
In many large companies: Complete and utter panic. Leaders shoving AI left and right as if it's a binary yes-ai/no-ai to toggle in their product or internal tools, and hitting the yes-ai switch will ensure they survive. Most of these companies are fuuuuuucked. They survive on entropy, and the world has gotten a LOT faster. Survival is going to get much harder for them unless they have a crazy moat. (Bullish on hardware and deeply-embedded knowledge; Bearish on SaaS and blind-spend; Would short Palantir today if I could)
In labs just like mine: I see plenty of knowledgeable people with no idea of how far-reaching the impact of the work is. Super technical AI people get so biased by their own knowledge of the flaws and limitations that they become blind to what is possible.
And in tech entrepreneurship, I see a gap forming. On one side, techies who have no respect for "vibe coders" on the grounds that they're not real programmers, don't end up using AI, and fall massively behind, since execution (not code quality) is everything. On the other, vibe coders with zero technical prowess who get oversold on the packaging, end up building dead shells, and are unable to move past the MVP stage of whatever they're building.
And the more capable the tool you're using is, the more WILDLY DIFFERENT the experience can be depending on usage and configuration. I've seen Claude Code cause productivity LOSSES as well as create productivity gains of up to 1000x -- and no, this isn't hearsay, these numbers come from my own experience on both ends of the spectrum, with different projects and configurations.
With such massively different experiences possible, and such incredibly broad labels, of course the discussion on "AI" is all over the place. Idiocy gets funded on FOMO, products get built and shut down within weeks, regulators freak out and rush meaningless laws with no positive impact. It's just an unending mess.
Because it's such a mess, I see naysayers who can only see those negatives and who are convinced AI is a bubble just like that "internet fad of the 90s". Or worse, that it has zero positive impact on humanity. I know there are some of those on Tildes - if that's you, hello, you're provably already wrong and I'd be happy to have that discussion.
Oh and meanwhile, Siri still has the braindead cognition of a POTUS sedated with horse tranquilizer. This, not ChatGPT, is the most immediately accessible AI in the pockets of a quarter of the western world. Apple will probably give up, buy Perplexity, and continue its slow decline. Wonder who'll replace them.
Can you tell us something about what you are doing with your company? What do you make/research? How do you make money?
You make a lot of claims in your post but they are almost all backed by arguments of authority, not actual justifications.
That can be fine, if you actually are such an authority on the subject. So I would love to hear more about it.
FYI: Myself, I am a senior cloud security engineer. I use AI mostly to iterate quickly with POCs for apps or scripts. I believe it's a great tool to get some basic things working or to learn faster when starting with something you are inexperienced with. But it also often really sucks and wastes an enormous amount of time making up nonsense for me to falsify.
We're an AI lab operating as a venture studio and yeah this is an opinion piece. (How we make money: Consulting + AI-based spinoffs, for example this chatbot platform).
Since you're a cloud engineer, I would recommend you look into coding agents for debugging specifically. I've used Claude to find stupidly obscure issues, as it can use any CLI tools at its disposal to figure out what's happening. One particular issue I remember had to do with a weird docker+WSL interaction, and it figured it out by a mix of experimenting with the CLI, searching the web, and modifying the docker config + restarting it. I would have done no differently, except it solved it in <10min where it would have taken me a full day.
We're actually experimenting with how good coding agents can be at IaC - suspicion is that they can be extremely powerful at exactly this if we set it up right.
Hell, two days ago I got it to help debug intermittent wifi issues on an Arch Linux machine. I fed it dmesg output and it knew what to do. Haven't had those issues since...
How do you square this against the fact that one can only rent, not buy, "Claude Code" and similar branded code manipulation services? It seems inadvisable to invest in and adopt workflows based on cost/benefit analyses where the cost side is heavily subsidized by your vendor's VC runway. Are you going to owe that day saved back with interest when you need to adapt your whole workflow to a new checkpoint, or when they start putting the prices up, or when the whole operation is strategically acqui-hired and you need to build an expensive, worse clone in-house?
Never do I hear "all devs should get good at slinging code with two 4090s, TurboLlama 5000, and this Emacs plugin". It's always stuff that can't actually be gotten or controlled, like people forgot why building GCC was important and what it won us.
Right now I don’t square it. I absolutely agree with you and I understand the implications… the thing is, this is such a massive differentiator that there is no catching up at the individual level.
My company is actually working on researching better ways to democratize access. But we can't do it puritanically: being practical is how we actually get results.
Use these incredibly powerful tools to shape and create the future you believe is better for everyone. Even if that future doesn’t include said tools.
I think understanding the technical facets of AI and understanding the potential business applications and products that can be generated with AI are different things. I think when you say “understand anything about AI”, you are emphasizing the latter, but I cannot totally tell.
I think the training, fine-tuning, and alignment areas are also active sectors of research at universities and in private labs, especially in Silicon Valley. I think what you notice is less that people lack an "active interest" in the field and more that anything that isn't prompting/chaining is really hard.
I consider myself pretty capable as a mathematician, but the techniques needed to make real progress in these poorly understood mathematical spaces are beyond my understanding. There's a reason those AI PhDs have salaries as high as they do. I think people want to get involved, and the prompting space is the only accessible place. Of course, they will call themselves AI engineers regardless, if only to demand greater pay or raise more money.
I think understanding the "impact" of AI is far more societal than technical. Even narrowing our focus to coding, and making the claim that good AI will replace coders, you still need to answer: why do companies hire programmers? I've heard very compelling arguments for why demand for programmers will rise, fall, and stay the same in response to AI developments, because each argument makes different assumptions about what value a programmer adds to an org.
I’m not sure anyone informed would claim that LLMs have zero positive impact on humanity. The tools are already accelerating research and helping us achieve things marginally faster.
The question is about "net impact", or more accurately, "projected net impact", since the discussion largely hinges on the question: what will the companies do when their VC money dries up? Will they go bankrupt and cause a recession? Will they sell their tech to the highest bidder without ethical boundaries? Will the financial incentive to find a use for AI be so great as to compel companies to lay off swaths of employees? Or will people cough up money and everything be fine?
When it comes to these societal questions about the “impact” of AI, I do not think you are wrong. I don’t have strong answers to these questions except that I think many people are taking their assumptions about the world and then believing their conclusions are obvious. I think there remains quite a bit of uncertainty, and so “not understanding anything about AI” seems fairly reasonable to me.
Frankly, you have written a lot without actually providing much information beyond the most surface level. You do link to your previous post, and I do remember seeing that and the Hacker News comments. Which makes me think this post is likely an outlet for frustration with the people who completely misunderstand how LLM training and training data work.
Where are the examples? I get that this is probably internal to companies and that you can't go into extreme detail, but this just says nothing at all. A 1000x productivity gain is insane; just casually throwing it in here without any substance makes me press the "doubt" button. In fact, it is these sorts of claims without substance that give food to naysayers. I mentioned it in a previous discussion as well
Can you blame naysayers in the face of company leaders trying to shove AI into everything, while the examples of it working to the claimed degree are rare at best? To be clear, I am not a naysayer, but I am also not completely onboard. I get good, but underwhelming use out of LLMs for the most part.
Sure, I'll give you a practical example.
http://github.com/jleclanche/connections-tui
This entire app was created in around 5 minutes. It took 3 prompts and a couple minutes of testing.
This would have been an entire side project for me, which would have easily taken at least two full-time weeks (or more if kept as a side project) to get to the level it's at now.
The fact it even exists is entirely AI-driven. I would not even have started it if AI wasn't a thing.
So... ~80 hours -> 5min.
I don’t think this really proves your point because by your own admission this wouldn’t exist without AI to do it for you, so the “80 hours” number is purely theoretical. What you’re really saying is that AI allowed you to create in five minutes something that wouldn’t be worth your time otherwise.
Do you have an example of actual productivity gain? When I’m making little toy apps I’m usually doing it to avoid being productive…
Just to illustrate the insanity of 1000x productivity, that means what you’d normally deliver in two years (assuming a forty hour work week) you deliver in four hours with AI.
I'm a pretty big naysayer here, but I see both sides on this. On one hand, it is nice to produce some bit of content that would be impractical otherwise. The value of making quick prototypes to test something out cannot be overstated. That's essentially what game engines did for independent game development in many ways. We put up with a lot of cruft because the time saved (or what becomes possible at all) offsets those shortcomings.
That's what "move fast and break things" was supposed to be about. To figure out pain points and design shortcomings before you commit. AI seems to be great at that. People don't seem to be great at differentiating "prototypes" from "full product" anymore, though (another issue with the modern games industry).
But I think the pattern I see here is scalability. 3 prompts can make this quick GUI project. But (and feel free to correct me, my LLM knowledge is shallow) the approach has to change for any professional-level work like:
These all need entirely different approaches to how you prompt if you want success. Because LLMs, as of now, can't really iterate on their own work with the same prompting used to generate the scaffolding. You need to isolate them to small sections of the codebase and put them back in that small space where they thrive. And that's where the engineering mindset comes in handy: the ability to break down problems, to understand the data being worked with, and to understand the pros/cons of architectural decisions. That's what separates an engineer utilizing AI from a "vibe coder": the vibe coder simply can't scale.
I don't really have faith that anybody "gets this", though, as of now. I just don't get that "vibe" that anyone's really out there wanting to learn how to "really use AI". Only evangelize it. That's the frustrating part for me (especially in games with generative AI; it's, to be frank, a bloodbath of caustic discourse). Perhaps this is so new that such workflows are kept within companies. Maybe the current times are so bad that people and companies are aware but don't care; they just need something to pay rent, or to please shareholders.
How do you define productivity?
I spent my life (23 years in engineering + various other fields) working on side projects. Some of them have led to startups. I professionalized my ADHD and turned it into a venture studio / AI lab.
I wrote a Hearthstone simulator and the next thing I knew, I was CTO of a video game data analytics startup. This is the stuff I used to do to "avoid being productive".
This little TUI game would not have existed because I would not have bothered: my off-time is no longer spent playing World of Warcraft but rather dealing with a million other things life is throwing at me. I'm officially too busy to ever have side projects worth starting.
So instead of spending some of that off time on a passive activity, I was able to spend it on a creative activity and produce something, which /would/ have taken me several weeks had I gotten past the starting step.
I'm flabbergasted people can look at these examples and still not Get It. What is it with that?
A Tesla can accelerate "0 to 60 mph in as little as 2.1 seconds". To illustrate the insanity of that number, that means a Tesla can reach the speed of light in less than a year.
My personal definition of productivity is very broad and would encompass making little toy apps for fun, because doing things you enjoy makes you feel better. But we don't live in that world, and considering that so often I see vibe coding presented as something to speed up one's actual job, I'll use the very reductive definition of progress on task per unit time allocated to task. I procrastinate on little other projects or tangential things a lot and so I end up producing less task progress per unit time than if I wasn't doing that.
I think these illustrate the same point (and you're right about the Tesla analogy), which is scalability. A Tesla can accelerate at about 12 m/s/s up to a limit, and an AI can accelerate programming tasks 10/100/1000x up to a limit. But in the same way I haven't seen a Tesla go the speed of light, I haven't seen an AI allow a developer to produce 1000x more task progress per unit time on an established codebase or within an organisation (and I know people who've tried!), only when it's small, isolated applications like this. If there are some examples I'd love to see them.
What you are showing is a self-contained tech demo of a game. That LLMs are pretty good at one-off, relatively simple applications and games is something I have no doubt about. But I am failing to see how this translates to a 1000x productivity gain?
Looking at the code, even though I haven't done anything with Go, it isn't that complex. Not counting the dependency declaration files, we are barely talking 800 lines of code here.
And yes, it likely would also have cost me a fair amount of time to code something at the same level, but after that I'd have a fairly good understanding of Go, making subsequent projects much faster.
The fact that you now can prototype a game like this in minutes is absolutely amazing. And for applications like a game like this, I'd say there is nothing wrong with that. I do think that claiming it cost you just 5 minutes is slightly disingenuous, though. Writing specifications should be part of that calculation, and writing good specifications will take more than just five minutes.
But once you move into more critical applications, it will be more than 5 minutes regardless of specifications. That is, if you are doing your due diligence. You'd need to verify that the code really does what you need it to do, which at some point means having an understanding of the language it is written in, doing code review, etc. In the best case scenario, you have code generated that does everything perfectly the first time. That's rarely the case, and that's where the trouble starts, which just as easily can lead to productivity losses. Most software work is not a zero-to-one affair, it is a "one to N" type of deal.
Basically, the claim of a "1000x productivity gain" seems to be derived from a scenario that represents a tiny fraction of a professional software engineer's work. Frankly, that is just being cheeky with statistics. Sure, some highly specific tasks can be hugely simplified, but that doesn't translate to such insane overall productivity gains. In fact, I'd say this example is a prime case of the exact kind of decontextualized hype that creates the messy discourse you were lamenting in the first place.
Again, I am not trying to dismiss the fact that there are some pretty good applications for LLMs, as I hopefully have made clear I make plenty of use of them in my work. It is the messaging that I think is off.
I am counting the writing of the specifications. The prompts were incredibly simple. I’ll show you later, I’m not on my laptop.
I didn't claim anywhere that it's a consistent and continuous 1000x that persists in more critical applications; I even said I have seen instances of negative productivity. But it did achieve a task in one thousandth of the time it would have taken me to do it. It's not the only time that has happened. Claude Code existing at all even enabled me to do this; the entire thing would NOT exist otherwise as it would not be worth the time investment.
Latching onto the wrong parts of what I said is the problem here… re-read my post and consider what you're responding to, because I feel like you're reading something I didn't say. This is again representative of what I find very often in AI discussions: biases taking over, causing people to imagine things that don't exist. Human hallucinations, so to say.
When you write about "productivity gains of up to 1000x" without clarifying it is for specific isolated tasks, it is not that surprising that readers interpret this as a general claim about productivity.
Again, that in some instances single tasks can be sped up through the use of AI is a given as far as I am concerned. But at the same time, it isn't a given that this is the consistent result. As you said yourself, sometimes it is detrimental to productivity. In my experience this can happen with the same set of tooling, the same engineer and the same type of usage. Overall, I personally think it still amounts to a net increase of my productivity, but nowhere near 1000x.
Anyway, even if we bring the focus back to single isolated tasks. If with the same tooling the outcome can be so wildly different, then I simply don't see how you can blame people for not understanding AI. Which might be the point you were trying to make, reading this back:
But, to me, it also puts cause and effect in the wrong order. The makers of these tools and models are the ones who are pushing this, ignoring the fact that the outcome is often a roll of the dice. These companies have unlimited funds in the form of billions of VC funding and are doing everything in their power to sell AI to the masses. So yeah, of course the discussion is all over the place if one side is heavily, and I mean heavily, influenced by billions upon billions of VC funds and marketing.
The same sort of marketing that also heavily features claims about productivity increases, overall productivity at that. You might not have intended your point as such, but that's the landscape we are operating in.
Look, you also replied to specific parts of what I wrote while not responding to others. I am assuming you did read those parts and considered them as context when writing your reply. It is what I did. The productivity claim line just jumped out to me as a specific example of tone and framing.
Multiple commenters are asking for clarification and context. When several readers independently reach similar conclusions about what you wrote, it might be worth considering whether the message itself could be clearer, rather than assuming everyone is hallucinating.
I feel like they did clarify that this productivity gain was for specific tasks. From their original post:
1000x sounds like a lot to me, but then I'd probably need to spend a full week building a project like that. Factoring in sleep time, 5 minutes vs one week of full time dev is actually about 1000x.
The part you quote talks about different projects and configurations. To me the implication, even after clarification, still reads broader than single isolated tasks like a standalone game. If it had said something like "AI can speed up specific greenfield tasks by 1000x" then I would have had no issues with it at all. Because that is absolutely right.
Yeah, that was already clarified by OP as well. Without repeating myself too much, I am not saying that the math is wrong for that one specific, isolated example. But that using one highly specific isolated outlier to make "1000x productivity gain" claims is not productive either, feels very reminiscent of the overall industry hype, and ironically makes it part of the exact messy discourse OP was complaining about.
You're reading a lot in my post that isn't there, such as the fact I'm surprised by any of this. I am not… in fact I pretty explicitly said in my post:
Key word being "of course". Yes, it's not surprising. Again - opinion piece, no blame being assigned anywhere; I'm very matter-of-fact about life.
And yes I only responded to one part of your post because I was on my phone and could only give a few minutes of my time to that response :)
But I do see this, I think, very defensive attitude towards the "1000x" number. It's a scary number because it has a ton of implications in terms of job security, society, everything. I guess it's normal to be defensive over it and want to contradict it by saying "Yeah, but it's not 1000x all the time" -- thing is, nobody said it was. What I said is that it's possible to extract this, and doing so creates opportunities that did not exist before. It's not like I wrote that little game 1000x faster than my "competition" -- it's that, because I could write it 1000x faster, I just did it, whereas months ago, I never would have done it. Now it exists, where before it never would have.
It's not "work time" that was replaced - it's 5 min watching a chill youtube video or something.
For the record, here were the prompts, verbatim (yup):
and .. that was it. Not much in the way of speccing, was there? :) Yeah, it's fucking insane.
I actually quoted that exact same passage in my earlier comment, so we're in agreement there. As I said in my reply to Wes here, I don't have an issue with the 1000x number, if it had stated that AI can speed up certain greenfield tasks by 1000x I'd have just read it and agreed. So what I am saying isn't so much surprise at the messy discourse. It is about contributing to it unintentionally.
If you go back to my original comment you'll see that there is more to it than me just dismissing a 1000x productivity claim:
Your focus so far has been mostly on defending that second point in a highly specific context, and I get the feeling you've now placed me firmly on the naysayer side of things.
As a final note:
I am honestly not familiar with the NYT Connections game, so I did not recognize it. But after looking it up, we are talking about a game that is the second most played game published by the Times, has its own Wikipedia page outlining the gameplay mechanics, and you provide it with highly structured JSON data. Which does make it impressive, but also makes this a bit of a unicorn, as it is unusually well documented in public materials. In the context of productivity claims, the one appeal I do see is for greenfield one-off applications where there currently is no alternative. Meaning, you do still have to write the specification and think about it.
Then call me the unicorn factory because I have reproduced this experience several times :)
Here's another product, with even a business model, that was built in the space of around 30 minutes: https://nacebel.codes
Most of that time was spent on deployment actually, since it's one of our earlier AI-generated works. And purchasing the domain while inferencing.
We build scrapers, linters, databases and APIs where before, even starting the work was unthinkable because it would have been a fool's errand.
Anyway, I hear ya. I haven't categorized or placed you in anything, no; I don't have any interest in labeling people (unless they're in my datasets) -- I'm really just here to share amazement and experience! It's only half rant, half "message in a bottle", sorry if that was unclear.
POV - I was on Tildes when ChatGPT had just come out, showcasing how powerful the language models were, and arguing over the potential of the whole thing. At some point I stopped really arguing on the internet and just ended up focusing on doing meaningful shit with all this tech. We're living through an incredible stepping stone for civilization… we can either get busy doing something with it, or do what the other half of HN does and be a pedant over whether hallucinations should be called confabulations, and whether Real Programmers should never use AI. Too many times in my life did I choose the latter… not this time, heh.
I mean, it is a nice example. I looked at it, and I think it is actually a great illustration of the point I was trying to make about context and framing.
Correct me if I am wrong:
This is exactly the kind of task where AI shines, taking well-structured, well-documented public data and creating a basic CRUD interface. The NACE-BEL system is thoroughly documented, has a rigid structure, and the task essentially boils down to "display this list with search."
It's impressive that it can be built in 30 minutes, absolutely. It would have taken me slightly longer to build that, maybe a few hours to a day. Then again if people are willing to pay up to 72 euro per month I'd think that even spending a day (which is the time investment at most to do it by hand) is already worth it. So I can't say I am in agreement over the fool's errand bit.
This brings me back to my original point: there is a significant difference between "AI enables 1000x productivity on well defined, limited-scope tasks with existing structured data" and the broader productivity claims you often see in this space. That NYT Connections game and this NACE-BEL site are both perfect examples of the former, and they do have value, no denying that! But I think we should be precise about what these wins represent.
And it honestly is great that you felt so empowered by AI; that's not something I want to take away. But to me it is clear that it isn't so much the AI, but more you, that makes it work here. In fact, I feel like these examples mostly represent your own ability to find these already-structured use cases and build on them. As I said, the NACE-BEL website seems simple to the point that, given the prices you list, the economics always would have worked out. So it isn't so much AI that makes it work, but you. I must say, I am not entirely sure if there is added value besides the smoother search operation, but I am also not familiar with NACE-BEL beyond just looking it up now.
Which also brings me back to the entire discourse and perspective about what AI makes and doesn't make possible.
I think that's fair. I also think this is a perfect example of why I think AI impact is also enormously exaggerated across the board. As far as I have been able to ascertain, AI does not give you all that much past a certain project size (let's say 10kloc).
We have a few codebases at work, at different sizes. On the smaller projects AI is magic. On the larger projects the general consensus is that prompting takes longer than typing the code. You let Claude rip and it doesn't know about your utils library, it doesn't know about existing dependencies, it will just pull in whatever, it will send itself into loops trying to solve diamond dependencies it created, it will just stub a function and forget about it.
Unironically, I run a team of 10x devs (our footprint is less than a 10th of our peers in the industry) and we wanted AI to be a multiplier. So far it's basically been reduced to writing unit test skeletons, doing boring data transformations, and occasionally a bit of code gen (make me bitflags for the enum described on this page, etc). I have seen zero evidence in my own immediate peers, or any industry peers working on 100kloc+ codebases (which is not even that big!!!!), of any fantastical real velocity improvements (i.e. features delivered per unit time, not lines of code produced). The whole game is context window management, and it's exhausting.
I’m wondering about the consequences of tiny programs being so much easier to build (and throw away), while large codebases are little affected. Maybe we should be writing more tiny apps? Maybe there is a way to architect things so that more of the functionality is in tiny apps?
So... microservices? Kidding aside, the software industry has been chasing this dream of composable tiny pieces forever. Unix pipes, microservices, serverless, component libraries. The hard part has never been building the small pieces, it's orchestrating them into something coherent. Which brings you back to square one.
Then there is also the fact that you can't reduce the issue to just "make tiny programs". It also highly depends on the sort of apps, in both type and subject. The underlying technology also matters a lot. You see a lot of people having a lot of fun making fairly simple web-based applications and games. As it so happens, these are two subjects the internet is effectively overflowing with information about, so it is no surprise that these models do fairly well here.
Which makes for great demo material for the supposed capabilities of these models. Games and web-based applications are highly visual, which has always been one of the best ways to impress people and get their buy-in for something.
Once you move off the beaten path, you are much more likely to encounter hurdles even with simple, smaller applications. For a lot of backend-related things, you have to be much more careful in your interactions: put much more thought into the prompts and the extra context you provide, and be aware that the output will more often than not contain things that are just not quite right.
Now, we could go ahead and just consolidate everything in the Javascript ecosystem. But besides the fact that people have been trying to do that now for over a decade (joke), it also isn't practical.
One of the risks with LLMs and reliance on them is that it atrophies the ecosystem, or rather knowledge of the wider ecosystem, for those who rely on them to this degree.
This is an age old question in systems architecture. You can go back 50 years and see people debating microservices versus monolithic code, but there are very real drawbacks to writing lots of tiny apps versus one big one, and they're not trivial. It's the whole reason GNU didn't produce a real OS, and all of their core utilities are only widely known because they're used in Linux.
One way to make the problem smaller would be better documentation for internal APIs. Basically, the things we should have been spending time on anyway to be kind to ourselves. In my limited experience this yields a bit of improvement, but I haven't done enough experimentation to have a firm conviction. I've had a really good time with rust + aider when I can point at some docs and say: use the library documented by this page to do xyz. Our own internal code doesn't have great published docs. We skirt around this through code review and shared theory crafting in meetings, which doesn't need to scale because we are still a very small team. AI isn't privy to this folklore; it has to hold the code to know what it does, blowing out the context.
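To make that concrete, here's a minimal sketch of what I mean by a small, documented internal API surface you can hand an agent as read-only context (e.g. via aider's read-only file option) instead of making it hold the whole implementation. It's Python rather than Rust purely to keep the sketch short, and the module and function names are invented:

```python
# Hypothetical sketch: a small internal module whose documented surface is
# what the agent reads, not the implementation behind it.
"""rates.py -- look up and apply customer rate plans.

Public surface (everything else in this package is an implementation detail):
    get_rate(customer_id) -> Decimal   # per-unit price, VAT excluded
    charge(customer_id, units) -> Decimal
"""
from decimal import Decimal

_DEFAULT_RATE = Decimal("0.10")
_RATES: dict[str, Decimal] = {}


def get_rate(customer_id: str) -> Decimal:
    """Return the per-unit rate for a customer, falling back to the default plan."""
    return _RATES.get(customer_id, _DEFAULT_RATE)


def charge(customer_id: str, units: int) -> Decimal:
    """Compute the charge for `units` consumed units; VAT is handled upstream."""
    return get_rate(customer_id) * units
```

The point being: the agent gets the contract and a couple of typed entry points, and the context stays small.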
Yeah, I see this as a good reason to use AI assistance to improve the documentation.
I think another issue is that you look only at the time taken, but productivity is a ratio of output to input. Output includes both quality and quantity of the thing produced, and input includes labor but also capital, energy used, natural resources, etc.
Some of the stuff produced by generative AI has very low utility (probably even negative in many cases) but uses a lot of capital and energy, so productivity can be very low even if it didn't take much time to produce.
I think it's also important to take the macro view on these topics, because at a personal level we have this laziness bias that makes us love anything that reduces the effort we put in.
Sometimes I'm afraid of sounding like a naysayer because I've also often warned people of the risks involved with LLMs: how easy it is to think you get something when you don't, overreliance on them, people developing delusions, and even some issues when people (semi-carelessly) implement them in places like healthcare. Not to mention some of the ethics involved in the data used to train them.
There is a lot of interesting research into applying it to other fields too, such as fluid dynamics. I'm actually looking forward to a conference where I know I'll talk to people who actually understand how these models work and how the techniques can be applied. Because, well
I didn't write the original post but it does apply to me in a way. It's frustrating to see so much time, effort and capital being put into what appears to be a hype machine. Even though, like you, I use it on a daily basis at work for some basic stuff, because for the things you don't strictly need it for, it can still be useful.
I think this is 100% true and I will defend this sentiment.
While I agree that there are positive uses for AI, they are vastly outweighed by my complete and utter lack of faith that the tech companies pushing it will do so in a safe and socially responsible way. Corporate America sees it as a way to improve efficiency, which is code for reducing the need for labor. With every major advancement in worker productivity, the top 1% have claimed the lion's share of the value created by that improved productivity. They won't care if we hit 10%, 20%, or 30% unemployment. This is to say nothing of the potential malicious uses of the technology.
Building stuff faster means nothing if the rest of us aren't allowed to prosper from it.
Like social media, this will be a net negative for humanity. The decline of the middle class will accelerate and it will only get harder for people to get ahead in life. But hey, at least you can have a virtual anime girlfriend.
It’s impossible to argue against a cemented position.
Income inequality has consistently gotten worse and worse since 1980. My position is not set in stone, but I need to see evidence that this trend will actually reverse.
A lot of the "Rich People Money" is entirely make believe, built out of overleveraged nonsensical financial systems.
Elon got investments in order to buy his own company at his own valuation, to justify valuations in a dead acquisition made out of mostly the same shit, and this works because he has /some/ money. It's a modern day ponzi scheme that relies on a mix of financial privacy and charisma to pull off.
Similar levers are available to most people with some side cash saved up, if they know about them. They can be used for good if you know how to do it. I taught a friend of mine how to turn an extra 30k/yr she didn't know what to do with into a 1M investment fund for green tech. She worked in fintech for longer than I've been programming and had her mind blown that this was possible at all.
Income inequality is not a money thing, it's a society thing. It's accessibility to knowledge: Education with extra bells and whistles.
Lowering barriers of education lowers income inequality. This is something AI achieves.
Only if your argument relies on emotional appeal. You fight cement with steel; if you really show the results and show beyond a shadow of a doubt that this is working, then cement will crumble.
Of course, this is all speculative, so there's no steel as of now. If nothing else, I've seen nothing suggesting the modern technosphere that ruined social media won't also ruin any benefits AI can bring. It's the same guard after all. Happy to see evidence to the contrary, though.
I have no desire to fight cement and I'd honestly rather keep both my steel and my sanity instead of trying to move people who wouldn't want to budge anyway. If I'm going to waste my breath convincing someone who's dead-set on their position, I'd rather convince an American that Trump is not as great as they thought, or something - at least that'd achieve some good if I succeed.
But it's not... people are speculating about the future while ignoring the present. AI is helping in clear-cut-good areas such as drug discovery (and yes, that includes new-generation GenAI), it's helping students learn languages, it's helping create more free and open source tooling...
And all those scoffing at, for example, AI replacing consultants, "as if that has real-world value"? Consultant $$$ consume a ton of tax money on public projects. The cost of creation being brought down means money being used for more useful purposes than lining those same pockets. And while I'm sure the efficiency of the conversion won't be perfect, it'll be a hell of a lot better than before.
Correct me if I'm wrong, but the "AI" innovations in drug discovery, protein structures, and mathematical proofs have come from various machine learning techniques. I have not heard of any substantive discoveries coming from an LLM or any of the current generative AIs.
New algorithms in all sorts of comp sci fields are being called "AI," but LLMs and GenAI seem to be exclusively useful for writing small programs in well documented languages, with a sufficient amount of sample code available (ingested without permission, of course).
I doubt this was your intent, but it just rubs me the wrong way when it's implied that LLMs/GenAI have played any role in drug discovery or other interesting recent comp sci breakthroughs.
People are conducting research into the applicability of generative AI in various scientific fields; those applications just aren't widely known because they're extremely niche.
For example, in my field (lattice QCD) we use massive amounts of computational resources to generate "ensembles", which are essentially simulations of hadrons in discretized, finite boxes at non-physical values of the quark masses. And when I say massive, I mean truly mind-boggling numbers -- we're talking computation time of the order of O(1) - O(100) million core-hours, with disk usage measured in the petabytes.
So you can imagine the amount of money that is spent on these projects. Now along come diffusion methods. Some people realized that image generation doesn't look that different from ensemble generation, and that if you could simply generate an ensemble via diffusion, you could drastically reduce your computational requirements compared to conventional techniques.
An ongoing research problem in my field is determining to what extent these diffusion techniques are viable.
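To give a flavour of the analogy, here is a toy sketch only, assuming a plain 2D scalar "field" standing in for a real configuration and a standard DDPM-style linear noise schedule; real lattice work involves far more care about symmetries and exactness:

```python
# Toy illustration of "ensemble member as image": apply a DDPM-style forward
# noising process to a 2D scalar field. A denoising network trained to predict
# eps could then be run in reverse from pure noise to propose new configurations.
import numpy as np

rng = np.random.default_rng(0)

L = 16                                # tiny lattice extent, for illustration only
T = 1000                              # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)    # linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)   # cumulative products used by q(phi_t | phi_0)


def forward_noise(phi0: np.ndarray, t: int):
    """Sample phi_t ~ N(sqrt(alpha_bar_t) * phi0, (1 - alpha_bar_t) * I)."""
    eps = rng.standard_normal(phi0.shape)
    phi_t = np.sqrt(alpha_bar[t]) * phi0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return phi_t, eps                 # eps is the regression target for the denoiser


phi0 = rng.standard_normal((L, L))    # stand-in for one ensemble member
phi_t, target = forward_noise(phi0, t=500)
```

Whether the reverse process actually samples the right distribution, rather than some biased facsimile, is exactly the viability question.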
Oh shit, I came here to ramble about scientific ML and the first reply is the same topic I did my master's on! I have nothing meaningful to add but I couldn't resist acknowledging it...
Do you mean lattice QCD specifically?
So this was… oof, around 15 years ago now, but I was working with various bits of existing QCD simulation code looking at compute efficiency, accuracy, and applicability to the LHC results that were just starting to be generated.
Honestly I’m way out of date with the field now - one of the big things I learned was that I’ve got a much better aptitude for the computational part than the physics part, especially porting stuff to those early CUDA builds, so that’s been the common thread in what I’ve worked on since, but it still made me smile seeing someone talk about such similar stuff out in the wild!
Sounds like the wave of neural net development in the last few years has been an interesting one, from what you were saying above. Big shifts in the work you’re doing, or is it still more a question of waiting to see if it actually does the job you need?
Neat!
The developments aren't really relevant to my research, but given that there are now workshops dedicated specifically to applying machine learning to lattice QCD, I would definitely consider it a thriving subfield. And there are some seemingly useful applications, e.g. training neural nets to reproduce costly correlation functions [1].
Regarding ensemble generation in particular, I think it's still an open question as to whether such a program is actually feasible. Unlike image generation, which has a nearly limitless number of images available for training, we are in much shorter supply of independent ensembles: there are only about O(100) - O(1000) ensembles that have ever been created. And even assuming one could generate an ensemble, we would need to be absolutely sure that we had the bias under control, i.e. that we were actually simulating QCD and not some partial facsimile.
That’s really interesting stuff - thanks for sharing!
I'm going to say what I've found myself saying a lot in the last few months: it's a problem of terminology.
I don't think you're wrong per se (sorry @Adys, but I think that framing was unhelpfully harsh), but I do think the two of you are talking about different things under the same fuzzy definition that people are calling "AI".
I'm not aware of any physical science work specifically using LLM chatbots or agents right now - there might well be some early research out there, but I haven't seen anything significant - but scientific models absolutely are using the same technological advancements and infrastructure as LLMs. Transformers, state space models, diffusion models, hell even just the renewed enthusiasm for development on pytorch - some advancements came from the scientific world and worked their way into LLMs, some came from OpenAI and Anthropic and DeepSeek and NVIDIA and found their way back into scientific work. In both cases the acceleration of development in the last few years has been vast, and genuinely represented an enormous leap forward in capability.
So no, we're not asking ChatGPT for protein folding suggestions, but we are building ChatGPT and AlphaFold as part of the same tech tree. Whether or not that's part of the "AI" world comes down to what you're attaching the word AI to, and people are spending a lot of time talking past each other without realising because they're using different meanings. Think of what a person might mean by "internet" (really the web, really just TikTok and Instagram) vs "internet" (fundamental global infrastructure comprised of a decentralised petabit-scale fiber and satellite mesh capable of linking almost any two devices on the planet in milliseconds).
[Edit] Typos
Didn’t mean for the response to sound harsh, sorry. The post said “correct me if I’m wrong” so I reused that same wording :)
And I agree with you to a great extent that it’s a terminology problem. It’s the same problem I allude to in the OP I think.
Yeah, you're wrong: GenAI is being used actively in drug discovery and molecule discovery.
I get that most people have this idea that ChatGPT came out and now every company that says they “use ai” is simply offering a ChatGPT account to their employees so they can write emails faster. But it’s not the case and the impact is massive.
It helps once you understand that these are not “advanced text autocomplete” but “informed decision engines”.
Here’s a case from Exscientia: https://aws.amazon.com/solutions/case-studies/exscientia-generative-ai/
And here's the biggest of all, Isomorphic Labs; DeepMind people are working on this: https://www.isomorphiclabs.com/
Is an llm going to one shot publish a paper about a new drug it discovered? No. But this isn’t how these things are used.
Drug companies are not being super public about this unfortunately so I don’t have much to offer in terms of details, but being in Belgium I hang out with several of the people working on this — Pfizer in particular is super bullish on genai and they’re actively working on it. You don’t hear about it because it takes a long time to put these processes in place and guarantee + verify the entire chain of reliability.
It's the same in construction… there are massive advances being made there but you won't hear about them until they pass the certifications. And in the meantime, of course, naysayers will claim they'll never pass those certifications, but highly engineering-driven companies wouldn't be spending tens of millions on this R&D if "someone with zero experience in the sector who can obviously see this won't work" was worth listening to.
That seems like another way of saying we’re already doomed? I’m not even entirely sure I disagree about that, to be honest, but if technological advancement is just going to be co-opted in a way that makes life worse, the only real answer is to fix the root problem or to stop advancing. And if we don’t have the power to stop people co-opting that progress for harm, we don’t have the power to stop the progress happening either.
So we either find a way to take back control - at which point the tech can be used for good anyway - or we’re toast.
I do not think these problems are fixable without massive government intervention; updated antitrust laws, worker protections, 4-day work week, tax structures that disincentivize the hoarding of wealth among others.
These are all things the current fascist regime is vehemently opposed to. The same regime working overtime to make sure we don't have fair elections next year.
"We can use new tech to solve these problems without the government's help" is a very '00s Silicon Valley take that I used to believe in but not anymore.
So yeah, doomed. I don’t want to be glib, because I share a lot of the same feelings and I don’t really know if I believe things will get better in our lifetimes or if I’m just deluding myself because I can’t fully deal with the likelihood that it won’t.
But it’s not the tech that’s doing that, it’s all the other problems you’ve identified. I guess I just hold on to advancing science, advancing technology, as something meaningful and exciting for its own sake in a world that’s otherwise going to shit.
And for what it’s worth, I didn’t mean that tech of any kind will do (significant) good in the face of oppressive power structures co-opting it to do harm. I also used to believe that and have since had the idealism burned away. I just meant that since the tech isn’t the problem either way, it’s a nice tool to have available to us if we can take back control of the power structures.
(not babypuncher)
We're not doomed, but those of us sharing this sentiment are the products of an environment that is predicated on learning to ignore our shared responsibilities in order to devote those attentional resources to "being productive". We have to relearn to respect the agonizing pace of collaboration, because this "the rational system will organize the details!" just keeps getting co-opted to valorize the people designing the inputs to that system. No, we will not be saved, we actually need to learn about each other, be less defensive of our own flaws, and willing to adjust our goals based on new information. As individuals, and as groups.
Technology is, will be, and always has been, a method for achieving solutions. Not designing them. Ultimately, even a fully agentic superhuman AGI hooked up to the internet does not define how we will best live our lives. Capital doesn't either. We do, and not accepting this responsibility is how we got here.
I think we're heading towards ruin on our current trajectory, but it's not too late to save ourselves. There are several factors that could change this trajectory.
But as of now, I see no practical silver bullet, no superhero swooping in to save the day. Any solutions reversing this ruin will have a significant blood cost, in some way. It's not going to be a peaceful resolution.
It sounds like you have skin in the game on the side of AI success. That isn't a bad thing, but it is a contextual lens for your understanding of the technology. I think you have accurately assessed a generally dichotomous distribution of opinion on AI. My opinion has shifted from antagonistic towards ambivalent as I've begun to understand the opportunities of agentic workflows*.
But I do not work at a tech company; I work in an upper management position at a small-margin, large manufacturing company subject to high-variance cyclical commodity markets. The name of the game is cost, and I am agnostic towards the toolset. So while productivity gains are critical to business performance, I am also extremely attuned to individual competency.
My team is allowed to use AI but I don't want to hear about it; there is no incentive to use one tool over another. Work is evaluated on the final product. Occasionally an issue comes to my attention and I dig down into it, talk to the person who created the output and I'll hear something like "oh well I used an AI for that." The AI doesn't know our business context, it doesn't know the personalities and relationships in the office. My team does and part of their job is to navigate that (or call in support from me, which is very common) to get our work done.
My apprehension around AI is that this little busy work of drafting emails or writing boilerplate is valuable because it keeps us sharp. Like a muscle, communication requires practice. Writing and debugging code requires practice. I don't care if it takes a little bit longer to have a person do it, because even if you've done it a hundred times, it is still practice. If we develop an agent to do something, that process must have an owner who is as accountable for the result as if a person did the work. Because otherwise we'll end up in a culture where there is no ownership or responsibility, and that is anathema to a sustainable company.
* pending actualization
Hello. I do think you're being a bit extreme in your interpretation here. Bubbles do burst; that doesn't mean the entire industry dies out entirely. The dotcom bubble burst (I don't think there's much debate about that these days) and many companies died. Arguably some of the old guard also waned as new companies rose from the ashes. It didn't "kill the internet", though.
And we've seen smaller bubbles with mobile, cloud, big data, crypto, and NFTs (whose bubble is popping in real time, if it hasn't already popped). No one is saying apps are dead, cloud is dead, data science is dead. How many times do I have to see the same pattern repeat over and over before I can point it out without being dismissed as "provably wrong" by the same crowd who were in those bubbles? A few experienced engineers get gains, and that disproves the fact that multiple billion-dollar grifts were using this to fake progress? That millions of artists are having their work stolen? What am I provably wrong about?
I think the same will happen with AI, and I haven't seen much to counteract that notion. Many companies trying to coast on AI will die, some of the large companies will falter, and maybe even some heavy hitters will fall out of relevance. That doesn't mean AI as a whole dies; it just means that saying "AI AI AI" in your pitch won't get you a billion dollars anymore.
I see it as a good thing; it means all the grifters leave, those who didn't prepare will fall, and then maybe we can have a proper conversation about what AI is actually used for, instead of pretending that 2025 AI can replace all of humanity. But until then, it's just a mess. Not a mess worth touching.
It seems like one of the bigger gaps has to do with education. There seem to be people who know how to use these tools productively, perhaps through trial and error. That's great, but how does that knowledge spread? How can beginners get up to speed?
If I weren't retired, I'd be looking for ways to pair-program with people who know what they're doing. Hopefully we'll see some quality tutorials.
On the big-tech scale, that's another main issue. No one is "being trained", period. They don't want to take the time to train personnel on anything. They don't want personnel to begin with, in the end. No one wants to establish a framework for when to use or not use AI (and if anyone did, they were probably laid off anyway).
You can see this in the job market. New grads are having a brutal time right now. They don't want to train up the next generation. It feels like there's this sentiment of abandoning ship, and they want to pillage as much as possible before hoping to be saved by their robot overlords.
This, by the way, is the #1 problem in AI and tech today - I am dead sure of this. I try to keep juniors in my team but I have literally no idea how to train them anymore. Where do you even start to learn the skills my fellow senior engineers learned over years of experience, when the path is basically dead and littered with bad examples?
Solving this is going to be serious business.
It's tough and I definitely don't have a good answer either. I was thrown onto a project out of college in 2018 and it was pretty much trial by fire. Some bug fixes, and then some months later I was just told to make feature X. My lead kept me on track and kept me from wandering down too many rabbit holes, but there wasn't much "training" outside of school classes (which definitely help, but still don't truly prepare you for legacy code).
We definitely need some proper apprenticeship programs. Ways to ensure they get hands-on experience in an environment where they are focused on learning, not on "providing business value". But no one seems to want to foot the bill anymore. Small businesses can't afford it and big ones simply refuse to (especially in a way where they don't have their trainees sniped, because a promotion from apprentice to junior should indeed pay more).
Maybe some sort of government tax credit can be given for anyone hired under an apprentice role for up to 2 years that can encourage non-large businesses to take the time to train. But that sort of initiative definitely won't happen in this administration.
That's a good point, workflows are still in the process of crystallizing so the way to learn is more informal (i.e. finding someone to show you how they work, trial and error). While that's interesting to the people willing to explore these tools, the skeptics will stay in the "AI tools are useless" camp without a clear path.
Reading up on how to program with AI tools, discussions mostly boil down to different comments saying "I've had success with this model doing architecture, that model doing debugging" using one of a few VSCode-based IDEs. That's not particularly accessible.
Software itself hasn't changed just because an AI writes it faster. Vibe coders might not know how to build an application that's safe, or they build themselves a tool that does what they initially wanted, without any consideration for maintainability or extensibility.
Sure. And in entrepreneurship specifically, I've seen a plethora of startups die because they spent their time caring about code quality and maintainability before at all considering whether they even have an audience. Many engineers skip the validation step.
It's a mixed bag of course, because these are not unimportant things either, and the foundations of an app in particular can cause coding-agent output quality to degrade extremely rapidly.
I bristled at the “execution (not code quality) is everything” line because I hate that it’s true, and I think it encapsulates a lot of the shitty incentives we’re dealing with in late stage capitalism. But it’s absolutely, unequivocally true if your primary goal is maximising commercial success.
And if it's not? What if I want to make enough to keep the business going, the employees happy, and overall contribute to society? Where's the balance on "Break everything" and "make sure everything is tied in a bow"?
If it's not, then you and I find ourselves in a very similar boat - and right now I'm about a third of a career deep into feeling out that balance whilst trying to avoid getting steamrolled by soulless profiteers. I wouldn't say I've truly found it, but I haven't failed yet either.
Fair enough. I'm just under a quarter myself. I do see myself starting to go down this route in ~5-7 years, so I'm trying to prepare and balance all options. My goal isn't even to make money per se, but I also know that the bar for "can survive" is sky high for a Californian.
But it's not just about capitalism. If you're working on something, then unless you're doing it for yourself, or you've validated there is an audience for it, you might well be wasting your time. It's not shitty incentives, it's just the question: who are you doing "this" for?
It's like starting to cook a party-sized banquet meal and well after it's ready, finding out nobody's coming to the event because you didn't announce it anywhere.
Agreed, it’s not just about capitalism, and it’s not all negative, but capitalism (and particularly the blind, 2000s-era capitalism that sees financial return as the absolute only metric of importance) incentivises a lot of the parts that bother me.
You’re talking about the good side: fulfilling a need for people. And I agree, a lot of devs - myself included, sometimes - miss the forest for the trees, focus on tiny technical features because they’re interesting more than they are useful, and generally don’t consider the user as we should.
But commercial incentives often don't favour building a quality product for the user. They favour building a cheap product quickly, with bugs and security flaws and spyware and a horrible, stressful experience for the developers working on it.
Product market fit is good and is important, but the same mechanisms we have to incentivise that are also incentivising a sweatshop-quality race to the bottom for the users and the workers, and this applies to digital products just as much as physical. I’d prefer we could find a balance where craftsmanship in all its forms were appreciated while still encouraging utility.
Who will replace them?
Is it Google?
I am using Claude a lot at work.
As others have already said, it's really good at small standalone programs and tools. I've been able to automate a bunch of small pain points that were previously not worth addressing.
But it's often a slog to use it in a large codebase, and it can get confused easily. I have it document its plans and actions and will constantly use those documents to refresh its context, and even still it's constantly getting things wrong in large refactors. It speeds you up, but also slows you down in a way.
Recently I used it to implement what is essentially Android's MediaRecorder on top of MediaCodec. Rather than spend a day reading all the docs, I started with Claude, and I think I ended up spending more time getting it working than I would have if I had just "raw dogged" it, because I wasn't familiar with some of the internals and assumptions it was making.
Unfortunately, I think because it's so good at one or two shotting small tools you get management writing their own reporting tools and going "well if I can do this in a day and I don't even know how to code why can't you do that in a week".
Yeah, it's definitely a learning curve to get it to behave well in large codebases, but it's possible! I'm actually going to be speaking at a couple of meetups and webinars on that exact subject. Basically: guardrails (in the form of strict type checking and linters), indexed and well-structured documentation, extreme consistency, strong abstraction layers. All things which overall improve the health of a codebase. All things which Claude can generate itself :)
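For a flavour of what I mean by guardrails: a minimal sketch of a post-edit gate that the agent (or a hook around it) runs after every change. The specific tools named here (mypy, ruff, pytest) are just examples; use whatever strict type checker, linter and test runner your stack has.

```python
# Minimal guardrail gate (illustrative tool choices): run after each agent edit
# and reject the change if any check fails.
import subprocess
import sys

CHECKS = [
    ["mypy", "--strict", "."],   # strict type checking
    ["ruff", "check", "."],      # linting
    ["pytest", "-q"],            # test suite
]


def run_guardrails() -> bool:
    """Return True only if every check exits cleanly."""
    for cmd in CHECKS:
        if subprocess.run(cmd).returncode != 0:
            print(f"guardrail failed: {' '.join(cmd)}", file=sys.stderr)
            return False
    return True


if __name__ == "__main__":
    sys.exit(0 if run_guardrails() else 1)
```

The idea is that loud, strict failures give the agent something concrete to correct against, instead of letting broken changes pile up.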
I have some smart folks working for me.
They know nothing about LLMs beyond what most of us know.
What they have achieved blows me away.
I merely told them to solve real problems and solve them with AI first.
Sure wish you could tell that to all the large companies laying off hundreds of thousands while pretending the AI itself is doing the heavy lifting. That's my primary problem.
Same here. I'd love to hear more!
I based my second paragraph here on what my guys have achieved so far using just pure GenAI, no agentic workflows (yet).