27 votes

Contra Ptacek's terrible article on AI

18 comments

  1. [7]
    skybrian

Ptacek's article begins with this throat-clearing:

    “First, we need to get on the same page. If you were trying and failing to use an LLM for code 6 months ago, you’re not doing what most serious LLM-assisted coders are doing.”

    We've just started, and I am going to ask everyone to immediately stop. Is this not suspicious? All experience prior to six months ago is now invalid? Does it not reek of “no, no, you're doing Scrum wrong”? Many people are doing Scrum wrong. The problem is that it is still trash, albeit less trash, even when you do it right.

    It’s a fast-moving field and new tools are being released a lot faster than they are being reviewed. People are trying lots of different things and the results don’t generalize. Why is it unreasonable to say things have changed?

    Quite frankly, it’s too much to keep up with. I think it’s totally fine to not know what’s going on and let other people be the guinea pigs. We can wait for some kind of consensus about which approaches work well. One person saying they got a good experience isn’t enough to be much of a signal.

    So skepticism is warranted, but I think we shouldn’t be confident about broad conclusions.

    (Meanwhile, these rude, overconfident posts may get social media attention but they’re not going to settle anything.)

    16 votes
    1. [6]
      smores

      This is addressed in the very next sentence, no?

It is, of course, entirely possible that the advances in a rapidly developing field have been so extreme that it turns out that skepticism was correct six months ago, but is now incorrect.

      But then why did people sound exactly the same six months ago? Where is the little voice in your head that should be self-suspicious? It has been weeks and months and years of people breathlessly extolling the virtues of these new workflows. Were those people nuts six months ago?

      The premise of the original post is "it is unreasonable to be skeptical." The premise of this post is "there are still good reasons to be skeptical." I feel like in context, that is pretty clear and appropriately hedged.

      21 votes
      1. [5]
        skybrian

        That quote starts with a good hedge, but then immediately implies that the people saying this time it’s different must be credulous idiots. And I think that goes beyond saying it’s okay to be skeptical?

Maybe it’s because lots of people are trying out new AI tools all the time, they are improving, and sometimes people have positive experiences that they share. Also, some people have negative experiences. But either way, they rarely write it up in much detail, so we’re left with unverifiable anecdotes about vibes.

        After years of hype, more hype isn’t much of a signal either way. It’s like yet another testimonial by someone who lost weight. Maybe it’s just a fad, but then there are things like Ozempic, and until we get good studies it’s going to be hard to tell the difference.

        6 votes
        1. [4]
          smores

Maybe I'm reading this post with my own bias and that's coloring how I'm interpreting it, but that doesn't line up with what I'm actually seeing in the article. That section appears to me to be defending the folks that Ptacek is calling “nuts”:

          It has been weeks and months and years of people breathlessly extolling the virtues of these new workflows. Were those people nuts six months ago? Are they not nuts now simply because an overhyped product they loved is less overhyped now?

It seems like maybe you're interpreting this as Ludic stating that it's nuts to think that things are improving, and I'm interpreting it as Ludic stating that it's not nuts to think that those improvements are still moderate and not necessarily meaningful in many coding situations.

          Ludic does seem to suggest that some people here are being credulous idiots, but that seems quite limited to:

1. Tech execs that are making outrageous moves like firing a huge number of their engineers, refusing to approve projects that don't utilize AI, or mandating LLM use by software developers.
2. Developers that insist that using an LLM-assisted workflow is “rocket fuel” and is putting them leagues ahead of the plebeians still writing code without AI assistance.

          Ptacek (the subject of this article) isn't saying “it's different this time”, he's saying “you're dumb and wrong for not deeply incorporating this specific toolkit into your workflow”. That's the person that Ludic is directing this ire at, not just every developer that is finding utility in LLMs.

          And like… I dunno, that seems fine and appropriate to me. He cites like half a dozen reasonable takes on ways that LLMs can support software development, calls them cool, quotes experienced developers that use LLMs to improve their productivity. It just seems like the article, overall, takes a measured, reasonable stance on the current value of LLMs in software development.

          17 votes
          1. [3]
            skybrian

            Hmm, maybe it's a reasonable take disguised as an unreasonable one because it's expressed rather rudely. (And Ptacek's article was rather rude too, in spots.)

            4 votes
            1. [2]
              whbboyd

              I mean, Nik is the guy who wrote the infamous piledriver rant (almost exactly a year ago, interestingly), so it's sort of his schtick.

              8 votes
              1. skybrian

                Okay, less of that, please.

                5 votes
  2. DynamoSunshirt

    I read Ptacek's article a few days ago, and it really bothered me. This article is a (well-written) rebuttal that explores some of the fallacies in the original article. If you're feeling bad about 'not taking full advantage of genAI' these days (even if you deep down don't want or need to, the societal pressure is insane in some fields right now), this might make you feel better. And give you a chuckle.

    15 votes
  3. SloMoMonday
    • Exemplary

Some important context: I've not written production code in 6 years, and what I did work on for the preceding 10 years was very niche programs and then industrial software. And I am now having very little luck setting up coding agents to test for myself, so it's an incomplete perspective there.

Ptacek's original essay read like a contractor's rationale for adopting LLM tools. Because as a contract dev, all you're doing is closing your tickets: feature complete, tests are green, approved, push to main, and jump to the next one. That's the type of person who will probably find the most success in the current landscape, because that feels like what the most lucrative employers want: their devs to all be gig workers, delivering the best feature/cost ratio for the shortest amount of time.

It was comical when Musk was grading Twitter employees by how many lines of code they pushed or their criticality to core services while chainsawing everyone else. But that's every employer today: leadership so far removed from ops that they see LLMs as an opportunity to bring on the lowest-cost replacements while maintaining their system. While I can't fully understand how, the steelman that Ptacek and other supporters put forward is (and I'm going to over-quote to maintain context):

    LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious. LLMs drastically reduce the number of things you’ll ever need to Google. They look things up themselves. Most importantly, they don’t get tired; they’re immune to inertia.

    Think of anything you wanted to build but didn’t. You tried to home in on some first steps. If you’d been in the limerent phase of a new programming language, you’d have started writing. But you weren’t, so you put it off, for a day, a year, or your whole career.

    I can feel my blood pressure rising thinking of all the bookkeeping and Googling and dependency drama of a new project. An LLM can be instructed to just figure all that shit out. Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better. That dopamine hit is why I code.

    There’s a downside. Sometimes, gnarly stuff needs doing. But you don’t wanna do it. So you refactor unit tests, soothing yourself with the lie that you’re doing real work. But an LLM can be told to go refactor all your unit tests. An agent can occupy itself for hours putzing with your tests in a VM and come back later with a PR. If you listen to me, you’ll know that. You’ll feel worse yak-shaving. You’ll end up doing… real work

All that's not to say the response essay is wrong or ill-informed. It raises the type of strategic considerations that should be front and center for any executive looking at any promising new technology. But we live in loopy land, where new grads are concerned with stability and performance while leadership just wants to get the job done so they can clock out early. The fact that this tech is being paraded as general AI and peddled to critical systems just reeks. It is a tool that is designed to tick the right boxes to make you feel confident in shipping a bad product.

A "feature" of the system is that it will pass tests that it refactored itself, in a system prone to hallucinations, managed by an "agent" subject to hallucinations, and administered by developers who had no hand in the product beyond vague, imprecise prompts. The same devs that would be subject to reduced headcount but increased workload because of this incredible tool, on teams with no institutional knowledge beyond entry level, because that same LLM justifies the revolving door of employees. That isn't a red flag. That's the sky turning red because it's raining blood. It's the growing-domino meme with the largest one set to crush an orphanage. And when (not if) something truly catastrophic happens: "How could this have happened? All our tests were green! Thank god this system wasn't designed by flawed and expensive humans who would be culpable for negligence, and so would have run it through multiple layers of testing and had it vetted by the end user to ensure acceptable quality; otherwise this tragedy would have been so much worse."

It's the same sort of discomfort I had when Agile was pushed as the ultimate form of software development, and resulted in today's hot mess of "barely good enough" software that only seems to get worse. Call it an old-fashioned take, but I was on projects where we would fight to be on the crisis desk because it was easy money. We shipped with confidence, and the only real threat was integration errors or very situational bugs. And yes, it was expensive and drawn out and involved a lot of very specialized skills and experience to deliver properly. You got out what you put in, and I'm forever grateful that our clients and employer were happy (i.e. forced) to put in the best they could.

So if cheap good vibes are all the assurance FAANG and other major companies need, then they can get what they pay for. If any grad or developer wants to develop those skills to grab that bag (no shame or judgement if you do, it's tough out there), go for it. You don't have to agree with the sentiment, but Ptacek's essay is what employers want to hear you parrot. It will likely boost your prospects, or buy you more time in your current role, in places that are all in on AI.

Just remember that hyper-specializing in one tool ties you to it. And here it's not just the technique or process; there is a significant risk in relying on a specific toolset. Just ask any creative what they think of Adobe. I'm not sure what the difference is between models and agents, but it seems like the only feasible way forward is through big web services. When this AI industry inevitably consolidates to a few big players and the cost of industry-standard tools is prohibitive for individuals or SMEs, you are going to be employer-dependent. So be sure to have enough in your back pocket to sustain yourself, even outside of the tech industry.

And also, Ptacek's essay harps on about not caring for the future and operating in the now. That is objectively bad advice. The writing is on the wall, and you'd do well to prepare for the worst today, next week, and in ten years. AI might be the biggest trend we've ever seen, but it helps to consider the other "biggest trends of all time". Look into how massive cloud migration projects or corporate early adopters turned out, and not just the successes. The common moral of the story: sales spews a lot of shit, and someone has to clean it up. Hype is sales, and there is a lot of AI hype.

    9 votes
  4. [2]
    chundissimo

    Did not expect a hilarious (and well-deserved) jab at Wind and Truth in this article.

    I’m in the belly of the AI beast geographically and I often feel like I’m losing my mind talking to AI sycophants; this post was pretty cathartic for me.

    7 votes
    1. Aerrol

      Lmao I actually didn't mind the dialogue in Wind and Truth almost at all. It was just paced really badly.

      1 vote
  5. V17

    Honestly the whole discourse is just getting tiresome. I think that Ptacek is closer to the truth than this guy and I find his writing to be slightly less annoying, but, like, whatever. Meh.

I do find it funny that Ptacek is, for some not exactly explained reason, giving special treatment to artists. This response describes that accurately, but then shows that its author has no idea about the state-of-the-art models for image generation either, and on top of that makes the usual mistake of “since people with no taste also use image generation, that means the whole tech is bad”.

As someone who makes wide use of both ChatGPT for coding and various image generation models, for fun and for some commercial projects here and there, I've gradually shifted towards an attitude of “I get immense help from AI assistance, and the more people hate it and refuse it and therefore don't get the same increase in productivity, the better for me”. So please keep hating; I promise I won't try to discourage you.

If you do want to try assisted coding: I haven't used agents yet, but reasoning models that can also just google stuff (currently o3) have been immensely useful for me. I'm a junior, on a much lower skill level than the author, so I have to be careful not to use them in a way that would stop me from learning and improving my skills. So although I do sometimes let them do the boring stuff, the prime use for me is having them help me understand stuff I don't: explaining a math paper in an area I know nothing about, for example; doing code review to catch the most obvious mistakes and bad habits; getting a customized introduction to a new technology, stuff like that. Immensely useful.

    7 votes
  6. Aerrol

While some of the acerbic tone moves past humour to eye-roll for me, I agree with most of it, especially the bit about how no one talks about how LLMs can unexpectedly be a MASSIVE time sink. I have posted before about how I think they're going to replace junior lawyers and students, but I have also found myself wasting hours when my LLM went wonky and I had no friggin' idea why, and therefore got obsessed with figuring it out or at least getting a better result.

I really think more needs to be said, especially in pro-AI-adoption presentations, about how often this happens and when you should just stop. (No, sorry, the best we can do is another all-caps post about how AI will make you a superhuman for only $24.99/month.)

    6 votes
  7. TonesTones

    Generative artificial intelligence has really transformed into a political topic, in a way that lots of other technical discussion hasn’t.

    There’s a reason politics (at least in the American discourse) gets siloed away without much productive back-and-forth conversation. When people have strong emotional attachment to their points of view, discussion becomes really exhausting.

    I’m a little surprised that I didn’t notice this exhaustion until this piece. I suspect it’s because of the introduction, where the author belabors their feeling that another AI-focused blog post is necessary. It’s clear they are writing for emotional catharsis more than anything else, and that makes the piece less pleasant to read.

    I don’t totally avoid political discourse, but I tend to keep reservations about saying what I actually think just because disagreements are so hostile. Seems like artificial intelligence might enter that category for me. That disappoints me.

    6 votes
  8. [3]
    krellor

    Humorously, I feel that the truth is somewhere between the two perspectives of the articles in question. Can we get an LLM to give us an averaged perspective?

When it comes to code, I do tend to agree that the proof is in the pudding one way or another, and as the dev you need to taste the pudding. If you can get an LLM-assisted toolkit to make code that is of high enough quality to review and commit, then great. If you can't, for reasons related to environment, language, or problem domain, then that is a great reason not to use them. I do tend to think that most devs are bad at reading code, at least based on years of seeing knee-jerk reactions while running code reviews. And if you are bad at reading code, then LLM tools will always feel clumsy.

Honestly, assessing the value of LLM-generated code seems like a fairly simple problem, given that you can empirically measure the quality of the output via tests and reviews. In contrast, in the healthcare and research space where I'm currently situated, we're trying to navigate the integration of generative AI into FDA-regulated devices, which feels... murkier. When trained models are ingesting data straight from devices implanted in wounds and spitting out recommendations to the clinician, it makes the fighting over whether LLM code is good or not seem a little overwrought.

    4 votes
    1. [2]
      Chiasmic

Can you expand more on what you think about the healthcare space and why it felt murkier/overwrought?

      1. krellor

        Happy to!

The overwrought comment was tongue-in-cheek and directed at the authors of the articles, because really it boils down to whether a tool works for your application or not. In that sense, it isn't that different to evaluate compared to many other technologies that have changed how we work; it just has the potential to run a little bit further in the harness. But you can see and test its output, and make a fairly easy decision about whether it is good code or not.

In the medical space, by contrast, integrating generative AI into medical devices is murkier because the clinician can't see and assess the output of the model in quite the same way. One example I'm working with is a cloud-connected device that is surgically attached to serious wounds and can perform its own assays, paired with micro blood draws that are fed into the model in the cloud, and it generates recommendations regarding wound care using dozens of biomarkers. It might recommend to the clinician when a certain surgical intervention is required, etc. But the clinician can't really see into the model. Only some models can have their encodings extracted in a way that provides any of the "why" to the clinician, and unlike a chat LLM, you can't just ask it to elaborate; even if you could, it's just one more thing you would ask the clinician to vet.

        So you get these devices making specific medical recommendations including surgical interventions, and you need to make sure the clinician isn't too skeptical or too accepting, that the model satisfies FDA requirements, that a randomized controlled trial shows benefit while accounting for all sorts of human elements, etc. Compared to determining if an LLM made good code, it's a harder, murkier thing to release into the wild.

So I get a chuckle out of watching devs sling mud about it, when I feel like I'm over in this completely different world trying to keep well-intentioned but oftentimes naive medical researchers from killing people while navigating some lightly trodden paths for getting regulatory approvals in this space. It doesn't help that I've been named the AI expert in this space for certain federal agencies and IRBs, so I get a lot of hard situations sent my way, with my answers having the potential to cause harm.

        6 votes
  9. hobbes64

Well, there sure is a lot of hyperbole going on with AI. I think a lot of it has to do with our cultural expectations about what robots and artificial intelligence can do. The more you've been exposed to sci-fi, the more you think about robot servants in an Asimov book, or Captain Kirk telling a puzzle to a robot that leaves it confused and smoking, or The Terminator killing us all, or some cute Star Wars droids that seem to have souls.
I'm sure that the people selling AI are taking advantage of this confusion, and of our human response of projecting our own personalities and fears onto anything with even a vague ability to answer a question, ever since ELIZA in 1964.
I haven't used ChatGPT a lot, but I've used Copilot quite a bit with programming. It's kind of fun and kind of frustrating, and it won't be replacing me anytime soon. It makes me more productive sometimes, and wastes my time sometimes. It's usually a bit better than Stack Overflow, but sometimes it's way less useful. My company seems mostly responsible about it. They are paying for some licenses and inviting people to use it. That seems perfect. But also there are a few high-profile projects that are using it without a clear goal. That's annoying and, I think, dangerous.

    My main fear is the disruption caused by the people in charge who don't understand it at all.

    2 votes