21 votes

Is GenAI’s impact on productivity overblown?

14 comments

  1. [5]
    chundissimo
    Link

    Bias warning: I’m very skeptical of today’s approach to LLMs as a pathway to AI.

    I think AI will have a massive impact on productivity (and more broadly the world), but it’s been overhyped prematurely due to interesting (but not particularly useful) leaps made by transformer-based models and a glut of VC funds. It feels like a lot of the more business-oriented folks have been persuaded by its impressive outputs without fully internalizing its massive downsides (chiefly its lack of factual reasoning abilities, data privacy concerns, copyright concerns, etc.).

    It’s neat to see examples where people have improved their workflow with LLMs, but they’re often pretty contrived, and I personally would never trust LLMs as part of my workflow even if my company allowed it.

    As such, we suggest that organizations need to take a nuanced, data-driven approach to adopting LLM

    Color me skeptical, but I’ve seen most enterprise “data-driven approaches” fall victim to Goodhart’s law. I’m not saying to fly blind, but numbers can be twisted, and even when they’re not, they often fail to capture intangibles, e.g. the motivation the article mentions.

    22 votes
    1. [4]
      VoidSage
      Link Parent

      I agree - however, I do use GitHub Copilot for coding regularly, and I find it saves me a fairly significant amount of time when learning about code I'm not familiar with or on the trivial stuff that would take 5 or 10 minutes a couple of times a day to do by hand.

      Additionally, I find that LLM-based suggestions when I'm typing something that isn't code (e.g. a professional email, a summary of something, etc.) are extremely helpful and save me some time.

      --

      I think that LLMs are an amazing advancement for specific use cases, but I agree with your skepticism. They are not the path to general AI, but they may be a piece of the puzzle.

      Specific use cases I was thinking of: extracting and summarizing useful data from a large amount of text, generating text-based suggestions, and (maybe?) translation.

      4 votes
      1. BashCrandiboot
        Link Parent

        I don't know shit about code and I used ChatGPT to write a shit load of functional HTML for my website. It helped me make things from scratch that I would have had to otherwise purchase a plugin for or pay someone else to do.

        Sure, I had to be knowledgeable enough to know what to ask for, but it did all the work for me.

        7 votes
      2. [2]
        ComicSans72
        Link Parent

        I've found the "email" use case the more useless of those two. "Use the LLM to generate the content, then throw it away and write something right" has been my usual result. The more recent one-sentence autocomplete has been better.

        3 votes
        1. VoidSage
          Link Parent

          Yeah, my workflow has historically been to write the email and then agonize over phrasing, rewrite it some, etc. Now I just write the content I want, feed it to an LLM with a prompt about the context of the email, and then touch it up after to make sure it's still using the tone I originally intended.
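
          For anyone curious, the same loop scripted against an LLM API looks roughly like this (the model name, prompts, and draft are placeholders, not my actual setup):

            # Rough sketch: rewrite rough notes with a context prompt, then edit by hand.
            from openai import OpenAI

            client = OpenAI()  # reads OPENAI_API_KEY from the environment

            draft = "Can't make Thursday's sync. Push to Friday morning? Agenda unchanged."
            context = "Short reply to my manager; polite but informal, under 80 words."

            response = client.chat.completions.create(
                model="gpt-4o-mini",  # placeholder model choice
                messages=[
                    {"role": "system",
                     "content": f"Rewrite the user's notes as an email. Context: {context}"},
                    {"role": "user", "content": draft},
                ],
            )

            print(response.choices[0].message.content)  # then fix the tone by hand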

          2 votes
  2. [5]
    ButteredToast
    (edited)
    Link

    In my experience, ChatGPT has been somewhat helpful in quickly surfacing documentation for use in software development, but is unhelpful often enough (offering generic advice or hallucinating APIs) that it’s only a moderate productivity improvement and certainly nothing like a step change.

    I don’t trust LLMs for IDE autocomplete because it makes it too easy for basic errors that pass a quick sniff test and compile OK to slip into the codebase, so I don’t use them there.
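
    The kind of thing I mean is a completion that reads fine at a glance and runs without complaint but is quietly wrong, like this made-up example:

      # Made-up autocomplete-style suggestion: parses, runs, and passes a quick
      # glance, but the off-by-one in range() silently drops the last element.
      def total(values: list[float]) -> float:
          result = 0.0
          for i in range(len(values) - 1):  # should be range(len(values))
              result += values[i]
          return result

      print(total([1.0, 2.0, 3.0]))  # prints 3.0, not the expected 6.0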

    Ignoring the ethical concerns associated with them, the quality produced by image generator models isn’t at a point that I feel is passable for professional usage. Too many “papercut” sorts of errors resulting from the inescapable truth that these models actually don’t understand anything about the images they’re generating.

    11 votes
    1. [4]
      kallisti
      Link Parent

      It’s kinda like having a rubber ducky that talks back to you, mostly. A lot of the time it’s only helpful to me because it suggests something so wildly stupid that in my resulting anger I end up working the problem without realising it.

      8 votes
      1. [3]
        ButteredToast
        Link Parent

        The only time I’ve experienced it consistently excelling beyond that is when asking it to come up with a CLI command for a task (e.g. with ffmpeg), which upgrades it slightly from a duck that talks back to a manpage that can answer questions. That’s great and legitimately useful, but still not exactly a panacea.
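
        A made-up example of the kind of one-liner I mean, wrapped in a Python call only so the snippet stands on its own (the ffmpeg flags are standard documented ones):

          # Ask: "give me an ffmpeg command that pulls the audio out of a screen
          # recording as an mp3", then run roughly what it hands back from Python.
          import subprocess

          subprocess.run(
              ["ffmpeg", "-i", "input.mp4",        # hypothetical input file
               "-vn",                              # drop the video stream
               "-c:a", "libmp3lame", "-q:a", "2",  # encode the audio as VBR mp3
               "output.mp3"],
              check=True,
          )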

        2 votes
        1. [2]
          Reapy
          Link Parent

          This is the best use case I've found for ChatGPT: something I'm familiar enough with to sanity-check but am not currently loaded up on. A good example for me is PowerShell scripts; I just never learned them well, but they are occasionally useful, and ChatGPT can get me to a starting point for one-off things like that.

          I am coming to think of it as the equivalent of doing 5 or 6 Google searches and parsing and combining that information for me. The downside is that I miss out on the knowledge gained from spending the time seeing and working with the results to combine the information on my own, which is usually more important.

          But there is no doubt that it can be really interesting to, say, build a regex or short shell script via natural language prompts.
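
          For instance, the kind of one-off I mean, from a plain-English ask like "match ISO dates such as 2024-03-07 in a log line" (made-up example):

            # Made-up one-off: find ISO-style dates (YYYY-MM-DD) in a line of log text.
            import re

            ISO_DATE = re.compile(r"\b(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])\b")

            line = "job finished 2024-03-07 22:15 with 0 errors"
            match = ISO_DATE.search(line)
            if match:
                print(match.groups())  # ('2024', '03', '07')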

          1. creesch
            Link Parent

            I use it similarly, to replace a lot of the annoyances of Google searches on problems I encounter. To be honest, at this point I am also not worried about losing the gain you mention. Often enough, working through the results just means being annoyed by wrong or irrelevant information rather than being put on the right path.

            What's more, instead of a stale Stack Exchange answer, ChatGPT allows me to follow up and clarify things.

            It also has helped me a lot in deciphering spaghetti code I inherited, as it is reasonably good at picking apart code blocks.

            For me it has removed the step of manually teasing out the individual bits of functionality from such code, allowing me to go straight to a better understanding of what is happening.

            I should say that this is only possible because I am leaning on previous experience. So the interactive rubber ducky comparison made earlier is pretty good there.

  3. Flashfall
    Link

    As someone who's been doing some research into LLMs for work (not developing them, just gathering information on capabilities and shortcomings), I find this article corroborates many of my own findings. Impressive as they are, LLMs are still only glorified search engines at the moment. You might be able to get your answers more quickly and in a nicer format, but in the end the answers you're getting are still limited to whatever data the model was trained on. Any flaws or bias in the data are going to be reflected in the results it generates, and trying to correct that with additional layers of data or LLM processing has mixed results at best.

    The quality and correctness of outputs from generative tasks (e.g. drafting emails or writing code), as opposed to basic Q&A tasks, is also inconsistent and poor, and difficult to accurately benchmark at scale. Handling menial tasks that are comprehensively covered by the model's training data works well enough, but again, that just makes the model a glorified search engine. Anything more detailed or extensive, such as field-specific queries (medical, legal, etc.), analyzing or summarizing documents that exceed the model's context window (books, 100+ page documents), or generating code bigger than a handful of functions, is not reliable yet and probably won't be for a while. Those are exactly the tasks that would save significant amounts of time and really boost productivity, and when they do become feasible, they'll likely be prohibitively expensive to run at scale due to the enormous amount of computation required.

    Don't get me wrong, I still get a lot of value out of Copilot at work for finding information quickly and generating quick bits of code when I don't want to subject myself to trawling the first few pages of Google, but LLMs have a lot of growing to do before they actually start replacing jobs or justifying the crazy prices some companies are paying to use them.

    5 votes
  4. Handshape
    Link

    I'm up to my armpits in these things in my professional life. They're at the very top of their hype cycle right now, and it's the biggest such cycle I've seen in my long career.

    Regardless of their actual capabilities and utility, a very steep run down into the "trough of disillusionment" is expected soon.

    4 votes
  5. arqalite
    (edited)
    Link

    Anecdotally, I've found that ChatGPT (and all the other services based on OpenAI's GPTs) keeps getting progressively worse.

    A few months back I could get a good result by just talking to it in everyday language, but nowadays I have to pad the prompt with instructions, such that "Write a Python function that takes X and outputs Y." becomes "Write a Python function that takes parameters <x> as type A, <y> as type B and outputs Z as type C. Provide the full function code, and prefer standard library functions over third-party modules." Even then it sometimes just refuses to return any code, hallucinates modules and packages that never existed, or gives you code that doesn't do the task. It got so infuriating that now I just write and research everything myself, and I'm back on StackOverflow (which I had hoped I'd never have to visit again).
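
    To make the contrast concrete, the longer prompt style is fishing for something fully specified like the sketch below (the task here is an invented example, standard library only):

      # Invented example of the kind of fully specified, stdlib-only function
      # the longer prompt asks for: typed inputs, typed output, no third-party modules.
      from collections import Counter

      def top_words(text: str, n: int) -> list[tuple[str, int]]:
          """Return the n most common lowercase words in text with their counts."""
          words = [w.strip(".,!?;:").lower() for w in text.split()]
          return Counter(w for w in words if w).most_common(n)

      print(top_words("the cat sat on the mat, the cat slept", 2))
      # [('the', 3), ('cat', 2)]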

    It has a strong preference for just giving you the steps to accomplish a task instead of accomplishing the task for you, even for the most mundane stuff.

    If you had asked me when ChatGPT was released whether LLMs would change productivity forever, I would have given a strong yes. Now, I'm not so sure.

    I've used Gemini Pro at work and it seems usable, but I dunno for how long.

    EDIT: It's still a banger for recipes though!

    2 votes