43 votes

MIT report: 95% of generative AI pilots at companies are failing

20 comments

  1. [6]
    Handshape
    Link

    I'm up to my armpits in this space; the failure rates are almost always tied to IM/IT debt, and the proponents are being sold magic beans by their vendors.

    "Wait... If I buy your product, I don't have to fix the gaps in my historical data?"

    Contracts get booked, salesdroids get paid, implementations fail. Wash, rinse, repeat.

    31 votes
    1. [5]
      raze2012
      Link Parent

      The money seems to be lopsided, too:

      The data also reveals a misalignment in resource allocation. More than half of generative AI budgets are devoted to sales and marketing tools, yet MIT found the biggest ROI in back-office automation—eliminating business process outsourcing, cutting external agency costs, and streamlining operations.

      They're trying to use it to push more slop instead of cutting down on expenses and increasing worker productivity. I'd love to think this means businesses have found the line where consumers will tolerate shoddy workmanship, but it may simply be that people and businesses are trying to buy less in this economy.

      19 votes
      1. [4]
        archevel
        Link Parent

        I think it is common for companies to prioritize increasing revenue over decreasing costs. Decreasing costs can be deferred until it becomes necessary; bringing in new revenue is seen as likelier to have a larger impact on the business. As a real example, at a client I had a system running in AWS using SQL Server. That database was the main cost of the entire tech stack. We did a PoC for migrating it to PostgreSQL, which would have reduced the cost by 90% IIRC. It would realistically have taken a few weeks to prep, test, deploy, and then monitor for hiccups. That cost reduction would have paid for itself in maybe a quarter. It still hasn't been done. Priority is given to things that are perceived to increase potential revenue, i.e. features.
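
        To put rough numbers on the payback (these are hypothetical placeholders, not the client's actual figures; only the ~90% reduction comes from the PoC):

        ```python
        # Back-of-the-envelope payback estimate for the SQL Server -> PostgreSQL
        # migration described above. Every figure except the ~90% savings rate
        # is a made-up placeholder.
        db_cost_per_month = 10_000      # assumed monthly SQL Server spend (license + instance)
        savings_rate = 0.90             # ~90% cost reduction seen in the PoC
        migration_weeks = 3             # "a few weeks" to prep, test, deploy, monitor
        engineer_cost_per_week = 4_000  # assumed fully loaded cost of the work

        migration_cost = migration_weeks * engineer_cost_per_week  # 12,000
        monthly_savings = db_cost_per_month * savings_rate         # 9,000 per month

        payback_months = migration_cost / monthly_savings
        print(f"payback in ~{payback_months:.1f} months")  # ~1.3 months, well inside a quarter
        ```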

        6 votes
        1. [2]
          raze2012
          Link Parent

          Well, I work in games, so I do understand the mentality. At some point it's a lot harder to optimize a game than it is to add new features, and those new features will almost always bring in more eyeballs than saying "we got the game running at 60fps on a toaster!". Companies that can sell the latter have already succeeded at the former: they have an engaged audience that cares about the process.

          I'd expect businessmen of all people to be that "engaged audience who cares about the process" here, though. I guess, having just left a period of explosive growth, they feel it's better to just keep hyping things up. That's part of why I feel AI is in a bubble; it wants to build hype at a time when everyone else is hunkering down.

          1 vote
          1. Bwerf
            Link Parent

            I agree. I see that mentality a lot, and it's great until you're suddenly crashing with out-of-memory errors because the console doesn't have enough memory, and you have to start digging for things to optimize rather than having planned for it and done it right from the beginning. That's just one example, of course; other things, like bad framerate or instability, can add up to some threshold as well. I'm not saying everything can be planned for, but where I work there's definitely a tendency towards "don't fix it until it's a problem the players will notice".

            1 vote
        2. bme
          (edited )
          Link Parent

          It might be common, but it's crazy. I guess if you are publicly traded you are, to a degree, at the whims of investors who care more about top-line growth than bottom-line efficiency. But for privately held businesses, where it makes no difference where the money comes from, bottom-line improvements are massive. Any effort spent here means that when rough times arrive you are prepared to weather them, because you are lean and mean. You control how you deliver services; you can't control the addressable market for them or the broader economic context in which that market exists. Drives me crazy. It has been my first-hand experience that being able to stay in the game longer because you are the leanest is worth far more than analysts and professional management seem to price it at.

          1 vote
  2. [7]
    Jordan117
    Link

    As wild as the pro-AI hype is, there's an equally committed skeptic side that can be just as misleading, and this headline is a great example of it. Read the source article, or better yet the MIT study it's based on, and the sentiment is much different -- basically, there's widespread use of general AI tools at the individual level, and they've significantly boosted productivity and in some cases saved millions in costs, but enterprise-level pilots that impose some sort of integration from the top down are brittle, fail often, and rarely translate into measurable savings. Which makes sense, given that ChatGPT itself is less than three years old and most CEOs are casting about for flashy customer-facing implementations without fully understanding how to get the most out of the technology internally.

    IMHO, the more troubling thing is that the most effective/flexible approach (personal use of general AI tools in the workplace) requires handing sensitive internal material to a third party.

    25 votes
    1. raze2012
      Link Parent

      the sentiment is much different -- basically, there's widespread use of general AI tools at the individual level, and they've significantly boosted productivity and in some cases saved millions in costs

      I got that sentiment from the article with:

      "Some large companies’ pilots and younger startups are really excelling with generative AI,” Challapally said. Startups led by 19- or 20-year-olds, for example, “have seen revenues jump from zero to $20 million in a year,”

      That's impressive for a small, lean team (presumably). For even a medium-sized company, though, that's not really going to move the needle much. Millions could be saved in billion-dollar corporations just from layoffs, no innovation needed.

      12 votes
    2. [5]
      Lia
      Link Parent

      the more troubling thing is that the most effective/flexible approach (personal use of general AI tools in the workplace) requires handing sensitive internal material to a third party.

      I've been surprised to see almost no conversation about this online (or IRL, for that matter). Are businesses really so trusting that they don't mind disclosing their internal information to AI tools, and therefore to the AI companies too? No matter what the TOS says, if there's no way for me to verify that secret information isn't being gathered, stored, and potentially later exploited, I'm going to assume that's exactly what's happening. Yes, there may be lawsuits when the victim eventually finds out (assuming they ever do). But by then a lot of money has potentially already been made from the exploitation; lawsuits can take years, and financial repercussions are often small compared to the profit made.

      Coming up with truly valuable innovation is arguably the hardest part of business success. That's why products and services tend to get enshittified rather than new ones being developed that would actually make life better. If I were an AI company with poor moral standards, I'd absolutely exploit the exposure I can get to all sorts of businesses doing their best to innovate. I'd feel like a kid in a candy store with an unlimited allowance.

      8 votes
      1. [2]
        skybrian
        Link Parent

        Businesses will also do things like outsource medical transcription to contractors in another country. They will use Workday to store private HR information about all their employees. Putting private information into databases run by cloud providers in datacenters is popular. So I don't see why outsourcing to an AI company is any different.

        3 votes
        1. Lia
          (edited )
          Link Parent

          I was referring to small innovative businesses and startups rather than bloated giants. Your examples don't directly expose the business's competitive advantage and thus are not a direct threat to the company's existence, unsavoury as they are. In many cases big businesses don't even have a clear, singular innovation-based advantage to begin with as they survive/thrive thanks to market share or similar factors.

          I spoke with someone on reddit who runs a small innovative startup, and I got the impression that he fed ChatGPT Pro all their core business materials in hopes of developing it faster and/or finding new ideas to explore. He was pretty happy with the results. I asked whether he was at all concerned that the material is most likely being used for AI training and will thus eventually leak to other users. He didn't seem to have considered that before, but he also didn't seem to think it's a big deal. We settled on "maximising today's revenue at tomorrow's expense", which he seemed to think was an acceptable tradeoff.

          Disclaimer: I have no way to verify that this is a real person. I assume all AI-related subs have members whose sole purpose is to create a favourable impression of the service.

          2 votes
      2. [2]
        em-dash
        Link Parent

        One of the hardest things for me to internalize in the world of business has been that businesses actually care much more about being able to blame failures on someone else than about not having failures. It's an alien mindset to me, but realizing that it's a mindset businesses have explains so many weird decisions.

        2 votes
        1. Lia
          Link Parent

          Big businesses that are able to absorb a bunch of failures, sure.

          By contrast, small ones whose existence depends on their product or service creating real value can't fail too badly. They are often running on one key idea that sets them apart. My question is how come some businesses don't seem to have any problem sharing this core idea with an LLM, and why the related risks aren't being publicly discussed.

          1 vote
  3. [3]
    tobii
    Link

    Does anyone have numbers for how that 5% compares to non-GenAI pilots?

    9 votes
    1. Aerrol
      Link Parent

      There's a little bit in the article, but it's not very convincing or clear ("one-third as often" works out to roughly a 22% success rate for internal builds):

      How companies adopt AI is crucial. Purchasing AI tools from specialized vendors and building partnerships succeed about 67% of the time, while internal builds succeed only one-third as often.

      This finding is particularly relevant in financial services and other highly regulated sectors, where many firms are building their own proprietary generative AI systems in 2025. Yet, MIT’s research suggests companies see far more failures when going solo.

      Companies surveyed were often hesitant to share failure rates, Challapally noted. “Almost everywhere we went, enterprises were trying to build their own tool,” he said, but the data showed purchased solutions delivered more reliable results.

      Other key factors for success include empowering line managers—not just central AI labs—to drive adoption, and selecting tools that can integrate deeply and adapt over time.

      4 votes
    2. parsley
      Link Parent

      Not sure about pilots/PoCs, but the failure rate for software projects in general is fairly high. I don't have a source for this number, but I think about 3 out of 4 projects end up failing to some degree.

      Honestly, considering how much of a moving target GenAI is (every week there is a new model, or framework, or API, or tool, ...), the quality issues, and the fear that we are probably in the middle of a bubble, a 5% success rate sounds very good.

      2 votes
  4. [2]
    skybrian
    Link

    Apparently, 5% succeed? Perhaps with time, others will eventually hit on a more effective approach.

    4 votes
    1. xk3
      Link Parent

      It looks like a big reason why they fail is that for complex work GenAI is not at all reliable, but for simple work people prefer general-purpose LLMs: https://fortune.com/2025/08/19/shadow-ai-economy-mit-study-genai-divide-llm-chatbots/

      But looking into this more deeply... it looks like NANDA is not at all unbiased. They already have a perspective that they want to build a moat around.

      10 votes
  5. Drynyn
    Link

    People joke about how we made AI to write and draw and do fun creative human stuff instead of fixing toilets.

    Turns out the areas where AI is actually valuable are fixing those toilets, not replacing human interaction.

    3 votes