36 votes

Google has released data on how much energy an AI prompt uses

16 comments

  1. [10]
    Eric_the_Cerise
    Link
    Just at a glance, I already feel like this is an(other) instance of lying with statistics ... For this data, the mean would be much more illuminating – and more damning – than the median.

    24 votes
    1. [4]
      Deely
      Link Parent
      Yeah, no mention of training, idle spending, infrastructure.

      Another large portion of the energy is used by equipment needed to support AI-specific hardware: The host machine’s CPU and memory account for another 25% of the total energy used.

      It's like the server contains only a bare CPU and memory and no other chips, no other controllers, no periphery, etc.

      16 votes
      1. Macha
        Link Parent
        Unless you're running a bunch of spinning HDDs or ultra high end SSDs, CPU + GPU + PSU loss is a pretty good approximation of full system power draw.

        13 votes
      2. TemulentTeatotaler
        (edited )
        Link Parent
        They talk about some of that in the sentence right before/after what you quoted:

        TPUs...account for just 58% of the total electricity demand of 0.24 watt-hours. Another large portion of the energy is used by equipment needed to support AI-specific hardware: The host machine’s CPU and memory account for another 25% of the total energy used. There’s also backup equipment needed in case something fails—these idle machines account for 10% of the total. The final 8% is from overhead associated with running a data center, including cooling and power conversion.

        It would be great if they covered training costs, but in their report they're upfront about it being a subject for future research. This was just a dive into the average person's per-query in-situ impact (no networking/end-user device/training/data storage considered) and how it changed over a year.

        It would also be nice to have the mean, or a breakdown by patterns of use, but the median is how you'd represent the typical use. Heavy users, outliers, and net consumption are all important, but typical use is the sort of thing that gives some frame of reference on whether me using a more semantic LLM search to remember some philosophy about getting lost in a city is killing the planet (or is viable for Google).

        The median query of 12 months ago is likely pretty similar to the current ones, and it's nice to see a 33x improvement there, which came down to a "23x reduction from model improvements, and a 1.4x reduction from improved machine utilization". Those model improvements come from stuff like mixture-of-experts, which only uses a subset of the model per query, and the distillation of larger models that DeepSeek showed off.
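
        (A quick sketch of that arithmetic, using only the factors quoted above; the implied May 2024 figure is my own back-calculation.)

        ```python
        # Quick consistency check on the quoted improvement factors: the ~33x drop
        # in median per-prompt energy is attributed to a 23x reduction from model
        # improvements times a 1.4x reduction from better machine utilization.
        model_improvement = 23   # "23x reduction from model improvements"
        utilization = 1.4        # "1.4x reduction from improved machine utilization"

        print(f"combined: {model_improvement * utilization:.1f}x")  # 32.2x, reported as ~33x

        # Implied median prompt a year earlier, given the 0.24 Wh reported now:
        print(f"May 2024 median: {0.24 * 33:.1f} Wh")  # roughly 8 Wh
        ```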

        10 votes
      3. PendingKetchup
        Link Parent
        I'm not sure the chipset on a server is an appreciable power draw? The estimate here is probably good to a factor of 2, which is probably good enough for deciding whether to send another marginal query to the model or not.

        That is, if you accept the framing that you are likely to be sending a median query (with a median chance of cache hits?) and that what really matters is the marginal query cost (and not the effect on their internal metrics, which could lead them to do more training runs or design more AI products).

        6 votes
    2. [5]
      skybrian
      Link Parent
      I don’t think the mean would be very useful because there is a long tail of very expensive requests. It wouldn’t adequately represent either cheap or expensive requests. (If you take the average income of some regular people and Bill Gates, for most purposes, it’s a meaningless number.)

      The median at least tells us something about cheaper requests. If that’s the sort of request you do, you can stop worrying about those.

      But we do need more than just the median. A single number is never going to be an adequate summary when there is a lot of variation.
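
      A toy illustration of that point, with made-up numbers rather than Google's actual distribution:

      ```python
      import statistics

      # Hypothetical per-prompt energy figures (Wh): mostly cheap text prompts
      # plus a small long tail of expensive ones. Numbers are invented purely
      # to show how the tail moves the mean but not the median.
      energy_wh = [0.2] * 900 + [0.3] * 80 + [5.0] * 15 + [50.0] * 5

      print(f"median: {statistics.median(energy_wh):.2f} Wh")  # 0.20 Wh, the "typical" prompt
      print(f"mean:   {statistics.mean(energy_wh):.2f} Wh")    # 0.53 Wh, dragged up by the tail
      ```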

      10 votes
      1. [4]
        Eric_the_Cerise
        Link Parent
        For the Bill Gates case, mean is misleading. But for this Google case, I think median is misleading. It understates what people would instinctively assume is the "average" energy use.

        That was already my impression before I saw that they explicitly excluded image-generation and other inherently higher-energy queries.

        I mean, if one AI picture costs the energy of 10,000 AI text replies, so to speak....

        6 votes
        1. [3]
          skybrian
          Link Parent
          I think image generation should be measured separately. There's no point in mixing up text and image generation since they work so differently, often using entirely different algorithms.

          5 votes
          1. [2]
            Eric_the_Cerise
            Link Parent
            Except image generation wasn't measured separately; it wasn't measured at all, or – much more likely – it was measured, but not reported.

            Ultimately, this is Google self-reporting. They are under no requirement to be complete, or accurate, or even honest, for that matter. We actually have very little chance of confirming or refuting the numbers they have published, and even if they are caught lying about the numbers, they face no consequences.

            4 votes
            1. skybrian
              Link Parent
              Yes, it's just a start. Nobody else is reporting numbers like this at all. Hopefully they'll share more information later.

              5 votes
  2. skybrian
    Link
    From the article:

    In total, the median prompt—one that falls in the middle of the range of energy demand—consumes 0.24 watt-hours of electricity, the equivalent of running a standard microwave for about one second. The company also provided average estimates for the water consumption and carbon emissions associated with a text prompt to Gemini.

    [T]he AI chips—in this case, Google’s custom TPUs, the company’s proprietary equivalent of GPUs—account for just 58% of the total electricity demand of 0.24 watt-hours.

    Another large portion of the energy is used by equipment needed to support AI-specific hardware: The host machine’s CPU and memory account for another 25% of the total energy used. There’s also backup equipment needed in case something fails—these idle machines account for 10% of the total. The final 8% is from overhead associated with running a data center, including cooling and power conversion.

    Google’s figure, however, is not representative of all queries submitted to Gemini: The company handles a huge variety of requests, and this estimate is calculated from a median energy demand, one that falls in the middle of the range of possible queries.

    So some Gemini prompts use much more energy than this: Dean gives the example of feeding dozens of books into Gemini and asking it to produce a detailed synopsis of their content. “That’s the kind of thing that will probably take more energy than the median prompt,” Dean says. Using a reasoning model could also have a higher associated energy demand because these models take more steps before producing an answer.

    This report was also strictly limited to text prompts, so it doesn’t represent what’s needed to generate an image or a video. (Other analyses, including one in MIT Technology Review’s Power Hungry series earlier this year, show that these tasks can require much more energy.)

    The report also finds that the total energy used to field a Gemini query has fallen dramatically over time. The median Gemini prompt used 33 times more energy in May 2024 than it did in May 2025, according to Google. The company points to advancements in its models and other software optimizations for the improvements.

    AI data centers also consume water for cooling, and Google estimates that each prompt consumes 0.26 milliliters of water, or about five drops.
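
    For reference, a small sketch of the arithmetic behind those figures: the 0.24 Wh total, the percentage split, and the 0.26 mL of water are from the article, while the 1,000 W microwave and 0.05 mL-per-drop conversions are my own assumptions.

    ```python
    # Per-prompt energy breakdown from the report (shares sum to 101% due to rounding).
    TOTAL_WH = 0.24
    shares = {
        "TPUs (AI accelerators)": 0.58,
        "host CPU + memory": 0.25,
        "idle backup machines": 0.10,
        "data-center overhead (cooling, power conversion)": 0.08,
    }
    for part, share in shares.items():
        print(f"{part}: {TOTAL_WH * share:.3f} Wh")

    # Microwave comparison: an assumed 1,000 W microwave draws ~0.28 Wh per second,
    # so 0.24 Wh is a bit under one second of microwaving.
    print(f"microwave seconds: {TOTAL_WH / (1000 / 3600):.2f}")

    # Water: 0.26 mL per prompt at an assumed ~0.05 mL per drop is about five drops.
    print(f"drops: {0.26 / 0.05:.1f}")
    ```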

    13 votes
  3. [3]
    Bullmaestro
    Link
    This report was also strictly limited to text prompts, so it doesn’t represent what’s needed to generate an image or a video. (Other analyses, including one in MIT Technology Review’s Power Hungry series earlier this year, show that these tasks can require much more energy.)

    Given the rise in popularity of Veo 3, and how much effort it genuinely takes to make an image, let alone a 60FPS video, I can imagine these claims of 0.24 watt-hours of electricity per prompt are substantially understated.

    At this rate, we are going to destroy civilization unless we ditch AI. And I'm not talking about some dystopian android uprising either. We are going to drive ourselves to collapse either through mass unemployment or through the sheer amount of resources AI data centres use.

    5 votes
    1. [2]
      stu2b50
      Link Parent
      Image generation takes less energy. The models are substantially smaller, because image data works much more cleanly with linear algebra. You can run image generation models, locally, on your phone. We’ve had image generation since the mid 2010s - useful language models were much harder.

      14 votes
      1. PendingKetchup
        Link Parent
        But video takes more, I think. Maybe text is worse with the truly huge models, but you can push acceptable text out of a 20b model and acceptable video out of a 14b model; the text model can run at about reading speed on equipment you can buy, while the video model runs noninteractively at something like 2 minutes per 5 seconds of output. So call that 2 minutes of microwaving per plausibly useful video.
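
        Roughly quantifying that (the 2 minutes per 5 seconds is from the comment above; the ~450 W whole-system draw for a consumer GPU box is my own assumption):

        ```python
        # Back-of-the-envelope energy for a locally generated 5-second clip.
        system_draw_w = 450   # assumed draw of a consumer GPU box while generating
        gen_minutes = 2       # ~2 minutes of compute per 5 s of output (from the comment)

        clip_wh = system_draw_w * gen_minutes / 60
        print(f"per 5 s clip: {clip_wh:.0f} Wh")                        # ~15 Wh
        print(f"vs 0.24 Wh median text prompt: {clip_wh / 0.24:.0f}x")  # ~60x
        ```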

        10 votes
  4. onekuosora
    Link
    So I vaguely remember the cost of traditional search being like .3Wh. A quick search found a reddit post saying similar, but does anyone know or have a better source for such info?

    It's not really important, but it seems like it could give a point of reference, especially since I see AI results being pushed to the point that I find it likely normal queries will eventually become a special/advanced search.
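
    Taking that half-remembered ~0.3 Wh-per-search figure at face value (it is unsourced here, as noted), the comparison is just:

    ```python
    # Unverified ~0.3 Wh-per-search figure vs Google's reported 0.24 Wh median prompt.
    search_wh = 0.3    # half-remembered / reddit-sourced, treat with caution
    prompt_wh = 0.24   # Google's reported median Gemini text prompt
    print(f"prompt / search: {prompt_wh / search_wh:.2f}")  # 0.80
    ```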

    1 vote