To be honest I don’t really see what the relationship is. Datacenters use water, sure, but what exactly is the mechanism by which that would affect her well? The article never explains. The closest it gets is:
Gordon Rogers is the executive director of Flint Riverkeeper, a non-profit advocacy group that monitors the health of Georgia's Flint River. He takes us to a creek downhill from a new construction site for a data centre being built by US firm Quality Technology Services (QTS).
George Dietz, a local volunteer, scoops up a sample of the water into a clear plastic bag. It's cloudy and brown.
"It shouldn't be that colour," he says. To him, this suggests sediment runoff - and possibly flocculants. These are chemicals used in construction to bind soil and prevent erosion, but if they escape into the water system, they can create sludge.
Uh, ok? So is it about construction in general or data center water usage? That’s also an entirely different situation with an entirely different data center.
Unsure about her particular well, but data centers certainly contribute to runoff pollution, which could impact local drinking water sources, since so many of them continue to rely on non-green energy sources. There was some additional talk about data centers and their impact on the environment here (specifically on water here), and how a lot of these data centers require fresh/clean water, often opting for public drinking water to the detriment of the local community.
None of the things you mention would have a local effect on a neighboring well, though. Non-green energy sources would typically mean coal- or natural-gas-fired power plants, which are remote from the facility. Use of public drinking water would be piped in from the public water supply. Their wastewater would be piped out via public sewage. This all relies on local building codes and the local water system, of course. It is possible, yet unlikely, that they have a well and some sort of septic system for the data center.
Some things I can think of that could affect local wells would be increased runoff from parking lots/structures, and the roof of the building if their storm water is not properly mitigated.
I can see how construction could potentially cause problems. Most companies do mitigate to the extent they can. My personal feelings don’t give much benefit of the doubt to Meta, though.
That being said, I can’t speak to anything other than the data centers (DCs) that I’ve worked at, but I can state categorically that those all used closed-loop systems for cooling. Yes, there’s water use for personnel (i.e., bathrooms, cleaning, and kitchens), but the truth is there are relatively few people working in a DC at any given time.
Even counting the cleaning crew, there may be no more than a couple dozen people at any given moment. Contrast this with the average office building with possibly hundreds of people constantly moving in and out.
Obligatory link to a relevant blog post I recently read.
I found it pretty well explained (even if it repeats the same concept many times through many different metaphors; people often need to hear the same thing over and over before they finally engage with the concept).
The main gist of it is that LLM use (personal or overall) has a negligible ecological impact compared to other human activity, and that it's a waste of mental resources to worry about LLMs' ecological impact. (Which doesn't mean LLMs' other problems are not problems, but we should be adult enough to understand that something can be bad without attributing every bad characteristic to it.)
In a lot of these conversations, I have a very strong urge to grab the other person’s shoulders and say “This is 3 Wh of energy we’re talking about!!!! We agree that’s the number! 3 Wh!!!!!! That’s so small!!!! Don’t you know this?!?!?! What happened to the climate movement????? All my climate friends used to know what 3 Wh meant!!! AAAAAAHHHHH!!!!!". This would not be very mature, so instead I post 9,000 word blog posts to let off the steam.
When I hear people say “50 ChatGPT searches use a whole bottle of water!” I think they’re internally comparing this to the few times a year they buy a bottle of water. That makes ChatGPT’s water use seem like a lot. They’re not comparing it to the 1200 bottles of water they use every single day in their ordinary lives.
This means that every single day, the average American uses enough water for 24,000-61,000 ChatGPT prompts.
Suppose you gave yourself an energy budget for goofy ChatGPT prompts. Every year, you’re allowed to use it for 1,000 goofy things (a calculator, making funny text, a simple search you could have used Google for). At the end, all those prompts together would have used the same amount of energy as running a single clothes dryer a single time for half an hour. This would increase your energy budget by 0.03%. This is not enough to worry about. If you feel like it, please goof around on ChatGPT.
it’s as if everyone suddenly started obsessing over whether the digital clocks in our bedrooms use too much energy and began condemning them as a major problem. It’s sad to see the climate movement get distracted. We have gigantic problems and real enemies to deal with. ChatGPT isn’t one of them.
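If you want to sanity-check those quotes yourself, here's a rough sketch of the arithmetic. The 3 Wh per prompt, the 1,000-prompt budget, and the "1,200 bottles a day" figure are the blog's; the dryer wattage and the annual household electricity figure are my own ballpark assumptions, so treat the percentages as illustrative rather than authoritative:

```python
# Rough sanity check of the blog's arithmetic; figures marked "assumption" are mine.
WH_PER_PROMPT = 3                 # blog's figure: ~3 Wh per ChatGPT prompt
GOOFY_PROMPTS_PER_YEAR = 1_000    # blog's "goofy prompt" budget

DRYER_KW = 6                      # assumption: a large electric clothes dryer
HOUSEHOLD_KWH_PER_YEAR = 10_000   # assumption: rough US household electricity use

goofy_kwh = WH_PER_PROMPT * GOOFY_PROMPTS_PER_YEAR / 1_000
print(f"1,000 goofy prompts ~ {goofy_kwh:.0f} kWh "
      f"~ {goofy_kwh / DRYER_KW * 60:.0f} min of dryer time "
      f"~ {goofy_kwh / HOUSEHOLD_KWH_PER_YEAR:.2%} of a household's annual electricity")

# Water side: "a bottle per 50 searches" implies ~10 mL per prompt.
ml_per_prompt = 500 / 50               # 500 mL bottle / 50 prompts
daily_water_ml = 1_200 * 500           # the blog's 1,200 bottles per day
print(f"Daily water use ~ {daily_water_ml / ml_per_prompt:,.0f} prompts' worth")
# The quoted 24,000 lower bound corresponds to ~25 mL per prompt instead.
```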
Most of the energy use comes from training, not the use by end users afterwards.
I don't know how to say this in a way that isn't off-putting, but that's really wrong (edit: looks like teaearlgreycold had a point and I was overconfident). Please please please read the blog post even if (especially if) you feel like you are already particularly well informed. It addresses a lot of common misconceptions.
Training GPT-4 used 50 GWh of energy. Like the 20,000 households point, this number looks ridiculously large if you don’t consider how many people are using ChatGPT. The numbers here are very uncertain, but my best guess based on available data says that since GPT-4 was trained, it answered around 50 billion prompts, until it was mostly replaced with GPT-4o. GPT-4 and other models were used for a lot more than ChatGPT — Notion, Grammarly, Jasper, AirTable, Khan Academy, Duolingo, GitHub Copilot — but to be charitable let’s assume it was only used for chatbots. Dividing 50GWh by 50 billion prompts gives us 1 Wh per prompt. This means that including the cost of training the model (and assuming each prompt is using 3 Wh) raises the energy cost per prompt by 33 percent, from the equivalent of 10 Google searches to 13. That’s not nothing, but it’s not a huge increase per prompt.
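A quick check of that division, taking the quoted 50 GWh and 50 billion prompts at face value (both are rough estimates; the ~0.3 Wh per Google search is the commonly cited figure implied by the "10 searches" comparison, not something I can verify):

```python
# Amortizing the quoted training energy over the quoted lifetime prompt count.
TRAINING_WH = 50e9        # 50 GWh expressed in Wh
PROMPTS = 50e9            # estimated prompts answered by GPT-4
INFERENCE_WH = 3          # the blog's per-prompt inference figure
GOOGLE_SEARCH_WH = 0.3    # commonly cited figure implied by the "10 searches" line

training_wh_per_prompt = TRAINING_WH / PROMPTS          # 1 Wh
total_wh = INFERENCE_WH + training_wh_per_prompt        # 4 Wh
print(f"Training adds {training_wh_per_prompt:.0f} Wh per prompt "
      f"(+{training_wh_per_prompt / INFERENCE_WH:.0%}), "
      f"or ~{total_wh / GOOGLE_SEARCH_WH:.0f} Google searches instead of "
      f"{INFERENCE_WH / GOOGLE_SEARCH_WH:.0f}")
```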
In the paragraph immediately following:
There are a lot more AI models being trained, collectively using a lot of energy.
(emphasis mine) and then:
It seems like the only reasonable way to judge how bad this is is to divide the cost of training by how many prompts we can expect the specific model to deal with.
I think if you look at the AI ecosystem as a whole, this might not be true. Think of all of the models and experiments that never get released. My gut feeling on the naive training/use ratio was off, but I don't think the 33% energy increase is the full story. Most models are not anywhere near as successful as GPT-4, so they would have a worse ratio.
Models made these days are far more expensive to train than GPT-4. Since GPT-4's release we've hit serious diminishing returns due to a lack of training data. This is at odds with investor demand for more capability. The solution is to throw more money (energy) at the problem. That goes for both AI developers and NVIDIA. In the face of absolute demand for FLOPs they've made chips that run at ludicrous TDPs. They're trying to cheat on their roadmap by pumping more power into their cards, which means each rack is even more power hungry, while we also see companies dramatically increase the number of racks deployed.
Here's where the 50GWh number comes from:
https://medium.com/data-science/the-carbon-footprint-of-gpt-4-d6c676eb21ae
it’s estimated that it took 90–100 days to train GPT-4
That’s 90 or 100 * 24 = 2,160 to 2,600 hours per server.
we can multiply the number of hours by 6.5 kW, and we get that during training, each server may have consumed 14,040 to 16,900 KWh of electricity.
Let’s multiply that by the 3,125 servers needed to host 25,000 GPU’S: 3,125 * 14,040 to 16,900 KWh = 43,875,000 to 52,812,500 KWh.
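For reference, here's that estimate laid out end to end. All of the inputs (25,000 GPUs, 8 GPUs per server, 6.5 kW per server) are the linked article's assumptions; the only thing I've added is a 2,400-hour row, since 100 x 24 is 2,400 rather than the 2,600 used above:

```python
# Reproducing the linked article's GPT-4 training-energy estimate.
GPUS = 25_000
SERVERS = GPUS // 8            # 8 GPUs per server -> 3,125 servers
KW_PER_SERVER = 6.5            # article's assumed draw per server

for label, hours in [("90 days", 90 * 24), ("100 days", 100 * 24),
                     ("as quoted", 2_600)]:
    kwh_per_server = KW_PER_SERVER * hours
    print(f"{label}: {hours:,} h -> {kwh_per_server:,.0f} kWh/server "
          f"-> ~{kwh_per_server * SERVERS / 1e6:.1f} GWh total")
```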
Do you think OpenAI only had GPUs spun up for 90 days to train GPT-4? I can't find anything online, but it was probably under development for much longer - with GPUs at full bore for most of that time. AI companies see idle GPUs as lost money. From what I've seen firsthand with smaller models, they go through many training runs during the development lifecycle. Also consider the energy used to fine-tune and update it after release. They updated GPT-4 every few months while it was out to continually increase "safety" and add more recent information to the training data. This certainly did not require the full 50GWh each time but would meaningfully affect the energy use. And each update itself would require some experimentation.
Of course they want to save money, so they must have done partial training runs whenever possible and terminated experiments early once it was clear they should be stopped.
You're right that the estimate is likely lowballing real consumption. But it was just an estimate to show the order of magnitude involved, not an exact number to nitpick...
Sure, let's assume the servers that trained GPT-4 ran 24/7/365 and that most other LLM companies' models are not used as much as ChatGPT... Let's assume that total consumption is actually 10 times our initial assumption... And that's still low enough that I can offset it by taking a cold shower instead of a hot one once a week.
I think the main point still stands. (To clarify, I agree with you now that training might cost as much as usage, though we can't be sure, but I think overall consumption is still negligible, just like I wouldn't care if my alarm clock were 10 times more energy hungry.)
I’m not trying to make anyone feel bad about using AI. I’m just here to discuss numbers.
I totally get that!
Do you think we could give a more accurate approximation of overall consumption by going global?
Like, how many GPU-hours are owned by AI companies, assuming they run 24/7? Could we just assume the H100 GPU represents most of the AI companies' compute? Each is 700 watts max TDP, so (x 8,760 hours in the year) about 6 million watt-hours per year. Times 2 million H100s sold in 2024 worldwide --> roughly 12,000 gigawatt-hours.
And divide that by the total number of prompts across all models... Not sure how to get that; maybe assume it's proportional to the number of H100s each company owns, so we could use OpenAI's numbers to estimate the total? But then we might as well just estimate ChatGPT alone:
Anyway, OpenAI owns 700,000 H100s and has 500 million "users" (couldn't find good sources for those numbers).
So 700,000 x 6 million Wh, divided by 500 million users, is about 8,400 watt-hours per user per year.
That's about equal to running a normal oven for 5 hours. Which is much more than I expected (it seems you are right that training is more than actual inference).
(Oops, edited after you answered.)
Also, I'm pretty sure the other models are also used by the same 0.5 billion users, so the fairest approach would be to count 3 times that. Which is still negligible.
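Putting that back-of-the-envelope calculation in one place, using the same loose inputs from above (700 W TDP at full power 24/7, 2 million H100s sold in 2024, 700,000 of them at OpenAI, 500 million users; none of these are verified figures):

```python
# Rough upper bound: every H100 running at max TDP around the clock.
TDP_KW = 0.7
KWH_PER_GPU_YEAR = TDP_KW * 8_760          # ~6,132 kWh, the "6 million Wh" above

H100_SOLD_2024 = 2_000_000
OPENAI_H100 = 700_000
OPENAI_USERS = 500_000_000

print(f"All 2024 H100s: ~{H100_SOLD_2024 * KWH_PER_GPU_YEAR / 1e6:,.0f} GWh/year")
kwh_per_user = OPENAI_H100 * KWH_PER_GPU_YEAR / OPENAI_USERS
print(f"OpenAI share: ~{kwh_per_user:.1f} kWh per user per year "
      f"(~{kwh_per_user / 2:.0f} hours of a ~2 kW oven)")
```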
I would just take the quantity of GPUs they own and multiply by 80-90% utilization, multiplied by time. Add in overhead for the rest of the data center. You’ll have to do a lot of estimating though.
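A sketch of that adjustment, with utilization set to the middle of the 80-90% range above and a PUE-style overhead factor that is purely a placeholder of mine:

```python
# Fleet energy discounted by utilization and inflated by facility overhead.
def fleet_kwh_per_year(gpus: int, tdp_kw: float = 0.7,
                       utilization: float = 0.85, overhead: float = 1.3) -> float:
    """GPU draw x utilization x hours in a year x PUE-like overhead (placeholder values)."""
    return gpus * tdp_kw * utilization * 8_760 * overhead

# Example: an OpenAI-sized fleet of 700,000 H100s, purely illustrative.
print(f"~{fleet_kwh_per_year(700_000) / 1e6:,.0f} GWh/year")
```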
I've heard that the high electricity and water use are not necessarily from individual prompting, but from the fact that every available CPU-second is used for further training of models. So, boycotting the models does literally nothing to limit/lower their resource usage.
I think a lot of the "x Wh per query" statements are misleading; even if there are zero queries to ChatGPT, it still represents hundreds of thousands of 100%-uptime GPUs, which take an absurd amount of energy and water to maintain.
Boycotting LLMs makes their training and their supporting data centres less profitable. It's just slow / long term. Also, it is against the tide and won't succeed in 2025/2026.
This article is not the best example, but companies are also building data centers in desert regions where every drop of water is already needed for the people who live there.
I think we should be more nuanced about the word "use". It's not like datacenters are doing electrolysis on the water. The water exists before and after.
With, say, a factory that dyes clothes, they also "use" water. The cost associated with the use is that the water is contaminated with dyes and must be treated. If they dump it without treating it, then not only will it need to be treated later, but it also damages any organisms that try to use the water in between.
Datacenters use water in two main ways. The most common is in a closed loop, as a heat exchanger; this essentially works like a water-cooling loop in a desktop computer, just much bigger. The second, less common way is evaporative cooling.
For the first type, to be frank, almost no water is being "used". The water is going around in a loop. It's not being contaminated with anything, and while some water will leak and need to be replenished, it's ultimately very marginal.
The second, I mean, the water just becomes water vapor? It'll just enter the water cycle again. It's not like the water is leaving the region, or becoming unusable.
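To put a rough number on the evaporative case: using only the latent heat of vaporization of water (~2.26 MJ/kg) and ignoring blowdown and other losses, so this is a lower-bound sketch rather than a measured figure:

```python
# Lower bound on water evaporated per kWh of heat rejected by evaporative cooling.
LATENT_HEAT_MJ_PER_KG = 2.26   # latent heat of vaporization of water
MJ_PER_KWH = 3.6

litres_per_kwh = MJ_PER_KWH / LATENT_HEAT_MJ_PER_KG   # 1 kg of water ~ 1 L
print(f"~{litres_per_kwh:.1f} L evaporated per kWh of heat rejected")
```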
In the desert, water that evaporates likely won't reenter as liquid water for months or years.
Lots of places are depleting the water table, the aquifer.
There is a DC planned to go in near me, and there are some people who are fervently against it for a variety of reasons that range from reasonable concern (noise, a valid concern but not applicable in this case) to the outlandish (it's a partnership with Los Alamos, who created the atomic bomb, which makes them evil, obviously /s).
I really do wonder if AI will be a net positive for humanity. I saw a video yesterday where Bill Gates himself says he’s not entirely sure: https://youtube.com/shorts/uoU_1KORocQ
I don't see how it could be unless its value was socialized. I also don't see how it could be without an incredibly thoughtful approach that I don't see coming from a hyper-capitalist society.