Much of my academic research involves statistics and crunching through big datasets. To do this, I use computer clusters like Amazon’s EC2 and a cluster at the Harvard-MIT Data Center (HMDC). I will frequently kick off a job to run overnight on the full HMDC cluster of roughly 100 computers, and some of my friends do so nearly every night on similar clusters. Like many researchers and engineers, I pay nothing to kick off a big job. That said, computers consume a lot of energy, so I did a little back-of-the-envelope calculation to figure out what the cost in resources might add up to.
An overnight job that uses a 100-computer cluster might use 800 computer-hours. Although power efficiency varies hugely between computers, most statistical analysis is CPU intensive and should come close to maximizing power consumption. According to a few sources [e.g., 1 2 3], 200 watts is a conservative estimate of how much a modern multi-CPU server will draw under high load, and it doesn’t include other costs like cooling. Using this estimate, the overnight job on 100 machines would easily use 160 kilowatt-hours (kWh) of energy.
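For anyone who wants to check the arithmetic, here it is as a few lines of Python; the 8-hour overnight window and the 200-watt-per-machine draw are the assumptions described above:

```python
machines = 100           # computers in the cluster
hours = 8                # an overnight run: 100 machines * 8 hours = 800 computer-hours
watts_per_machine = 200  # conservative draw under full CPU load, not counting cooling

computer_hours = machines * hours
energy_kwh = computer_hours * watts_per_machine / 1000
print(f"{computer_hours} computer-hours -> {energy_kwh:.0f} kWh")  # 800 computer-hours -> 160 kWh
```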
In Massachusetts, most of our power comes from coal. This page suggests that an efficient coal plant will generate 2,460 kWh for each ton of coal. That means that one overnight job would burn about 59 kg (130 lbs) of coal. In the process, it would also create about 153 kg (338 lbs) of CO2 and a bit under half a kilogram (about 1 lb) each of nitrogen oxides and sulfur dioxide. It’s a very rough estimate, but it certainly adds some pressure to make sure the research counts!
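Continuing that sketch, the coal figure follows directly from the 2,460 kWh-per-ton number (the 130 lbs figure implies a 2,000 lb short ton); the CO2 figure isn’t derived here, but it works out to roughly 0.96 kg of CO2 per kWh, which is in the right ballpark for coal generation:

```python
energy_kwh = 160          # from the estimate above
kwh_per_ton_coal = 2460   # efficient coal plant, per the page cited above
kg_per_short_ton = 907.2

coal_kg = energy_kwh / kwh_per_ton_coal * kg_per_short_ton  # ~59 kg
implied_co2_rate = 153 / energy_kwh                         # ~0.96 kg CO2 per kWh
print(f"coal: {coal_kg:.0f} kg, implied emission rate: {implied_co2_rate:.2f} kg CO2/kWh")
```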
Of course, I’ve written some free software that runs on many thousands of computers and servers. How many tons of coal are burnt to support laziness or a lack of optimization in my software? What is the coal cost of choosing to write a program in a less efficient but easier-to-write higher-level programming language like Python or Ruby instead of writing a more efficient version in C?
One thing to bear in mind (at least in the UK, maybe not so much where you are) is that night-time electricity production is often the most efficient due to the mix of generation systems in use: the ones spun up on demand to meet peak consumption are generally the least efficient. You can see this for the UK at http://realtimecarbon.org/, where the graph dips below average at night and rises above it at peak times.
What about the cost of supporting the developers who write in the low-level languages? That’s an awful lot of American-style consumption that has to be paid for, and a lot of fossil fuel burned getting them to and from work.
I think these are valid points and I’ll add a bit to them. I work with a project that helps scientists use HPC for bioinformatics (among other things…). Our colleague at the HPC center has often noted that spending $10,000 to hire a good programmer to rewrite things in C or C++ is far more cost effective than trying to optimize scripts in Perl, Python, and so on. He didn’t bring up the cost of the power, but he did discuss runtime and the sheer cost of building and maintaining the network, and wanting to get the most efficient use out of it. Jobs that take longer to run mean less availability for other science and less cost effectiveness overall. Paying the staff who run a large cluster like this, and paying for the time to use it, often costs far more than rewriting in a more suitable language.
If using a proprietary compiler decreases a job’s runtime by 15%, is it better for society to use the proprietary compiler, or stick with GCC?
Jo: It’s true that there are multiple things in the world one might (and should!) care about and that sometimes they might come into conflict.
It seems that in the case you suggest, the best thing would be to improve GCC until it is at least as good as the proprietary compiler, so that developers would not need to choose.
Keep in mind that the average first-world citizen uses maybe 8,000 watts of continuous power, and the average US resident around 11,000. Really conscientious but still-on-the-grid people can get it down to 5,000 by not driving or flying, not eating beef or lamb, etc. So 200 W × 100 machines means you’ve roughly tripled your impact for the length of the job. This is not trivial, but not huge either, especially compared to air travel or a suburban high-driving, high-AC lifestyle in a hot climate, unless you are doing it every single night. George B.’s comments are also quite relevant: the carbon impact of those watts is much lower at night.
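To spell out that comparison (treating the 11,000-watt figure as an average continuous power footprint, as above):

```python
baseline_w = 11_000  # rough continuous power footprint of an average US resident, per the comment
job_w = 200 * 100    # 100 machines drawing about 200 W each

ratio = (baseline_w + job_w) / baseline_w
print(f"while the job runs: about {ratio:.1f}x the usual impact")  # ~2.8x, i.e. roughly tripled
```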
I’m missing something. How can you burn 59 kg of coal, yet produce 153 kg of CO2 from it?
What happened to conservation of mass?
Fuzzy Math: I was confused by this as well, but the number is correct.
Remember, burning coal requires both the coal and oxygen from the air. One mole of carbon weighs 12 grams and, if the burning were completely efficient, it would turn into one mole of CO2, which weighs 44 grams. Oxygen is more massive than carbon, after all, and there are two oxygen atoms in every molecule of CO2. So each kilogram of carbon burned produces about 3.7 kilograms of CO2; coal is only partly carbon (roughly 70 percent, judging from these numbers), which is why 59 kg of coal yields about 153 kg of CO2 rather than 216 kg. The oxygen doesn’t need to be shipped there, but it still ends up in the output.
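As a quick sanity check on that explanation, here is the stoichiometry in a few lines of Python; the roughly 70 percent carbon content of the coal is inferred from the post’s own numbers rather than measured:

```python
coal_kg = 59
carbon_fraction = 0.71          # inferred from the figures above; typical of bituminous coal
molar_mass_c, molar_mass_co2 = 12.0, 44.0

carbon_kg = coal_kg * carbon_fraction
co2_kg = carbon_kg * (molar_mass_co2 / molar_mass_c)  # each kg of carbon becomes ~3.7 kg of CO2
print(f"{coal_kg} kg of coal -> about {co2_kg:.0f} kg of CO2")  # ~154 kg, close to the 153 kg above
```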