19

I'm trying to come up with a heuristic to estimate how much energy (say, in Joules) a process or a thread has consumed between two time points. This is on a PC (Linux/x86), not mobile, so the statistics will be used to compare the relative energy efficiency of computations that take similar wall-clock time.

The idea is to collect or sample hardware statistics such as cycle counter, P/C states or dynamic frequency, bus accesses, etc., and come up with a reasonable formula for energy usage between measurements. What I'm asking is whether this is possible, and what this formula might look like.

Some challenges that come to mind:

1) Properly accounting for context switches to other processes (or threads).

2) Properly accounting for the energy used outside the CPU. If we assume negligible I/O, that means mostly RAM. How does allocation amount and/or access pattern affect energy usage? (That is, assuming I have a way to measure dynamic memory allocation to begin with, e.g., with a modified allocator.)

3) Using CPU time as an estimate gives only coarse-grained and often wrong accounting, covers CPU energy usage only, and assumes fixed clock frequencies. It includes time spent waiting on RAM, but doesn't account for it well.

Eitan
  • 2
    How about GPU, disk (IOH controller, head seek (but only if using a spinning disk)), sound generation (amplification), network (wireless radio use). Don't forget the effect of CPU/GPU speed stepping when on battery vs. mains. – Michael Petrotta Dec 19 '10 at 21:30
  • Yes, those are all additional challenges. But for the moment, I'd rather focus on computational processes, i.e., CPU/BUS/memory only. That's hard enough :) – Eitan Dec 19 '10 at 23:04
  • 1
    I am also working on that. My first assumption is to base my measurements on C-states and frequencies (for CPUs supporting that), then add cache-miss events, maybe interrupts... My questions are: 1. Could these measures be relevant? 2. How can these events be related to power estimates? – Jérôme Apr 06 '11 at 07:56
  • I commend your effort, but I have to wonder how accurate you want to be and what for? The baseline CPU consumption on a modern PC varies widely between CPUs. USB, WiFi and each hard drive all add considerably to the overall consumption, and a GPU puts all their consumption to shame. Even if you counted all that - the PSU has a HUGE impact on power consumption. A good, highly rated PSU has far lower losses than a cheap PSU that is at the edge of what it can provide... Going back to accuracy - if you want something realistically accurate then a very simple rough measurement of (more in next comment) – NightDweller Apr 11 '11 at 23:06
  • the overall CPU usage is all that you need - simply because of the many hard-to-know variables which have a much larger impact (like the type of PSU). An i5 at idle consumes about 62 watts, and at full load it consumes about 96.7; assuming linearity you get 0.34 watt for each percentage point. You lose more than that on the differences in operating temperature of the CPU (higher temperature increases power consumption). If an accuracy of give or take 1 watt is acceptable then you can definitely do with just the overall CPU usage. And btw: just getting the measurements is a huge undertaking – NightDweller Apr 11 '11 at 23:17
  • Memory allocation itself is negligible. Memory access is not, but a lot of that will be cached, so you could count cache misses (if there's a register keeping track). The cache also consumes a fair amount of power, though, so perhaps you want to look at cache hints. The other problem is that it's difficult to separate out the different execution units (ALU, FP, MMX/SSE, etc, which may be powered down if they have not been used in a while) and a lot of numbercrunching now uses the GPU... – tc. Apr 12 '11 at 01:37
  • @NightDweller: the target is an HPC machine, so I don't have to bother with devices, screen... GPU (for computation) could be interesting, but for now I would like to focus on the CPU. When you talk about the i5 consumption, where did you get these numbers? Is it TDP? I wonder how reliable this value can be. Concerning your assumption "the overall cpu usage is all that you need", I am surprised that cache misses cannot be viewed as an important input to power estimation, but recent literature goes in your direction. – Jérôme Apr 12 '11 at 14:34
  • @tc: I am looking into memory accesses, but as I said in my previous comment, it seems CPU usage could be the main indicator of energy consumption. As for the different execution units, I think modern architectures are far too complex for them to be worth taking into account, given the precision gained. – Jérôme Apr 12 '11 at 14:35
  • @Jérôme - my numbers are based on benchmarks from [Anandtech](http://www.anandtech.com/show/4048/amds-winter-update-athlon-ii-x3-455-phenom-ii-x2-565-and-phenom-ii-x6-1100t/6) they have a good level of details on their methodology and other system parameters. They also have a nice chart system that lets you view these results for a large collection of processors. If you're looking at HPC - i recall seeing some interesting figures (processing power per watt) for Arm and other embedded processors.(can't find the link at the moment). but the type of computation is very important (more) – NightDweller Apr 12 '11 at 15:04
  • Some algorithms are far more efficient on a vector processor (some code cracking tasks have been demonstrated very efficiently on a GPU) some are better on a stock PC with a good floating point processor, and some work very well on an embedded processor.... – NightDweller Apr 12 '11 at 15:07
  • 1
    What the question asks basically isn't possible. Unless you have only one task/process/thread in the system, you can't sum things up and account for where all the energy came from. Too many resources are shared, too many policies are affected by overall activity, and too many devices have power states kept high/low by background tasks. I've written an answer about how you might measure the power figures and do a 'finger in the air' calculation for CPU activity, but this is going to be wildly inaccurate in many cases. It does have some limited use, however, so it's worth having a go! – John Ripley Apr 12 '11 at 19:11
  • I should add - in a tightly controlled embedded system (e.g a phone or mp3 player), you *can* have some success doing this, because you *can* account for who's keeping devices up/down, and do tricks to measure bus access etc. It's still wildly inaccurate but if some processes are orders of magnitude higher in some stats, it's useful. On a PC, there just isn't this level of control or stats gathering. – John Ripley Apr 12 '11 at 19:14
  • 1
    @Jérôme - There's [a system review on anandtech](http://www.anandtech.com/show/4257/puget-systems-obsidian-solid-as-a-rock/3) i thought you would find interesting. They show a system (Core i5 2500k based PC) that draws as little as 31 watt on Idle and only 92 watt under load(!). This is **overall** power consumption. I think this is a good demonstration of how careful component selection and system design can dramatically influence power consumption. – NightDweller Apr 14 '11 at 12:12

5 Answers

14

You may be able to get a figure for the power consumption of your process, but it will only be correct in isolation. For example, if you ran two processes in parallel, you're unlikely to fit a straight line with good accuracy.

This is hard enough to do on embedded platforms with a complete break-out of every voltage rail, let alone on a PC where your one data point is the wattage from the outlet. Things you'll need to measure and bear in mind:

  • Base load ain't so base. A system idle for many seconds will be in a deeper sleep state than one which isn't. Do you measure 'deep' sleep or just idle? How do you know which you're measuring?
  • Load isn't always linear. Variable voltage: some components shift voltage up/down depending on load and frequency. Temperature: can go either way these days (not just thermal runaway).
  • Power supplies aren't the same efficiency at all loads. If you're measuring outlet wattage, you need to bear this in mind. For example, it could be 50% efficient below 100W, 90% from 100-300W and down to 80% 300W+.
  • Additional processes won't necessarily add linearly. For example, once DDR is out of idle, its base load increases, but additional processes won't make that any worse. This is even more unpredictable with multiple cores and variable frequencies.
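To make the PSU point concrete, here is a minimal sketch of correcting outlet readings for efficiency. It reuses the illustrative 50%/90%/80% figures from the list above, and assumes (my assumption, not a property of any real PSU) that the bands are defined on the wattage measured at the wall. The readings and the function names `psu_efficiency` and `dc_watts` are made up for illustration.

```python
def psu_efficiency(wall_watts: float) -> float:
    """Hypothetical efficiency curve using the illustrative bands above."""
    if wall_watts < 100:
        return 0.50
    if wall_watts <= 300:
        return 0.90
    return 0.80


def dc_watts(wall_watts: float) -> float:
    """Estimate the power actually delivered to the components."""
    return wall_watts * psu_efficiency(wall_watts)


# Made-up wall readings for an idle and a loaded system.
idle_wall, load_wall = 90.0, 200.0

wall_delta = load_wall - idle_wall
dc_delta = dc_watts(load_wall) - dc_watts(idle_wall)

print(f"delta at the wall:   {wall_delta:.0f} W")
print(f"delta after the PSU: {dc_delta:.0f} W")
```

With these numbers the difference seen at the outlet and the difference seen by the components disagree noticeably, which is why a plain subtraction of wall readings can mislead when the two readings fall in different efficiency bands.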

The basic way to measure it is the obvious way: record the number of watts at idle, record the number of watts in use, subtract. You can try running at 50% duty cycle, 25%, 75% and so on, to draw a pretty graph (linear or otherwise). This will show up any non-linearity. Unfortunately, conversion efficiency vs load for both the CPU regulator and the PSU will be the dominant cause of that non-linearity. There's not much you can do to eliminate it unless you have a development version of the motherboard you're playing with (unlikely), or are lucky enough to have a PSU with a graph of efficiency vs load.
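As a rough sketch of that duty-cycle check: take the idle and full-load readings as the endpoints of a straight line and see how far the intermediate readings fall from it. The 62 W and 96.7 W endpoints are the i5 figures quoted in the comments on the question; the intermediate readings are invented purely for illustration, and `linear_prediction` and `deviations` are hypothetical helper names.

```python
# Wall readings in watts, keyed by duty cycle. Only the endpoints come from
# the figures quoted in the comments; the rest are invented placeholders.
readings = {0.0: 62.0, 0.25: 74.0, 0.50: 83.0, 0.75: 90.5, 1.0: 96.7}


def linear_prediction(duty: float, idle: float, full: float) -> float:
    """Watts expected at this duty cycle if load scaled perfectly linearly."""
    return idle + duty * (full - idle)


def deviations(readings: dict) -> dict:
    """Measured watts minus the straight-line prediction, per duty cycle."""
    idle, full = readings[0.0], readings[1.0]
    return {
        duty: watts - linear_prediction(duty, idle, full)
        for duty, watts in sorted(readings.items())
    }


for duty, dev in deviations(readings).items():
    print(f"{duty:4.0%}: {dev:+.2f} W from the straight line")
```

If the deviations are comparable to the noise of your meter, a linear model is probably good enough; if not, you at least know where it breaks down.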

However, it's important to realize that these data points are only correct in isolation. You can do a pretty good job of modeling how these things will sum up in the system, but be very aware that it's only a good approximation at best. Think of it as being equivalent to looking at some C code for an audio codec and estimating how fast it'll run. You can get a good general idea, but expect to be wildly inaccurate when measured in reality.

Edit - Expanding a little as the above doesn't really answer how you might go about it.

Measuring power consumption: get yourself an accurate wattage meter. As I mentioned, unless you have a way to break out the individual voltage rails and measure current, the only measurement you can make is at the outlet. Alternatively, if you have access to the health monitoring status on the motherboard, and that has current (amps) reporting (rare), that can give you good accuracy and fast response times.

So, measure base wattage - pick whatever situation you think of as "base". Run your test, and measure "peak". Subtract, done. Yes, that's fairly obvious. If you have something where the difference is so small it's lost in the noise, you can try measuring energy usage over time instead (e.g. in kWh). Try measuring an hour at idle vs an hour with your process running flat out, and see the total energy difference. Repeat similarly for all types of test you want to perform.
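If your meter only logs instantaneous watts, the hour-long comparison can be approximated by integrating the samples. This is a minimal sketch using the trapezoidal rule; the sample values are invented, and `energy_joules` is a hypothetical helper, not part of any meter's software.

```python
def energy_joules(samples):
    """Trapezoidal integral of (time_in_seconds, watts) samples, in joules."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


# Invented readings: one hour at idle and one hour running flat out.
idle = [(0, 62.0), (1800, 62.4), (3600, 61.8)]
busy = [(0, 95.0), (1800, 96.7), (3600, 96.1)]

extra_j = energy_joules(busy) - energy_joules(idle)

# 1 kWh = 3.6e6 J
print(f"extra energy: {extra_j:.0f} J ({extra_j / 3.6e6:.4f} kWh)")
```

In practice you would want far more than three samples per hour; the sparse data here only keeps the arithmetic easy to follow.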

You will get noticeable wattage differences for heavy CPU, DDR and GPU users. You might notice the difference between L1 vs L2 vs DDR constrained algorithms (DDR uses much more power), if you're careful to note that the L1/L2 constrained algorithms are running faster - you need to account for energy used per "task" not continuous power. You probably won't notice hard disk access (it's actually just a watt or two and lost in the noise in a PC) other than the performance hit. One extra data point worth recording is how much "base" load increases if you have a task waking up every 100ms or so, using 1% of CPU. That's basically what non-deep-sleep idle looks like. (This is a hack and 100ms is a guess)
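A periodic wake-up load like that can be sketched in a few lines. This assumes a simple busy-spin followed by a sleep is an acceptable stand-in for a real background task; the 100 ms period and 1% duty defaults mirror the guess above, and `duty_cycle_load` is a hypothetical name.

```python
import time


def duty_cycle_load(period_s=0.1, duty=0.01, duration_s=1.0):
    """Busy-spin for duty * period, sleep for the rest, until duration_s ends.

    Returns the number of wake-ups performed.
    """
    busy_s = period_s * duty
    wakeups = 0
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        start = time.monotonic()
        while time.monotonic() - start < busy_s:
            pass  # burn CPU for the "on" part of the cycle
        wakeups += 1
        # Sleep for whatever is left of this period.
        time.sleep(max(0.0, period_s - (time.monotonic() - start)))
    return wakeups


print(duty_cycle_load(duration_s=0.5), "wake-ups in about half a second")
```

Run it for a few minutes while watching the meter and compare against a truly idle system. The same function with `duty` set to 0.25, 0.5 or 0.75 gives a crude single-core load for duty-cycle measurements, though the scheduler and timer resolution mean the real duty cycle is only approximate.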

Beware that 1% may be different from 1% at another time, if you have a CPU with frequency changing policies enabled.

One final big note: it's of course energy you should be measuring, just as you titled the question. It's very easy to make the mistake of benchmarking power consumption of one task vs another and to conclude one is more expensive... if you forget about the relative performance of them. This always happens with bad tech journalists benchmarking hard disk vs SSD, for example.
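A small worked example of that trap, with invented numbers: one implementation draws more power but finishes sooner. The sketch counts only the power above idle, which is itself a modelling choice; counting total system power over the run would favour the faster task even more. `Run` is a hypothetical helper class.

```python
from dataclasses import dataclass


@dataclass
class Run:
    name: str
    watts_above_idle: float  # average extra power while the task runs
    seconds: float           # wall-clock time to finish one task

    @property
    def joules(self) -> float:
        """Energy per task = average power * time."""
        return self.watts_above_idle * self.seconds


# Invented figures for two implementations of the same task.
runs = [
    Run("fast, power-hungry", watts_above_idle=30.0, seconds=10.0),
    Run("slow, frugal", watts_above_idle=20.0, seconds=25.0),
]

for run in runs:
    print(f"{run.name}: {run.watts_above_idle:.0f} W for "
          f"{run.seconds:.0f} s = {run.joules:.0f} J per task")

cheapest = min(runs, key=lambda r: r.joules)
print("lowest energy per task:", cheapest.name)
```

Judged on watts alone, the second implementation looks cheaper; judged on joules per task, it is the more expensive one.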

On embedded platforms with current monitoring across many rails, I've done measurements down to nanojoules per instruction. It's still difficult to account for energy usage by thread/process because there's a lot of load that's shared by many tasks, and it can increase/decrease outside of its timeslice. On a PC, I'm not sure you'll manage to get as fine grained as that :)

John Ripley
7

This is the topic of ongoing research. So don't expect any definite answers. Some publications you might find interesting are for example:

  • Chunling Hu, Daniel A. Jiménez and Ulrich Kremer, Efficient Program Power Behavior Characterization, Proceedings of the 2007 International Conference on High Performance Embedded Architectures & Compilers (HiPEAC-2007), pp. 183-197, January 2007.

  • Adam Lewis, Soumik Ghosh, and N.-F. Tzeng, Run-time Energy Consumption Estimation Based on Workload in Server Systems, USENIX 2008, Workshop on Power Aware Computing and Systems.

But you can easily find many more using Google Scholar and Citeseer.

Mackie Messer
4

On Linux, try the PowerTOP utility. However, rather than computing absolute values in Joules, it focuses on relative power usage between various system components.

Chris Dolan
  • Thanks. I believe PowerTOP is mostly about finding "battery killers", specifically processes that cause the CPU to wake up from idle, preventing a low-energy state. But that is not the only way to consume power, especially with a CPU-bound process – Eitan Dec 19 '10 at 23:09
  • @Eitan: what power does not come from the battery? :-) But yes, I agree that it won't answer your exact question. – Chris Dolan Dec 20 '10 at 02:00
  • "Wakeups-from-idle" are not the only way a CPU consumes power; it is merely a common cause of *unnecessary* power consumption. See "tickless kernels", since 2.6.something. – tc. Apr 12 '11 at 01:32
  • Do you know what happened to the [announced version 2.0 of PowerTop](http://lwn.net/Articles/421733/)? The link is no longer available. – JJD Oct 08 '11 at 23:31
2

Intel's Energy Efficient Software Guidelines has a host of useful info, including a link to their own Application Energy Toolkit, which includes:

2) Application Energy Graphing Tool

The Application Energy Graphing Tool is an interactive tool that can measure the battery power consumption of an application over time, and log and graph the resulting data.

Application developers can use the Application Energy Graphing Tool to help them design applications that conserve battery power on mobile computer systems.

Roddy
  • I am familiar with both. The toolkit is an API that lets you format the inputs from an external power meter, which I don't have. The guidelines are a little too general and only help in reducing power consumption, not in measuring it – Eitan Dec 19 '10 at 23:07
1

AMD uProf - reports absolute energy (in mJ) per OS process.

Intel Platform Power Estimation Tool (IPPET) - a prototype that reports absolute energy (in mWh) per process.

Intel SocWatch (part of Intel System Studio) - exposes a lot of low-level metrics, but does not report absolute energy (mWh/mJ) per process.

Oleg Neumyvakin