
Our tool generates performance logs in diagnostic mode; however, we track performance as code execution time (Stopwatch + milliseconds).

Obviously this isn't reliable at all: the testing system's CPU can be used by some random process, results will be totally different if the tool is configured to run 10 threads rather than 2, and so on.

My question is:

What's the correct way to find out the CPU time for a piece of code (not for the whole process)?

What I mean by CPU Time:

Basically, how many cycles the CPU spent. I assume this will always be the same for the same piece of code on the same computer and not affected by other processes. There might be some fundamental stuff I'm missing here; if so, please enlighten me in the comments or answers.

P.S. Using a profiler is not possible in our setup.

Another update:

Why I'm not going to use a profiler

Because we need to test the code in different environments with different data, where we don't have a profiler or an IDE or anything like that. Hence the code itself should handle it. An extreme option might be to use a profiler's DLL, but I don't think this task requires such a complex solution (assuming there is no free and easy-to-implement profiling library out there).

dr. evil
  • Can you define "CPU time"? Real time (in milliseconds/ticks/etc.) will vary between machines, and even the number of processor cycles it takes will vary by OS, CPU, etc... – Dan Puzey Jan 18 '11 at 13:10
  • I updated the question: "How many cycles the CPU spent; it's OK as long as it's the same for the same computer in different executions and not affected by other processes" – dr. evil Jan 18 '11 at 13:22
  • You can't without a profiler, which is exactly the tool you need to do it. – Jonathan Grynspan Jan 18 '11 at 13:24
  • @Jonathan - I do have .NET, and assuming profilers are a bunch of tools written in .NET, I don't see why not. Why do you think it's not possible? – dr. evil Jan 18 '11 at 13:26
  • Maybe something here will help you: http://www.codeproject.com/KB/dotnet/dotnetprofiler.aspx – Greg Jan 18 '11 at 13:44
  • @dr. evil - Can you explain *why* you can't use a profiler in your setup? Your question is asking how to make a profiler without calling it a profiler... – Justin Jan 18 '11 at 14:17
  • @dr. evil: You don't want to use a profiler, but can you run it under an IDE? That's what I do, and use random-pausing. OTOH, maybe you could link in something that reads the stack, and call it on an interrupt basis. Also, there are tools like **lsstack** and **jstack**. There ought to be something like that for .NET. – Mike Dunlavey Jan 18 '11 at 14:53
  • @Mike I updated the question, no IDE or profiler. – dr. evil Jan 18 '11 at 16:47

3 Answers


I assume this will always be the same for the same piece of code on the same computer and not affected by other processes

That's just not the way computers work. Code very much is affected by other processes running on the machine. A typical Windows machine has about 1000 active threads; you can see the number in the Performance tab of Taskmgr.exe. The vast majority of them are asleep, waiting for some kind of event signaled by Windows. Nevertheless, if the machine is running code, including yours, that is ready to run and take CPU time, then Windows will give every such thread a slice of the pie.

Which makes the amount of time taken by your code a pretty arbitrary measurement. The only thing you can estimate is the minimum amount of time taken, which you do by running the test dozens of times; odds are decent that you'll get a sample that wasn't affected by other processes. That minimum will, however, never happen in Real Life, so you'd be wise to take the median value as a realistic perf measurement.

The only truly useful measurement is measuring incremental improvements to your algorithm. Change the code, then see how the median time changes because of that.
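
A minimal sketch of that approach, assuming the code under test can be wrapped in a delegate (PerfSampler and runWorkload are illustrative names, not part of the asker's tool):

    using System;
    using System.Diagnostics;

    static class PerfSampler
    {
        // Run the workload many times; report the minimum (the best-case
        // estimate) and the median (the more realistic figure).
        public static void Measure(Action runWorkload, int iterations = 50)
        {
            var times = new double[iterations];
            for (int i = 0; i < iterations; i++)
            {
                var sw = Stopwatch.StartNew();
                runWorkload();
                sw.Stop();
                times[i] = sw.Elapsed.TotalMilliseconds;
            }
            Array.Sort(times);
            Console.WriteLine("min:    {0:F3} ms", times[0]);
            Console.WriteLine("median: {0:F3} ms", times[iterations / 2]);
        }
    }

Call it as PerfSampler.Measure(() => CodeUnderTest(), 100); the gap between the minimum and the median gives a feel for how much interference the machine is adding.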

Hans Passant
  • So how are profilers doing it then? Am I wrong to expect that there is a way to calculate how much CPU is spent on X? Windows does this at the thread and process level, so why can't I do it at the code level? I know the code is affected by other processes; I assume everyone who uses a computer knows that. And "incremental improvements" are not measuring. – dr. evil Jan 18 '11 at 16:49
  • Did you downvote the answers here? Not quite sure why you'd think that inspires people to try to guess what you really want. – Hans Passant Jan 18 '11 at 17:09
  • Yes I did, because your answer talks about the basics of computers and doesn't answer the question (it's actually not even helping). – dr. evil Jan 18 '11 at 21:46
  • I thought the idea of SO is downvoting and upvoting based on the quality of the answer. Hence I voted as I thought; it looks like some other people think differently, which means SO is working. – dr. evil Jan 18 '11 at 21:46

Basically, how many cycles the CPU spent. I assume this will always be the same for the same piece of code on the same computer and not affected by other processes. There might be some fundamental stuff I'm missing here; if so, please enlighten me in the comments or answers.

CPU time used by a function is a really squishy concept.

  • Does it include I/O performed anywhere beneath it in the call tree?
  • Is it only "self time" or inclusive of callees? (In serious code, self time is usually about zero.)
  • Is it averaged over all invocations? Define "all".
  • Who consumes this information, for what purpose? To find so-called "bottlenecks"? Or just some kind of regression-tracking purpose?

If the purpose is not just measurement, but to find code worth optimizing, I think a more useful concept is Percent Of Time On Stack. An easy way to collect that information is to read the function call stack at random wall-clock times (during the interval you care about); a rough sketch follows the list below. This approach has the following properties:

  • It tells inclusive time percent.
  • It gives line-level (not just function-level) percent, so it pinpoints costly lines of code, whether or not they are function calls.
  • It includes I/O as well as CPU time. (In serious software it is not wise to exclude I/O time, because you really don't know what's spending time several layers below in the call tree.)
  • It is relatively insensitive to competition for the CPU by other processes, or to CPU speed.
  • It does not require a high sample rate or a large number of samples to find costly code. (This is a common misconception.) Each additional digit of measurement precision requires roughly 100 times more samples, but that does not locate the costly code any more precisely.
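
Here is the sketch promised above: a rough, illustration-only version of in-process stack sampling using the .NET Framework APIs of the era. StackSampler is a hypothetical name, and Thread.Suspend and the StackTrace(Thread, bool) constructor are deprecated and unsafe in production code, so treat this as a demonstration of the principle, not a drop-in tool.

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Threading;

    static class StackSampler
    {
        static readonly Dictionary<string, int> hits = new Dictionary<string, int>();
        static int samples;

        // Capture the target thread's stack once and tally every frame on it,
        // so counts are inclusive: callers are charged for their callees.
        public static void Sample(Thread target)
        {
    #pragma warning disable 618 // Suspend/Resume are obsolete; demo only
            target.Suspend();
            var trace = new StackTrace(target, false);
            target.Resume();
    #pragma warning restore 618
            samples++;
            var frames = trace.GetFrames();
            if (frames == null) return;
            foreach (var frame in frames)
            {
                var method = frame.GetMethod();
                if (method == null) continue;
                string name = method.DeclaringType + "." + method.Name;
                int count;
                hits.TryGetValue(name, out count);
                hits[name] = count + 1;
            }
        }

        // Percent of time on stack: the fraction of samples in which a
        // method appeared anywhere on the call stack.
        public static void Report()
        {
            foreach (var kv in hits)
                Console.WriteLine("{0,6:P0}  {1}", (double)kv.Value / samples, kv.Key);
        }
    }

Call Sample from a watcher thread at random wall-clock intervals during the interval you care about, then call Report at the end.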

A profiler that works on this principle is Zoom.

On the other hand, if the goal is simply to measure, so the user can see if changes have helped or hurt performance, then the CPU environment needs to be controlled, and simple overall time measurement is what I'd recommend.

Mike Dunlavey
  • Mike, thanks for the answer, but you are not answering my question. 1) I didn't say I want to find places to optimise or costly code, because that's not what I'm doing. 2) I/O is irrelevant in my case, because we don't track code with I/O access. 3) I don't want to compare a CPU to another CPU; as stated in the question, "same system/CPU". 4) I'm not sure why Zoom is linked there, since it's Linux-based and this is a .NET question, and I noted in the question that I won't be using a profiler. – dr. evil Jan 18 '11 at 16:25
  • @dr. evil: Well I guess I was trying to figure out just how specific you needed to be, because the difference between time and percent is pretty fundamental. (I only included Zoom as an example of the principle, regardless of platform.) Sorry if I missed your point. (BTW, I personally seldom use downvotes, because it's a bit of a rap on the knuckles for someone who's trying to be helpful.) – Mike Dunlavey Jan 18 '11 at 16:48
  • Mike, I downvoted because I really don't like where SO is going with these answering-something-else answers. I know everyone is trying to help, otherwise they wouldn't answer. But answering something other than the question is quite popular on SO, and a terrible trend at the same time. So I downvote all answers of this kind, whether it's my question or someone else's. – dr. evil Jan 18 '11 at 18:21
  • @dr. evil: I understand. I wish there were a flag on a question indicating this preference, because many questions simply beg to be answered that way, and are appreciated. (Witness all the micro-opt questions.) When you said "fundamental stuff I'm missing" I took that as being open. No matter. I hope you get a good answer. – Mike Dunlavey Jan 18 '11 at 19:45
  • Yep, such a preference would be really nice, as there are lots of questions where the poster is actually on the wrong path and asking for the wrong thing, so it makes sense to tell them something else. Also, I noticed that I should have written the question better. To be honest, I tried to revert my vote but apparently it's already locked. Regardless, thanks for the help. Sometimes I get offended by answers like this :) I guess I forget that there is no way for people who answer to know what I know and what I don't. – dr. evil Jan 18 '11 at 21:44

The hands-down best way to measure CPU time is to use the instruction "rdtsc", or "Read Time Stamp Counter". This counter (part of the CPU itself) increments at the CPU's internal clock speed, so the difference between two readouts is the number of elapsed clock cycles. Reading this counter can be integrated into your code if it (the code) is not too high-level (not quite sure though). You can measure time on disk, time on network, time in CPU, etc. - the possibilities are endless. If you divide the number of elapsed clock cycles by your CPU speed in megahertz, you will get the elapsed number of microseconds, for example. That's pretty good precision, and better is possible. Consider building a GUI which interfaces to your CPU-usage statistics.
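
rdtsc can't be emitted directly from C#, so as a managed-side sketch of the same idea (my addition, not part of this answer's suggestion): on Windows Vista and later you can P/Invoke QueryThreadCycleTime, which reports CPU cycles charged to a single thread and therefore ignores time consumed by other processes:

    using System;
    using System.ComponentModel;
    using System.Runtime.InteropServices;

    static class CycleTimer
    {
        [DllImport("kernel32.dll", SetLastError = true)]
        static extern bool QueryThreadCycleTime(IntPtr threadHandle, out ulong cycleTime);

        [DllImport("kernel32.dll")]
        static extern IntPtr GetCurrentThread(); // pseudo-handle for the calling thread

        // Cycles the calling thread has consumed so far (Windows Vista+).
        public static ulong ThreadCycles()
        {
            ulong cycles;
            if (!QueryThreadCycleTime(GetCurrentThread(), out cycles))
                throw new Win32Exception();
            return cycles;
        }
    }

Usage: read ThreadCycles() before and after the code under test and subtract. Note that even per-thread cycle counts vary between runs (cache state, variable clock speed), as the comments below point out.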

Search for "rdtsc", "__rdtsc", or "_rdtsc" in the help files for your environment.

Olof Forshell
  • From what I know, this one has even more syncing issues on multi-core CPUs than `QueryPerformanceCounter`. And then there is fun stuff like the CPU speed not being constant over time... – CodesInChaos Feb 13 '11 at 16:23
  • RDTSC is not new to Stack Overflow and neither are the issues. So to recap: syncing is an issue on older multi-core AMD processors, not on later ones. To achieve a constant CPU speed you need to turn off power saving such as "Cool'n'Quiet" on AMD, and whatever the equivalent is on Intel, as they vary the CPU speed to conserve power. – Olof Forshell Feb 17 '11 at 08:47