
I am profiling Direct3D9 API calls. I have read a lot of documentation here that describes the process. I have a question though about calculating the elapsed clock cycles. Here is my current method:

// measurement vars
LARGE_INTEGER start, stop, freq;

//
// flush command buffer here
//

// 
// issue begin query here
//

// start timer
QueryPerformanceCounter(&start);

//
//draw
//

//
// issue end query here and wait on results
//

// stop timer
QueryPerformanceCounter(&stop);

// calc elapsed ticks
stop.QuadPart -= start.QuadPart;

// get frequency
QueryPerformanceFrequency(&freq);

// ticks for easier handling (kept 64-bit; ULONG would truncate QuadPart)
LONGLONG ticks = stop.QuadPart;

// calc elapsed clock cycles (2.8E9 = assumed CPU speed in Hz)
double cycles = (2.8E9 * ticks) / (double)freq.QuadPart;

My question concerns the value 2.8E9 which is supposed to represent the speed of the processor. Is this the correct way of calculating clock cycles? I am profiling single API calls and my results differ from those found on the above link. If I set the processor speed to 1E9 then the numbers are within range...I just wanted to check my method...

P. Avery

1 Answer


Quoting your given source, the formula you'd use is:

cycles = CPU speed * number of ticks / QPF

In fact you're doing exactly this in the last line of your example code, where the "magic number" 2.8E9 is the speed of your CPU in Hz. So if this is your question, the answer is Yes, you're using the right formula.
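As a minimal sketch of that formula, the conversion can be written as a small helper. The name `ticks_to_cycles` and its parameters are hypothetical, and `cpu_hz` is an assumed nominal clock rate that you supply yourself, so the result is only an estimate and is only as accurate as that number. The tick count is kept as a 64-bit value so long intervals are not truncated.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: estimate elapsed CPU cycles from a tick delta.
//   ticks  : stop - start, as returned by QueryPerformanceCounter
//   qpf_hz : ticks per second, as returned by QueryPerformanceFrequency
//   cpu_hz : ASSUMED nominal CPU clock rate in Hz (e.g. 2.8e9)
double ticks_to_cycles(std::int64_t ticks, std::int64_t qpf_hz, double cpu_hz) {
    assert(qpf_hz > 0);  // a zero frequency would divide by zero
    // ticks / qpf_hz is the elapsed time in seconds
    const double seconds = static_cast<double>(ticks) / static_cast<double>(qpf_hz);
    // seconds * cycles-per-second gives the estimated cycle count
    return seconds * cpu_hz;
}
```

For example, 1000 ticks at a counter frequency of 10 MHz is 100 microseconds, which at an assumed 2.8 GHz comes to roughly 280,000 cycles.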

If you want to replace the magic number with the actual CPU speed, be prepared for a good amount of work. For more information, see the answer "Finding out the CPU clock frequency (per core, per processor)".


However, to get reliable profiling results, the best method is to use an external profiling tool, because changing your code to measure execution times may itself affect what you want to measure (e.g. by changing how the code is optimized).

Check the wiki for some reasonable profiling tools: http://en.wikipedia.org/wiki/List_of_performance_analysis_tools#C_and_C.2B.2B

Apart from your interest in CPU cycles, a better "intrusive" approach would be to use the platform-independent mechanism from the C++11 standard library for measuring elapsed time: http://en.cppreference.com/w/cpp/chrono/high_resolution_clock/now
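A rough sketch of that approach is below. The helper name `measure_ns` is hypothetical, and the sketch uses `std::chrono::steady_clock` rather than the `high_resolution_clock` linked above, since `steady_clock` is guaranteed to be monotonic. It measures elapsed wall-clock time, not clock cycles.

```cpp
#include <chrono>

// Hypothetical helper: run a callable once and return the elapsed
// wall-clock time in nanoseconds, measured with a monotonic clock.
template <typename F>
long long measure_ns(F&& work) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();  // timestamp before the work
    work();                           // the code being measured
    const auto stop = clock::now();   // timestamp after the work
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

The draw call would be passed in as a lambda. As in the question, the command buffer flush and the begin/end queries would presumably still be needed around it so that the measured interval covers the work actually being profiled.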

Community
  • I'm interested in understanding how clock cycles are calculated – P. Avery Mar 27 '15 at 21:24
  • Clock cycles cannot be calculated, they're counted, as the term _QueryPerformanceCounter_ states. There is no simple way to evaluate the amount of work the individual hardware components do for your processing. Modern computers/OSs are built to hide exactly this. Also, the linked article says that it just uses this method because "it is available in Windows and it is easy to use" to "profile **execution times**". My answer simply proposes better ways to achieve this. Sorry for possibly misinterpreting your question, but I don't see any additional info to give here. – Stefan Gränitz Mar 30 '15 at 10:11
  • I'm interested in understanding the formula used to calculate the number of cycles that have elapsed when given 2 timestamps... The formula in the question above says to multiply ticks by processor speed in Hz and divide by the high-res timer's frequency. Is that correct? – P. Avery Mar 31 '15 at 23:52