24

I want to measure the time taken by a section of code inside a GPU kernel. How can I measure it in NVIDIA CUDA? e.g.

__global__ void kernelSample()
{
  // some code here
  // get start time
  // some code here
  // get stop time
  // some code here
}
talonmies
Amin
  • 1
    Amin, (a few years ago) you accepted an answer which is useful information but doesn't answer your actual question. Would you mind either un-accepting or editing the question to reflect the answer maybe? – einpoklum Feb 11 '17 at 00:32

2 Answers

49

You can do something like this:

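__global__ void kernelSample(int *runtime)
{
  // Global thread index, used to give each thread its own output slot
  int tidx = threadIdx.x + blockIdx.x * blockDim.x;

  // ....
  clock_t start_time = clock();
  // some code here
  clock_t stop_time = clock();
  // ....

  // Elapsed clock cycles for this thread
  runtime[tidx] = (int)(stop_time - start_time);
}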

This gives the number of clock cycles between the two calls. Be a little careful, though: the timer will overflow after a couple of seconds, so you should be sure that the code between successive calls runs only briefly. You should also be aware that the compiler and assembler do perform instruction reordering, so you might want to check that the clock calls don't wind up next to each other in the SASS output (use cuobjdump to check).
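If the overflow is a concern, the same pattern can be written with `clock64()`, the 64-bit counterpart of `clock()`. The following is only a minimal sketch: `timedKernel`, `runTimedKernel`, the dummy `sinf` loop and the block size of 64 are all made up for illustration, and the reordering caveat still applies, so the SASS is still worth inspecting.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each thread times a placeholder loop with clock64().
__global__ void timedKernel(long long *cycles, float *out, int n)
{
    int tidx = blockIdx.x * blockDim.x + threadIdx.x;
    if (tidx >= n) return;

    long long start = clock64();

    // Region being timed (placeholder work, an assumption for this sketch).
    float acc = 0.0f;
    for (int i = 0; i < 1000; ++i)
        acc += sinf(i * 0.001f + tidx);

    long long stop = clock64();

    out[tidx] = acc;               // store the result so the loop is not optimized away
    cycles[tidx] = stop - start;   // elapsed clock cycles seen by this thread
}

// Hypothetical host helper: runs the kernel on n threads and copies the
// per-thread cycle counts into hostCycles (which must hold n elements).
cudaError_t runTimedKernel(long long *hostCycles, int n)
{
    long long *dCycles = nullptr;
    float *dOut = nullptr;

    cudaError_t err = cudaMalloc(&dCycles, n * sizeof(long long));
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&dOut, n * sizeof(float));
    if (err != cudaSuccess) { cudaFree(dCycles); return err; }

    const int block = 64;
    timedKernel<<<(n + block - 1) / block, block>>>(dCycles, dOut, n);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    err = cudaMemcpy(hostCycles, dCycles, n * sizeof(long long),
                     cudaMemcpyDeviceToHost);

    cudaFree(dCycles);
    cudaFree(dOut);
    return err;
}
```

Calling `runTimedKernel(buffer, 64)` from `main` fills `buffer` with one cycle count per thread. The values are per-thread and measured on whichever multiprocessor ran that thread, so counts from different threads are not directly comparable as wall-clock time.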

talonmies
  • I tested it and it does not work properly. clock() is a host function, not a device function. – Amin May 15 '12 at 14:38
  • 7
    `clock()` **is** a device function and it really works. See section B10 of the CUDA programming guide for a description of `clock()` and `clock64()`. If it "not work properly" you are doing something wrong or have misunderstood what the output means. – talonmies May 15 '12 at 14:41
  • The output is in clock cycles. How do I convert it to seconds? – Amin May 15 '12 at 16:03
  • 2
    I *did* say it was a clock cycle in my answer. Divide it by the shader clock frequency in kilohertz to get an answer in milliseconds (note that unless you have a new Kepler GPU your GPU has two clock speeds, you need the shader clock, not the main clock frequency) – talonmies May 15 '12 at 17:10
  • 3
    @Amin: if this solved your problem perhaps you would be so kind as to accept it. – talonmies May 16 '12 at 03:18
  • @talonmies How to get the shader clock frequency in kilohertz? Using CUDA device properties? – ericmoraess Apr 30 '14 at 11:34
  • There's an example of converting clock64() to seconds here: https://stackoverflow.com/a/43010589/2189500 – David Wohlferd Jun 02 '17 at 09:52
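As a rough illustration of the conversion discussed in the comments above (cycles divided by the clock frequency in kilohertz gives milliseconds), here is a minimal sketch. The helper names `cyclesToMs` and `deviceClockKHz` are hypothetical. The clock is read with `cudaDeviceGetAttribute` and `cudaDevAttrClockRate`, which reports the peak clock in kHz; the GPU may actually run slower or faster than that value, so treat the result as an estimate.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: cycles / kHz = milliseconds.
inline double cyclesToMs(long long cycles, int clockKHz)
{
    return static_cast<double>(cycles) / clockKHz;
}

// Hypothetical helper: peak clock rate of `device` in kHz, or -1 on failure.
inline int deviceClockKHz(int device)
{
    int khz = 0;
    if (cudaDeviceGetAttribute(&khz, cudaDevAttrClockRate, device) != cudaSuccess)
        return -1;
    return khz;
}
```

For example, a count of 1,500,000 cycles on a device whose reported clock is 1,500,000 kHz corresponds to about 1 ms.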
0

Try this; it measures the time between two events in milliseconds.

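  cudaEvent_t start, stop;
  float elapsedTime;

  // Create both events before timing starts
  cudaEventCreate(&start);
  cudaEventCreate(&stop);

  cudaEventRecord(start, 0);

  // Do kernel activity here

  cudaEventRecord(stop, 0);
  cudaEventSynchronize(stop);  // wait until the stop event has been reached

  cudaEventElapsedTime(&elapsedTime, start, stop);
  printf("Elapsed time : %f ms\n", elapsedTime);

  // Release the events once they are no longer needed
  cudaEventDestroy(start);
  cudaEventDestroy(stop);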
edocetirwi