
I am working on some C code that implements matrix-matrix multiplication with a triply nested for loop, parallelized using OpenMP. I am trying to accurately measure the time from when the loop starts to when it ends. I have been using gettimeofday() so far, but it did not always seem to record the loop's execution time accurately; it reported the loop as taking longer than it actually did.

Here is the original code:

struct timeval start, end;
double elapsed;

gettimeofday(&start, NULL);
#pragma omp parallel for num_threads(threads) private(i, j, k)
for(...)
{
 ...
 for(...)
 {
  ...
  for(...)
  {
   ...
  }
 }
}

gettimeofday(&end, NULL);
elapsed = (end.tv_sec+1E-6*end.tv_usec) - (start.tv_sec+1E-6*start.tv_usec);

And here is the same code using clock_gettime():

struct timespec start1, finish1;
double elapsed1;

clock_gettime(CLOCK_MONOTONIC, &start1);

#pragma omp parallel for num_threads(threads) private(i, j, k)
for(...)
{
 ...
 for(...)
 {
  ...
  for(...)
  {
   ...
  }
 }
}

clock_gettime(CLOCK_MONOTONIC, &finish1);
elapsed1 = (finish1.tv_sec - start1.tv_sec);
elapsed1 += (finish1.tv_nsec - start1.tv_nsec)/1000000000.0;

The loop takes 3-4 seconds to complete. I tried using both time measurements at the same time, and the result from gettimeofday() was almost always longer than the result from clock_gettime(), sometimes by more than a second:

struct timespec start1, finish1;
double elapsed1;

struct timeval start, end;
double elapsed;

clock_gettime(CLOCK_MONOTONIC, &start1);
gettimeofday(&start, NULL);

#pragma omp parallel for num_threads(threads) private(i, j, k)
for(...)
{
 ...
 for(...)
 {
  ...
  for(...)
  {
   ...
  }
 }
}

gettimeofday(&end, NULL);
clock_gettime(CLOCK_MONOTONIC, &finish1);

elapsed = (end.tv_sec+1E-6*end.tv_usec) - (start.tv_sec+1E-6*start.tv_usec);

elapsed1 = (finish1.tv_sec - start1.tv_sec);
elapsed1 += (finish1.tv_nsec - start1.tv_nsec)/1000000000.0;

Is there a reason for this? What could cause such a difference between these two functions? I'm trying to gain a better understanding of how they behave.

Shaun Holtzman

2 Answers


`elapsed = (end.tv_sec+1E-6*end.tv_usec) - (start.tv_sec+1E-6*start.tv_usec)` is prone to precision loss when subtracting two like values that are both large.

  1. Use `elapsed = (end.tv_sec - start.tv_sec) - (start.tv_usec - end.tv_usec)/1E6;`, which subtracts the like-sized fields first and only then combines the small differences. OP's 2nd and 3rd snippets already do this; the first does not. (A combined sketch follows this list.)

  2. Be fair: read the two clocks in the same order at both ends of the measurement to remove bias.

    clock_gettime(CLOCK_MONOTONIC, &start1);
    gettimeofday(&start, NULL);
    
    ...
    
    // original (biased) order:
    // gettimeofday(&end, NULL);
    // clock_gettime(CLOCK_MONOTONIC, &finish1);

    // consistent order, matching the start:
    clock_gettime(CLOCK_MONOTONIC, &finish1);
    gettimeofday(&end, NULL);
    
  3. Minor: a third, very subtle improvement that helps reduce inconsistency a little (about 0.5 µs) is to start the test on a tick change. But note @Dietrich Epp's comment below for an alternative improvement.

    struct timeval t, start;

    gettimeofday(&t, NULL);
    do {
      gettimeofday(&start, NULL);
      /* struct timeval cannot be compared with ==; compare the members */
    } while (start.tv_sec == t.tv_sec && start.tv_usec == t.tv_usec);
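
Putting items 1 and 2 together, a minimal self-contained sketch might look like this (the code under test is elided; error checking omitted):

    #define _POSIX_C_SOURCE 200112L  /* exposes clock_gettime() with some libcs */
    #include <stdio.h>
    #include <sys/time.h>
    #include <time.h>

    int main(void)
    {
        struct timespec start1, finish1;
        struct timeval start, end;

        /* Item 2: read the clocks in the same order at both ends. */
        clock_gettime(CLOCK_MONOTONIC, &start1);
        gettimeofday(&start, NULL);

        /* ... code under test ... */

        clock_gettime(CLOCK_MONOTONIC, &finish1);
        gettimeofday(&end, NULL);

        /* Item 1: subtract the like-sized fields first, then combine. */
        double elapsed  = (end.tv_sec - start.tv_sec)
                        + (end.tv_usec - start.tv_usec)/1E6;
        double elapsed1 = (finish1.tv_sec - start1.tv_sec)
                        + (finish1.tv_nsec - start1.tv_nsec)/1E9;

        printf("gettimeofday : %f s\n", elapsed);
        printf("clock_gettime: %f s\n", elapsed1);
        return 0;
    }

On older glibc versions, clock_gettime() may also require linking with -lrt.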
    

Alternatively, use wide integer math to avoid precision issues:

long long elapsed_ns = (1LL*finish1.tv_sec - start1.tv_sec)*1000000000LL + 
    finish1.tv_nsec - start1.tv_nsec;
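
Wrapped as a small helper (the function name ts_diff_ns is mine, not from OP's code), that same integer math might look like:

    #include <time.h>

    /* Difference b - a in nanoseconds, using only integer
       arithmetic so no floating-point precision is lost. */
    static long long ts_diff_ns(struct timespec a, struct timespec b)
    {
        return (1LL*b.tv_sec - a.tv_sec)*1000000000LL
             + (b.tv_nsec - a.tv_nsec);
    }

Convert to seconds only once, at the point of display, e.g. `ts_diff_ns(start1, finish1)/1E9`.
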
chux - Reinstate Monica
  • It seems like that trick with the tick change is going to take the uniform ±0.5 µsec error, which can be reduced with multiple runs, and replace it with a quantization error, which cannot be reduced with multiple runs. – Dietrich Epp Sep 26 '16 at 19:50
  • On the other hand, 0.5 µsec is surely well below the noise threshold for this measurement. – John Bollinger Sep 26 '16 at 20:17
  • @Dietrich Epp Good point. So for a few runs it retains some value, but, as you identified, other valuable techniques can be used with repeated runs. – chux - Reinstate Monica Sep 26 '16 at 20:50
  • How does this address the question of >1 s difference? – Zulan Sep 27 '16 at 06:42
  • @Zulan If OP is using a `double` of less precision than commonly employed by many platforms, the order of calculation noted in #1 is significant. IMO, something is amiss with that calculation and it deserves review. – chux - Reinstate Monica Sep 27 '16 at 12:36
  • @chux. The C standard specifically requires the IEC 60559 double format with 53 bits of precision. For this effect to be relevant in this particular context, `double` would have to have less than 42 bits of precision. That said, YES you are right for general computation, but for the next 200 millennia or so `(end.tv_sec+1E-6*end.tv_usec)` is perfectly fine. – Zulan Sep 27 '16 at 13:48
  • @Zulan C references but does not require IEC 60559 double format, although that format is commonly used. `end.tv_sec+1E-6*end.tv_usec` presently needs about 31 (assuming `time_t` is in seconds since 1/1/1970) + 20 bits. 2^54 (the limit of `double` 60559 precision) will begin to be an issue in about 8*140 years. Yet OP did not detail how the seconds value was compared. AFAIK, unposted code was doing the compare in ns, rendering the time calculation inexact. It appears OP did have an issue with the time calculation, but it was a [mis-coded calculation](http://stackoverflow.com/a/39723923/2410359). – chux - Reinstate Monica Sep 27 '16 at 14:30
  • Apologies, I was mistakenly using milliseconds. – Zulan Sep 27 '16 at 17:06

Thanks for the help, everyone. It turns out the problem didn't have to do with the time functions; it was caused by not properly parenthesizing the calculation of the difference between the end time and the start time. I know, kind of an anticlimactic and dumb solution, but it fixed my problem. When I was using gettimeofday() I was calculating the result like this:

elapsed = end.tv_sec+1E-6*end.tv_usec - start.tv_sec+1E-6*start.tv_usec;

When I should have been doing this:

elapsed = (end.tv_sec+1E-6*end.tv_usec) - (start.tv_sec+1E-6*start.tv_usec);

The code I am using was written by someone else using the gettimeofday() function, and they had these #defines:

#define TIME_GET(time) (time).tv_sec+1E-6*(time).tv_usec
#define TIME_GET_RESULT(start,end) TIME_GET(end)-TIME_GET(start)

Changing the first #define by adding parentheses fixed my issue:

#define TIME_GET(time) ((time).tv_sec+1E-6*(time).tv_usec)

When I started to use clock_gettime() I was putting the parentheses in correctly and just didn't notice that the author's #define for gettimeofday() was not parenthesized.
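
Without the outer parentheses, TIME_GET(end)-TIME_GET(start) expands so that start.tv_usec gets added instead of subtracted, which inflates the result by 2E-6*start.tv_usec, anywhere from 0 to almost 2 seconds. Here is a small standalone demonstration of the effect (the timestamp values are invented for illustration):

    #include <stdio.h>
    #include <sys/time.h>

    #define TIME_GET_BAD(time)  (time).tv_sec+1E-6*(time).tv_usec    /* original */
    #define TIME_GET_GOOD(time) ((time).tv_sec+1E-6*(time).tv_usec)  /* fixed */

    int main(void)
    {
        /* A 3.2 second interval whose start falls late within a second. */
        struct timeval start = { .tv_sec = 1000, .tv_usec = 900000 };
        struct timeval end   = { .tv_sec = 1004, .tv_usec = 100000 };

        /* The buggy difference expands to:
           end.tv_sec + 1E-6*end.tv_usec - start.tv_sec + 1E-6*start.tv_usec */
        printf("buggy: %f s\n", TIME_GET_BAD(end) - TIME_GET_BAD(start));   /* 5.000000 */
        printf("fixed: %f s\n", TIME_GET_GOOD(end) - TIME_GET_GOOD(start)); /* 3.200000 */
        return 0;
    }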

Shaun Holtzman
  • That's why creating a [mcve] is so important when asking questions. Also, [don't write functions as macros](http://stackoverflow.com/a/9104712/620382). – Zulan Sep 27 '16 at 13:50