12

I've got something like this:

clock_t start, end;
start=clock();

something_else();

end=clock();
printf("\nClock cycles are: %d - %d\n",start,end);

and I always get as an output "Clock cycles are: 0 - 0"

Any idea why this happens?

(Just to give a little detail: the something_else() function performs a left-to-right exponentiation using Montgomery representation. I don't know for certain that something_else() actually takes a non-negligible amount of time.)

This is on Linux. The result of uname -a is:

Linux snowy.*****.ac.uk 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 x86_64 x86_64 x86_64 GNU/Linux

eddy ed

7 Answers

11

The clock function does not measure CPU clock cycles.

C says clock "returns the implementation’s best approximation to the processor time used by the program since the beginning of an implementation-defined era related only to the program invocation."

If your program uses less processor time between two successive clock calls than one unit of the clock function, both calls can return the same value, and at the start of a program that value is typically 0.

POSIX requires CLOCKS_PER_SEC to be 1000000, so one unit is 1 microsecond.

http://pubs.opengroup.org/onlinepubs/009604499/functions/clock.html

To measure clock cycles on x86/x64 you can use inline assembly to retrieve the value of the CPU's Time Stamp Counter register with the rdtsc instruction.

ouah
  • 2
    What you say about the time stamp counter sounds really interesting. Can you point out a good source where to read about? Thanks a lot – eddy ed Mar 26 '12 at 18:23
  • 1
    Also, it is useful to note that while clock() is a function in the C90 standard, there are platforms where clock() is technically unsupported and is implemented to always return -1. –  Mar 09 '16 at 00:41
11

I guess the reason is that your something_else() takes so little time that it falls below the resolution of clock(). I tried calling clock() twice in a row and both start and end were zero, but the result is reasonable when I do some time-consuming work in between.

Here is my test code snippet:

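#include <stdio.h>
#include <time.h>

int main(void) {
    clock_t start, end;
    start = clock();
    unsigned long long c = 0;  /* initialized, and wide enough not to overflow */
    for (int i = 0; i < 100; i++) {
        for (int j = 0; j < (1<<30); j++) {
            c++;
        }
    }
    end = clock();
    /* clock_t is not guaranteed to be int, so cast before printing */
    printf("start = %ld, end = %ld\n", (long)start, (long)end);
    return 0;
}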

And the result on my computer is:

start = 0, end = 27700000

Also, two tips:

  1. When testing, do not use any compiler optimization. You may think your something_else() is time-consuming, but the compiler may simply remove those operations (especially loops) if it considers them to have no effect.
  2. Use sizeof(clock_t) on your platform to see the size of clock_t.
Divine1
Jinghao Shi
  • I tried your code on my system. It takes several seconds to run, but still I get two Zeroes. I also tried to print it as printf("\nTime elapsed: %.2f\n",1.0*(end-start)/CLOCKS_PER_SEC); but I still get zero. There's definitely something wrong with the clock function. Is there any flag I need to set or anything that any of you is aware of? – eddy ed Mar 26 '12 at 16:49
  • @eddyed: `start` should be zero since it records the elapsed time since program start, but `end` should NOT be zero. To my knowledge, no special flags or any other tricks are needed for using `clock()`. BTW, what's the result of `sizeof(clock_t)` on your platform? – Jinghao Shi Mar 27 '12 at 06:39
  • 2
    -1 for *don't optimize* (no, I didn't actually downvote). If you want to prevent the function call from being optimized out, simply use the function's return value for something, for example print it at the end of the program. – Rookie Aug 19 '13 at 12:35
  • 1
    One doesn't generally benchmark unoptimized code. ;) But seriously, the given example code isn't really benchmarkable for two reasons. Any results or side-effects of the computation are not used and the compiler can just remove them. Even if c is used later, the compiler can precalculate the final value without generating any loops. If one wanted to test if clock() works, just call sleep(1). –  Mar 09 '16 at 00:36
6

Well, do you want the time something_else() takes? Try this:

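#include <sys/time.h>
#include <stdio.h>
#include <unistd.h>

void something_else(void);  /* the code being timed, defined elsewhere */

int main(void) {
    struct timeval start, end;
    long mtime, secs, usecs;

    gettimeofday(&start, NULL);  /* wall-clock time before the call */
    something_else();
    gettimeofday(&end, NULL);    /* wall-clock time after the call */
    secs  = end.tv_sec  - start.tv_sec;
    usecs = end.tv_usec - start.tv_usec;  /* may be negative */
    /* convert to milliseconds; + 0.5 rounds to the nearest integer */
    mtime = ((secs) * 1000 + usecs/1000.0) + 0.5;
    printf("Elapsed time: %ld millisecs\n", mtime);
    return 0;
}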
Michał Walenciak
Marcos
  • Unluckily I need clock cycles, not the time. – eddy ed Mar 26 '12 at 11:30
  • 2
    I don't think you'll be able to portably measure clock cycles. I don't think there's anything about that in the C or POSIX standard. – Guido Mar 26 '12 at 11:34
  • 1
    @eddy ed, if you want to measure clock cycles you need to use a logic analyzer, frequency counter, or other hardware measurement tool attached to your hardware. – mah Mar 26 '12 at 11:36
  • @mah: Or a hardware counter, such as the TSC on modern x86s. – Oliver Charlesworth Mar 26 '12 at 14:37
  • @WindChaser That's the correct way to round the floating point computed value to the closest integer. – jlliagre Feb 29 '16 at 07:34
  • + 0.5 is for rounding to nearest integer. Look: (int)2.2 => 2 - ok, (int)2.5 => 2 - bad, (int)2.7 => 2 - bad, but: (int)(2.2 + 0.5) => 2 - good, (int)(2.5 + 0.5) => 3 - good, because values >= 2.5 need to be rounded up, (int)(2.7 + 0.5) => 3 - good too. Clear? ;) – VillageTech Dec 03 '19 at 20:45
2

Check the value of CLOCKS_PER_SEC in time.h/clock.h. On my system, for example (Dev-C++ on Windows 7), it's a mere 1000. So as far as my program is concerned, there are 1000 ticks per second, that is, one tick per millisecond. Your something_else probably executes in a matter of microseconds, which is why clock() returns zero both before and after the function call.

On my system, when I replace your something_else with a time consuming routine like this

for (unsigned i=0xFFFFFFFF;i--;);

start=clock();

for (unsigned i=0xFFFFFFFF;i--;);

end=clock();

I get

Clock cycles are: 10236 - 20593

On one of my Linux boxes, I find the following in bits/time.h

/* ISO/IEC 9899:1990 7.12.1: <time.h>
   The macro `CLOCKS_PER_SEC' is the number per second of the value
   returned by the `clock' function. */
/* CAE XSH, Issue 4, Version 2: <time.h>
   The value of CLOCKS_PER_SEC is required to be 1 million on all
   XSI-conformant systems. */
#  define CLOCKS_PER_SEC  1000000l

So do consider the value of CLOCKS_PER_SEC before analyzing the return value of clock().

Pavan Manjunath
2

The right way of using clock() to measure time would be:

printf("\nTime elapsed: %.2f\n",1.0*(end-start)/CLOCKS_PER_SEC);

This is because clock_t isn't guaranteed to be an int, or any other type for that matter.

Guido
  • The OP is interested in knowing why `start` and `end` are turning out to be 0? In that case, floating point doesn't help – Pavan Manjunath Mar 26 '12 at 12:23
  • What I mean is that they might not be of type int, so printing them with %d makes no sense. The proper way to use them is subtracting one value from the other and dividing by CLOCKS_PER_SEC. Pretty much anything else is undefined. The code I gave works properly, showing the number of seconds elapsed. – Guido Mar 26 '12 at 12:44
1

I have used the little program below to investigate wall clock time and CPU time.

On my test system this prints

CLOCKS_PER_SEC 1000000

CPU time usage resolution looks to be 0.010000 seconds

gettimeofday changed by 9634 uS when CPU time changed by 0.010000 seconds

gettimeofday resolution looks to be 1 us

#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <time.h>


int main(int argc, char** argv) {

    struct  timeval now; // wall clock times
    struct  timeval later;

    clock_t tNow = clock(); // clock measures CPU time used by this process
    gettimeofday(&now, NULL); // wall clock time when CPU time first read

    clock_t tLater = tNow;
    while (tNow == tLater)
           tLater = clock(); // consume CPU time

    gettimeofday(&later, NULL); // wall clock time when CPU time has ticked

    printf("CLOCKS_PER_SEC %ld\n", (long)CLOCKS_PER_SEC);

    double cpuRes = (double)(tLater - tNow)/CLOCKS_PER_SEC;

    printf("CPU time usage resolution looks to be %f seconds\n", cpuRes);

    unsigned long long nowUs = ((unsigned long long)now.tv_sec) * 1000000ULL;
    nowUs += (unsigned long long)now.tv_usec;

    unsigned long long laterUs = ((unsigned long long)later.tv_sec) * 1000000ULL;
    laterUs += (unsigned long long)later.tv_usec;

    printf("gettimeofday changed by %d uS when CPU time changed by %f seconds\n", (int)(laterUs - nowUs), cpuRes);

    // now measure resolution of gettimeofday

    gettimeofday(&now, NULL);
    later = now;

    while ((now.tv_sec  == later.tv_sec) && (now.tv_usec == later.tv_usec))
            gettimeofday(&later, NULL);

    nowUs = ((unsigned long long)now.tv_sec) * 1000000ULL;
    nowUs += (unsigned long long)now.tv_usec;

    laterUs = ((unsigned long long)later.tv_sec) * 1000000ULL;
    laterUs += (unsigned long long)later.tv_usec;

    printf("gettimeofday resolution looks to be %d us\n", (int)(laterUs - nowUs));

}
Sudheer
1

I encountered this same issue while trying to time the difference between a generic class and a non-generic class, both using a vector, on Red Hat Linux with C++ and the g++ compiler. It appears that if your program finishes in less time than a single clock tick, the clock() reading will always be zero (0).

This code always prints 0:

#include <iostream>
#include <ctime>

using namespace std;

int main() {

    cout << clock() << endl;

    return 0;
}

When I added a for loop with an index up to ten million, to slow the program down, then I got a number 20000 as the result from clock()

#include <iostream>
#include <ctime>

using namespace std;

int main() {

    for (int i = 0; i < 10000000; i++) {}
    cout << clock() << endl;

    return 0;
}

The results will certainly vary depending on the specs of your machine. I am running this code on a box with multiple Xeon CPUs and a large amount of RAM.