279

I have a C program that aims to be run in parallel on several processors. I need to be able to record the execution time (which could be anywhere from 1 second to several minutes). I have searched for answers, but they all seem to suggest using the clock() function, which then involves calculating the number of clocks the program took divided by the CLOCKS_PER_SEC value.

I'm not sure how the CLOCKS_PER_SEC value is calculated.

In Java, I just take the current time in milliseconds before and after execution.

Is there a similar thing in C? I've had a look, but I can't seem to find a way of getting anything better than one-second resolution.

I'm also aware a profiler would be an option, but am looking to implement a timer myself.

Thanks

Mohan
  • 1,871
  • 21
  • 34
Roger
  • 3,411
  • 5
  • 23
  • 22
  • 4
    what OS/API frameworks are you using/available? Just plain C? – typo.pl Mar 09 '11 at 16:37
  • 5
    It's a rather small program, just plain C – Roger Mar 09 '11 at 16:39
  • 2
    I've written in details about implementing a portable solution in this answer: http://stackoverflow.com/questions/361363/how-to-measure-time-in-milliseconds-using-ansi-c/37920181#37920181 – Alexander Saprykin Jun 27 '16 at 13:53
  • 1
    time taken to execute a complete function http://stackoverflow.com/a/40380118/6180077 – Abdullah Farweez May 17 '17 at 05:05
  • sorry the votes was "256" *(perfect number.. **️**) and here I come to vote it up to 257.. https://en.wikipedia.org/wiki/256_(number)#In_computing – William Martens Jul 17 '22 at 20:44
  • Related: [Idiomatic way of performance evaluation?](https://stackoverflow.com/q/60291987) - benchmarking is hard, especially meaningful *micro*-benchmarking of a single function or loop. Warm-up effects, and the necessity of enabling optimization but without having the important work optimized away or hoisted/sunk out of loops. – Peter Cordes Aug 19 '22 at 11:32

18 Answers

421

CLOCKS_PER_SEC is a constant which is declared in <time.h>. To get the CPU time used by a task within a C application, use:

clock_t begin = clock();

/* here, do your time-consuming job */

clock_t end = clock();
double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;

Note that this returns the time as a floating-point value, so it can be more precise than a second (e.g. you measure 4.52 seconds). Precision depends on the architecture: on modern systems you easily get 10 ms or better, but on older Windows machines (from the Win98 era) it was closer to 60 ms.

clock() is standard C; it works "everywhere". There are system-specific functions, such as getrusage() on Unix-like systems.
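
For illustration, a minimal sketch using getrusage() (POSIX; it reports user and system CPU time separately) could look like this:

#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>   /* getrusage(), struct rusage */

int main(void)
{
    /* ... do your time-consuming job here ... */

    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) == 0) {
        double user_sec = usage.ru_utime.tv_sec + usage.ru_utime.tv_usec / 1e6;
        double sys_sec  = usage.ru_stime.tv_sec + usage.ru_stime.tv_usec / 1e6;
        printf("user CPU: %f s, system CPU: %f s\n", user_sec, sys_sec);
    }
    return 0;
}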

Java's System.currentTimeMillis() does not measure the same thing. It is a "wall clock": it can help you measure how much time it took for the program to execute, but it does not tell you how much CPU time was used. On a multitasking system (i.e. all of them), these can be widely different.
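
If you do want the Java-style wall-clock measurement in C, a rough analogue (POSIX clock_gettime; the current_time_millis() helper name below is just for illustration) could be:

#include <stdio.h>
#include <time.h>

/* illustrative helper, roughly what System.currentTimeMillis() gives you */
static long long current_time_millis(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

int main(void)
{
    long long t0 = current_time_millis();

    /* here, do your time-consuming job */

    long long t1 = current_time_millis();
    printf("Wall-clock time: %lld ms\n", t1 - t0);
    return 0;
}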

syb0rg
  • 8,057
  • 9
  • 41
  • 81
Thomas Pornin
  • 72,986
  • 14
  • 147
  • 189
  • 1
    It gives me very random result - I get a mixture of large/small/negative number over the same piece of code. GCC 4.7 Linux 3.2 AMD64 –  Jun 02 '13 at 01:40
  • 1
    this gives the time in seconds? – cristi.gherghina Nov 07 '15 at 09:48
  • 3
    Yes: `clock()` returns a time in some internal scale called "clocks", and `CLOCKS_PER_SEC` is the number of clocks per second, so dividing by `CLOCKS_PER_SEC` yields a time in seconds. In the code above, the value is a `double` so you can scale it at will. – Thomas Pornin Nov 07 '15 at 16:56
  • It is weird, I have some results like: `0.256.322` which have two points!!! Although my machine and my os are good enough (Core i 5 with Win7), however it is not precise at all. – Hosein Aqajani Mar 02 '16 at 05:15
  • Does this measurement account for dynamic frequency scaling? Otherwise I'm assuming `clock()` uses a real-time clock, not clock cycles? – Andrew McKinlay Mar 07 '16 at 02:36
  • 37
    Big warning: clock() returns the amount of time the OS has spent running your process, and not the actual amount of time elapsed. However, this is fine for timing a block of code, but not measuring time elapsing in the real world. –  Mar 28 '16 at 18:31
  • Best accuracy can be achieved using getrusage() – ldanko May 17 '16 at 22:42
  • 4
    He said he wants to measure a multi-threaded program. I'm not sure clock() is suitable for this, because it sums up the running times of all threads, so the result will look as if the code was run sequentially. For such things I use omp_get_wtime(), but of course I need to make sure the system is not busy with other processes. – Youda008 Oct 15 '16 at 08:12
  • 1
    I should mention some things even though this thread was more relevant a year ago: `CLOCKS_PER_SEC` is a `long int` with the value `1000000`, giving time in microseconds when not divided; not CPU clock cycles. Therefore, it doesn't need to account for dynamic frequency as the clock here is in microseconds (maybe clock cycles for a 1 MHz CPU?) I made a short C program printing that value and it was 1000000 on my i7-2640M laptop, with dynamic frequency allowing 800 MHz to 2.8 GHz, even using Turbo Boost to go as high as 3.5 GHz. – DDPWNAGE Aug 17 '17 at 00:32
  • @Youda008 comment should be upvoted, and probably be added to the answer. – Alberto Jul 21 '18 at 14:29
  • question: the number of clock per second is not a constant for a modern CPU, right? it changes following the needs of the OS. It goes down to save energy and goes up when the OS is asking more computational power, isn't it? – Leos313 Aug 19 '22 at 11:20
  • 1
    @Leos313: `CLOCKS_PER_SEC` is for a software clock incremented by timer interrupts, or the result of further processing on that. e.g. as DDPWNAGE commented, `clock()` is often just microseconds. It's totally unrelated to actual CPU frequency or core clock cycles on modern systems. For that, use Linux `perf stat ./a.out` which by default counts the `cycles` hardware event, e.g. `cpu_clk_unhalted.thread` on recent Intel CPUs. As opposed to the `rdtsc` "reference clock" that's actually fixed frequency and non-halting even when the CPU goes into a sleep state. – Peter Cordes Aug 19 '22 at 11:36
137

If you are running the program from a Unix shell, you can use the time command.

Doing

$ time ./a.out

assuming a.out is the executable, will give you the time taken to run it.
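
With the bash built-in time, the output has roughly this shape (the numbers below are only illustrative):

real    0m4.52s
user    0m4.31s
sys     0m0.08s

real is the elapsed wall-clock time, while user and sys are CPU time spent in user space and in the kernel on behalf of the process.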

S..K
  • 1,944
  • 2
  • 14
  • 16
  • 4
    @acgtyrant but only for simple programs, because it'll take the whole program time, including input, output, etc. – phuclv Dec 17 '15 at 06:55
  • 2
    If you're on Linux, and you've reduced your (micro)benchmark to a program with negligible startup overhead, e.g. a static executable that runs your hot loop for a few seconds, you can use `perf stat ./a.out` to get HW performance counters for cache misses and branch mispredicts, and IPC. – Peter Cordes Apr 05 '19 at 08:12
  • 1
    See also [Idiomatic way of performance evaluation?](https://stackoverflow.com/q/60291987) for more about microbenchmarking pitfalls, e.g. you either need warm-up for the CPU and page faults, or you need a long enough repeat loop to amortize that startup. And you need to enable optimization but not have your real work optimized away or parts of it hoisted/sunk out of your repeat loop. – Peter Cordes Aug 19 '22 at 11:42
105

In plain vanilla C:

#include <time.h>
#include <stdio.h>

int main()
{
    clock_t tic = clock();

    my_expensive_function_which_can_spawn_threads();

    clock_t toc = clock();

    printf("Elapsed: %f seconds\n", (double)(toc - tic) / CLOCKS_PER_SEC);

    return 0;
}
richq
  • 55,548
  • 20
  • 150
  • 144
Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
  • 46
    Best variable names I've seen in awhile. tic = "time in clock", toc = "time out clock". But also tic-toc = "tick-tock". This is how I'm labeling time grabs from here on out. – Logan Schelly Mar 30 '20 at 02:57
  • 8
    Note that `tic` and `toc` are the names of the standard stopwatch timer functions in MATLAB, used identically. Thus, I'm not sure if credit for originality is due, but that increases all the more their likelihood to be recognized and understood. – Cody Gray - on strike Mar 03 '22 at 08:37
  • 3
    @CodyGray Oh, I didn't know that. I saw those variable names somewhere, more than 10 years ago or so it seems :) I still use `tic` and `toc` in 2022, so next time I make colleagues wink in code reviews I can explain where this comes from :) – Alexandre C. Mar 03 '22 at 14:38
  • question: the number of clock per second is not a constant for a modern CPU, right? it changes following the needs of the OS. It goes down to save energy and goes up when the OS is asking more computational power, isn't it? – Leos313 Aug 19 '22 at 11:19
  • 1
    @Leos313: As [I replied to a duplicate of that comment under another answer](https://stackoverflow.com/questions/5248915/execution-time-of-c-program/46025887#comment129650590_5249150): the `clock()` function is unrelated to core clock cycles on modern systems; this is necessary for the same binaries to work on other systems; they all have to be compiled with the same CLOCKS_PER_SEC to match what libc `clock()` returns on different machines. – Peter Cordes Aug 19 '22 at 11:40
  • @Peter Cordes, thank you. I think your comment needs a new question and your answer. Do you agree? I think it is a crucial point and many people can have the same doubt – Leos313 Aug 19 '22 at 19:33
  • 1
    @Leos313: You mean like this existing one? [What's the relationship between the real CPU frequency and the clock\_t in C?](https://stackoverflow.com/q/70639349) – Peter Cordes Aug 19 '22 at 19:36
76

Functionally, you want this:

#include <sys/time.h>

struct timeval  tv1, tv2;
gettimeofday(&tv1, NULL);
/* stuff to do! */
gettimeofday(&tv2, NULL);

printf ("Total time = %f seconds\n",
         (double) (tv2.tv_usec - tv1.tv_usec) / 1000000 +
         (double) (tv2.tv_sec - tv1.tv_sec));

Note that this measures in microseconds, not just seconds.

Endre
  • 690
  • 8
  • 15
Wes Hardaker
  • 21,735
  • 2
  • 38
  • 69
  • 2
    The MinGW compiler is GCC-based, so it will work with it. But if you use the Visual C compiler, then you will get an error. – user2550754 Jan 09 '14 at 11:37
  • 11
    Yes, it'll work on windows with a c library that supports the gettimeofday call. It actually doesn't matter what the compiler is, you just have to link it against a decent libc library. Which, in the case of mingw, is not the default windows one. – Wes Hardaker Jan 10 '14 at 18:22
  • 1
    This works for me on Windows XP with cygwin gcc & Linux Ubuntu. This is just what i wanted. – Love and peace - Joe Codeswell May 21 '15 at 02:20
  • 3
    `gettimeofday` is obsolete and not recommended for new code. Its POSIX man page recommends [clock_gettime](http://pubs.opengroup.org/onlinepubs/009696899/functions/clock_getres.html) instead, which lets you ask for `CLOCK_MONOTONIC` that isn't affected by changes to the system clock, and thus it's better as an interval time. (See [JohnSll's answer](https://stackoverflow.com/questions/5248915/execution-time-of-c-program/41959179#41959179)). On modern Linux systems, for example, gettimeofday is basically a wrapper for clock_gettime that converts nanoseconds to microseconds. – Peter Cordes Apr 05 '19 at 08:11
14

(All the answers here fall short if your sysadmin changes the system time, or if your timezone has different winter and summer times. Therefore...)

On Linux, use clock_gettime(CLOCK_MONOTONIC_RAW, &time_variable); it is not affected if the system administrator changes the time, or if you live in a country where winter time differs from summer time, etc.

#include <stdio.h>
#include <time.h>

#include <unistd.h> /* for sleep() */

int main() {
    struct timespec begin, end;
    clock_gettime(CLOCK_MONOTONIC_RAW, &begin);

    sleep(1);      // waste some time

    clock_gettime(CLOCK_MONOTONIC_RAW, &end);
    
    printf ("Total time = %f seconds\n",
            (end.tv_nsec - begin.tv_nsec) / 1000000000.0 +
            (end.tv_sec  - begin.tv_sec));

}

man clock_gettime states:

CLOCK_MONOTONIC

Clock that cannot be set and represents monotonic time since some unspecified starting point. This clock is not affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the clock), but is affected by the incremental adjustments performed by adjtime(3) and NTP.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
JohnSll
  • 141
  • 1
  • 2
  • Can you explain the calculation that you used to get the number of seconds? It is not obvious what's going on. – Colin Keenan Mar 30 '17 at 17:29
  • 1
    Wouldn't this `(end.tv_nsec - begin.tv_nsec) / 1000000000.0` result in `0` always? – alk Jul 21 '17 at 15:15
  • @alk: no, dividing by a `double` literal triggers int or `long` to `double` conversion *before* the division. Of course you could just stick to integer and print the `tv_sec` part and then the fractional part with zero like `%ld.%09ld`, but converting to double is easy and 53 bits of precision are usually plenty for benchmark times. – Peter Cordes Apr 05 '19 at 07:59
  • 3
    (Oops, the subtraction of the nanoseconds part may need to carry into the seconds part, so using double and letting it be negative avoids that problem. To use a pure integer format string, you'd need a `timespec_subtract` like the `timeval_subtract` suggested in the glibc manual: https://www.gnu.org/software/libc/manual/html_node/Elapsed-Time.html) – Peter Cordes Apr 05 '19 at 08:07
13

Most simple programs have computation times measured in milliseconds, so I suppose you will find this useful.

#include <time.h>
#include <stdio.h>

int main(){
    clock_t start = clock();
    // Executable code to be timed goes here
    clock_t stop = clock();
    double elapsed = (double)(stop - start) * 1000.0 / CLOCKS_PER_SEC;
    printf("Time elapsed in ms: %f", elapsed);
}

If you want to compute the runtime of the entire program and you are on a Unix system, run your program using the time command, like this: time ./a.out

erb
  • 14,503
  • 5
  • 30
  • 38
adimoh
  • 658
  • 2
  • 8
  • 20
  • In Windows at least the factor is at least 100 but not 1000 and it's not exact – boctulus Apr 16 '16 at 12:29
  • 6
    This answer doesn't add anything that wasn't in [Alexandre C](http://stackoverflow.com/users/373025/alexandre-c)'s [answer](http://stackoverflow.com/a/5249129/15168) from two years earlier. – Jonathan Leffler Dec 05 '16 at 01:25
  • 6
    @boctulus: 1s is *always* 1000ms, also on windows. – alk Jul 21 '17 at 15:09
12

Thomas Pornin's answer as macros:

#define TICK(X) clock_t X = clock()
#define TOCK(X) printf("time %s: %g sec.\n", (#X), (double)(clock() - (X)) / CLOCKS_PER_SEC)

Use it like this:

TICK(TIME_A);
functionA();
TOCK(TIME_A);

TICK(TIME_B);
functionB();
TOCK(TIME_B);

Output:

time TIME_A: 0.001652 sec.
time TIME_B: 0.004028 sec.
hklel
  • 1,624
  • 23
  • 45
11

A lot of answers have been suggesting clock() and then CLOCKS_PER_SEC from time.h. This is probably a bad idea, because this is what my /bits/time.h file says:

/* ISO/IEC 9899:1990 7.12.1: <time.h>
The macro `CLOCKS_PER_SEC' is the number per second of the value
returned by the `clock' function. */
/* CAE XSH, Issue 4, Version 2: <time.h>
The value of CLOCKS_PER_SEC is required to be 1 million on all
XSI-conformant systems. */
#  define CLOCKS_PER_SEC  1000000l

#  if !defined __STRICT_ANSI__ && !defined __USE_XOPEN2K
/* Even though CLOCKS_PER_SEC has such a strange value CLK_TCK
presents the real value for clock ticks per second for the system.  */
#   include <bits/types.h>
extern long int __sysconf (int);
#   define CLK_TCK ((__clock_t) __sysconf (2))  /* 2 is _SC_CLK_TCK */
#  endif

So CLOCKS_PER_SEC might be defined as 1000000, depending on what options you use to compile, and thus it does not seem like a good solution.
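
To see what your own system reports, a small sketch that prints both values could look like this (POSIX, for the sysconf() part):

#include <stdio.h>
#include <time.h>
#include <unistd.h>   /* sysconf(), _SC_CLK_TCK */

int main(void)
{
    printf("CLOCKS_PER_SEC       = %ld\n", (long)CLOCKS_PER_SEC);
    printf("sysconf(_SC_CLK_TCK) = %ld\n", sysconf(_SC_CLK_TCK));
    return 0;
}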

Stephen
  • 523
  • 1
  • 6
  • 16
  • 1
    Thanks for the information but is there any better alternative yet? – ozanmuyes Oct 16 '14 at 21:00
  • 5
    This is not a practical problem: yes, Posix systems always have `CLOCKS_PER_SEC==1000000`, but at the same time, they all use 1-µs precision for their clock() implementation; by the way, it has the nice property of reducing sharing problems. If you want to measure potentially very quick events, say below 1 ms, then you should first worry about the accuracy (or resolution) of the clock() function, which is necessarily coarser than 1 µs in Posix, but is also often *much* coarser; the usual solution is to run the test many times; the question as asked did not seem to require it, though. – AntoineL Apr 22 '15 at 15:29
    Why would it not be a good solution? You get some value from `clock()`; if you divide that value by `CLOCKS_PER_SEC` you are guaranteed to get the CPU time in seconds. The responsibility of measuring the actual clock speed lies with the `clock()` function, not with you. – Zaffy Aug 28 '19 at 12:23
  • No constant value compiled into a binary can count actual core clock cycles *and* be portable to other systems. Or even count cycles on a single system, since CPUs for about 2 decades (1 when this was written) have been able to vary their clock frequency to save power. It's just a convenient tick interval for measuring **CPU time**. If that's not what you want (e.g. wall-clock time), then use something else like `clock_gettime`. If you want core clock cycles, use `perf stat` for your microbenchmark. – Peter Cordes Feb 14 '23 at 06:01
  • See also [What's the relationship between the real CPU frequency and the clock\_t in C?](https://stackoverflow.com/q/70639349) – Peter Cordes Feb 14 '23 at 06:06
5
    #include<time.h>
    #include<stdio.h>
    int main(){
      clock_t begin=clock();

      int i;
      for(i=0;i<100000;i++){
        printf("%d",i);
      }
      clock_t end=clock();

      printf("Time taken:%lf",(double)(end-begin)/CLOCKS_PER_SEC);
    }

This program will work like a charm.

Ravi Kumar Yadav
  • 193
  • 2
  • 15
4

ANSI C only specifies second-precision time functions. However, if you are running in a POSIX environment, you can use the gettimeofday() function, which provides microsecond resolution of the time passed since the UNIX Epoch.

As a side note, I wouldn't recommend using clock(), since it is badly implemented on many (if not all?) systems and not accurate, besides the fact that it only refers to how long your program has spent on the CPU and not the total lifetime of the program, which, according to your question, is what I assume you would like to measure.
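
For completeness, a minimal sketch of that gettimeofday() approach (wall-clock seconds with microsecond resolution, POSIX) could be:

#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval start, end;
    gettimeofday(&start, NULL);

    /* ... the work to measure ... */

    gettimeofday(&end, NULL);
    double elapsed = (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;
    printf("Wall-clock time: %f seconds\n", elapsed);
    return 0;
}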

Endre
  • 690
  • 8
  • 15
Shinnok
  • 6,279
  • 6
  • 31
  • 44
  • ISO C Standard (assuming this is what *ANSI C* means) purposely does not specify the precision of the *time functions*. Then specifically on a POSIX implementation, or on Windows, precision of the *wall-clock* (see Thomas' answer) functions are in seconds. But clock()'s precision is usually greater, and always 1µs in Posix (independently of the accuracy.) – AntoineL Apr 22 '15 at 15:18
3

You have to take into account that measuring the time a program takes to execute depends a lot on the load the machine has at that specific moment.

Knowing that, the current time in C can be obtained in different ways; an easier one is:

#include <stdio.h>
#include <time.h>
#include <sys/resource.h>   /* getrusage(), struct rusage */

struct rusage ruse;          /* used by the CPU_TIME macro below */

#define CPU_TIME (getrusage(RUSAGE_SELF,&ruse), ruse.ru_utime.tv_sec + \
  ruse.ru_stime.tv_sec + 1e-6 * \
  (ruse.ru_utime.tv_usec + ruse.ru_stime.tv_usec))

int main(void) {
    time_t start, end;
    double first, second;

    // Save wall-clock and CPU start time
    time(&start);
    first = CPU_TIME;

    // Perform operations
    ...

    // Save end time
    time(&end);
    second = CPU_TIME;

    printf("cpu  : %.2f secs\n", second - first);
    printf("user : %d secs\n", (int)(end - start));
}

Hope it helps.

Regards!

Endre
  • 690
  • 8
  • 15
redent84
  • 18,901
  • 4
  • 62
  • 85
3

I've found that the usual clock() everyone recommends here, for some reason, deviates wildly from run to run, even for static code without any side effects, like drawing to the screen or reading files. This could be because the CPU changes power consumption modes, the OS gives different priorities, and so on...

So the only way to reliably get the same result every time with clock() is to run the measured code in a loop multiple times (for several minutes), taking precautions to prevent the compiler from optimizing it out: modern compilers can precompute code without side effects and hoist it out of the loop. One such precaution is using random input for each iteration.

After enough samples are collected into an array, one sorts that array and takes the middle element, called the median. The median is better than the average, because it throws away extreme deviations, like, say, an antivirus taking up all the CPU or the OS doing some update.
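
A rough sketch of that approach (work_to_measure() below is just a stand-in for the code under test, and the volatile sink keeps the compiler from optimizing it away):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SAMPLES 101

static volatile double sink;            /* prevents the work from being optimized away */

static void work_to_measure(void)       /* stand-in for the real code under test */
{
    double s = 0.0;
    for (int i = 0; i < 1000000; i++)
        s += i * 0.5;
    sink = s;
}

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    double samples[SAMPLES];

    for (int i = 0; i < SAMPLES; i++) {
        clock_t t0 = clock();
        work_to_measure();
        samples[i] = (double)(clock() - t0) / CLOCKS_PER_SEC;
    }

    qsort(samples, SAMPLES, sizeof samples[0], cmp_double);
    printf("median %f s (min %f, max %f)\n",
           samples[SAMPLES / 2], samples[0], samples[SAMPLES - 1]);
    return 0;
}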

Here is a simple utility to measure the execution performance of C/C++ code, averaging the values near the median: https://github.com/saniv/gauge

I'm still looking for a more robust and faster way to measure code. One could probably try running the code under controlled conditions on bare metal without any OS, but that would give an unrealistic result, because in reality the OS does get involved.

x86 has hardware performance counters, which include the actual number of instructions executed, but they are tricky to access without OS help, hard to interpret and have their own issues ( http://archive.gamedev.net/archive/reference/articles/article213.html ). Still, they could be helpful for investigating the nature of the bottleneck (data access or the actual computations on that data).

  • Yes, modern x86 CPUs idle much slower than max turbo. Depending on "governor" settings, ramp up to max clock speed might take a millisecond (Skylake with hardware P-state management, especially with energy_performance_preference set to `performance`) or many tens of milliseconds. https://en.wikipedia.org/wiki/Dynamic_frequency_scaling. And yes, median performance is usually a good choice; the high end usually has some spikes from interference. – Peter Cordes Aug 18 '19 at 22:50
  • Often your best bet to avoid having work optimize away is a command-line input and return the result. Or write a function in a separate file from `main` that takes an arg and returns a result, and don't use link-time optimization. Then the compiler can't inline it into the caller. Only works if the function already includes some kind of loop, otherwise call/ret overhead is too high. – Peter Cordes Aug 18 '19 at 22:52
  • Compiler can still optimize the single command line input out of the loop, if you process it with static code without any side effects. So it is best to generate a random input for each iteration. Obviously rand() should be called outside of measured code, before the first clock(), because rand() could as well result into a system call, sampling some hardware entropy generator (which on older systems was mouse movement). Just don't forget to printf every bit of the output, otherwise compiler may decide you don't need all the output as whole or part of it. That can be done with say CRC32. – SmugLispWeenie Aug 19 '19 at 08:50
  • If your code-under-test in in a separate file and you don't use link-time optimization, there's no way the compiler can do CSE to optimize between calls. The caller can't assume anything about the callee not having any visible side-effects. This lets you put something relatively short *inside* a repeat loop to make it long enough to time, with just call/ret overhead. If you let it inline, then you have to check the generated asm to make sure it didn't hoist a computation out of a loop as you say. – Peter Cordes Aug 19 '19 at 09:05
  • The compiler-specific way is to use (for example) GNU C inline asm to force a compiler to materialize a result in a register, and/or to forget what it knows about the value of a variable, without actually introducing extra instructions. ["Escape" and "Clobber" equivalent in MSVC](//stackoverflow.com/q/33975479) links to a video about profiling and microbenchmarking (clang developer Chandler Carruth's CppCon 2015 talk) There is no MSVC equivalent, but the question itself shows the GNU C functions and how to use them. – Peter Cordes Aug 19 '19 at 09:07
  • The problem with disabling link time optimization (i.e. -fno-lto), is that in reality you want to enable all optimizations possible and also to check if enabling specific optimization types actually makes code any faster. In my case -flto made code slower in some cases (about 1.15 times), which hints that it makes sense to exclude specific files from link-time optimization or other optimization types, when you're fiddling with flags for release build. Further compilers should have some automation for that kind of trial and error. – SmugLispWeenie Aug 19 '19 at 10:27
  • I was thinking about microbenchmarking a single function or loop. Like you were talking about with "putting code in a loop". Obviously if the code you're timing is across multiple files you can't disable LTO if your real program builds it with LTO enabled. (Except to compare LTO vs. non-LTO). – Peter Cordes Aug 19 '19 at 11:25
  • 1
    I think the best solution would be putting the benchmarking code inside a shared library, and the measuring code into executable, dlopening said library. Now you can also reuse that measurer across several of your projects or test several parts of one project. That is what I'm doing. That should work unless LLVM one day learns to optimize DLL linking or do some other JIT, which is totally possible, although it is a predictability nightmare and just impossible to measure. – SmugLispWeenie Aug 19 '19 at 19:50
2

None of the solutions here worked on my system.

What I can get working is:

#include <time.h>

double difftime(time_t time1, time_t time0);
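
A minimal usage sketch (note that time_t only gives whole-second resolution, so this is mainly useful for longer runs; sleep(2) stands in for the real work):

#include <stdio.h>
#include <time.h>
#include <unistd.h>   /* sleep() */

int main(void)
{
    time_t start = time(NULL);

    sleep(2);         /* the work to measure */

    time_t end = time(NULL);
    printf("Elapsed: %.0f seconds\n", difftime(end, start));
    return 0;
}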
Pang
  • 9,564
  • 146
  • 81
  • 122
  • 3
    This gives the difference between two `time_t` values as a double. Since `time_t` values are only accurate to a second, it is of limited value in printing out the time taken by short running programs, though it may be useful for programs that run for long periods. – Jonathan Leffler Dec 05 '16 at 01:12
  • For whatever reason, passing in a pair of `clock_t`s to `difftime` seems to work for me to the precision of a hundredth of a second. This is on linux x86. I also can't get the subtraction of `stop` and `start` to work. – interestedparty333 Dec 13 '16 at 19:39
  • @ragerdl: You need to pass to `difftime()` `clock() / CLOCKS_PER_SEC`, as it expects seconds. – alk Jul 21 '17 at 15:11
2

Some might find a different kind of input useful: I was given this method of measuring time as part of a university course on GPGPU-programming with NVidia CUDA (course description). It combines methods seen in earlier posts, and I simply post it because the requirements give it credibility:

unsigned long int elapsed;
struct timeval t_start, t_end, t_diff;
gettimeofday(&t_start, NULL);

// perform computations ...

gettimeofday(&t_end, NULL);
timeval_subtract(&t_diff, &t_end, &t_start);
elapsed = (t_diff.tv_sec*1e6 + t_diff.tv_usec);
printf("GPU version runs in: %lu microsecs\n", elapsed);

I suppose you could multiply by e.g. 1.0 / 1000.0 to get the unit of measurement that suits your needs.
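
The timeval_subtract() helper is not in the standard library; a sketch along the lines of the version in the glibc manual (result = x - y, returning 1 if the difference is negative) would be:

#include <sys/time.h>

int timeval_subtract(struct timeval *result, struct timeval *x, struct timeval *y)
{
    /* Perform the carry for the later subtraction by updating y. */
    if (x->tv_usec < y->tv_usec) {
        int nsec = (y->tv_usec - x->tv_usec) / 1000000 + 1;
        y->tv_usec -= 1000000 * nsec;
        y->tv_sec  += nsec;
    }
    if (x->tv_usec - y->tv_usec > 1000000) {
        int nsec = (x->tv_usec - y->tv_usec) / 1000000;
        y->tv_usec += 1000000 * nsec;
        y->tv_sec  -= nsec;
    }

    /* Compute the difference; tv_usec is now certainly positive. */
    result->tv_sec  = x->tv_sec  - y->tv_sec;
    result->tv_usec = x->tv_usec - y->tv_usec;

    /* Return 1 if the result is negative. */
    return x->tv_sec < y->tv_sec;
}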

alexpanter
  • 1,222
  • 10
  • 25
  • 1
    gettimeofday is obsolete and not recommended. Its POSIX man page recommends [`clock_gettime`](http://pubs.opengroup.org/onlinepubs/009696899/functions/clock_getres.html) instead, which lets you ask for `CLOCK_MONOTONIC` that isn't affected by changes to the system clock, and thus it's better as an interval timer. On modern Linux systems, for example, `gettimeofday` is basically a wrapper for `clock_gettime` that converts nanoseconds to microseconds. (See JohnSll's answer). – Peter Cordes Apr 05 '19 at 07:55
  • This method was added by @Wes Hardaker, the main difference is using `timeval_subtract`. – alexpanter Apr 05 '19 at 07:58
  • Ok, so the only useful part of your answer is the name of a function that you don't define, and that isn't in the standard library. (Only in the glibc manual: https://www.gnu.org/software/libc/manual/html_node/Elapsed-Time.html). – Peter Cordes Apr 05 '19 at 08:09
2

If your program uses the GPU or calls sleep(), then the clock() difference gives you a duration smaller than the actual one. That is because clock() returns the number of CPU clock ticks. It can only be used to calculate CPU usage time (CPU load), but not the execution duration. We should not use clock() to calculate duration. We should still use gettimeofday() or clock_gettime() for durations in C.
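
A minimal sketch that shows the difference around a sleep(1) (Linux/POSIX; the process uses almost no CPU while sleeping):

#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    clock_t c0 = clock();
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    sleep(1);   /* almost no CPU time is used here */

    clock_t c1 = clock();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    printf("CPU time  : %f s\n", (double)(c1 - c0) / CLOCKS_PER_SEC);
    printf("Wall clock: %f s\n",
           (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
    return 0;
}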

x4444
  • 2,092
  • 1
  • 13
  • 11
  • `clock()` doesn't count CPU clock ticks; it counts user-space CPU time in units of `CLOCKS_PER_SEC` which is fixed by POSIX at `1000000` (1 M). Counting actual core clock cycles on a CPU that can vary its frequency on its own (turbo / boost clocks) would require hardware performance counters, like `perf stat` uses. But yes, correct that it counts CPU time not wall-clock time. – Peter Cordes Feb 14 '23 at 05:57
1

The perf tool is more accurate for collecting data about and profiling the running program. Use perf stat to show all information related to the program being executed.
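
For example (on Linux, assuming perf is installed and your binary is ./a.out):

$ perf stat ./a.out

This prints the elapsed time together with hardware counters such as cycles and instructions, where available.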

0

As simple as possible, using a function-like macro:

#include <stdio.h>
#include <time.h>

#define printExecTime(t) printf("Elapsed: %f seconds\n", (double)(clock()-(t)) / CLOCKS_PER_SEC)

int factorialRecursion(int n) {
    return n == 1 ? 1 : n * factorialRecursion(n-1);
}

int main()
{
    clock_t t = clock();

    int j=1;
    for(int i=1; i <10; i++ , j*=i);

    printExecTime(t);
    
    // compare with recursion factorial
    t = clock();
    j = factorialRecursion(10);
    printExecTime(t);

    return 0;
}
Zakaria
  • 1,055
  • 12
  • 16
  • Normally you don't want to time `printf("factorial ... %d)`, but that's what this code is doing. Of course, `10!` is so fast to calculate compared to measurement overhead of `clock()` that you're not going to get much from timing just the computation. If the compiler doesn't constant-propagate through the loop to just print a constant `j` value; See also [Idiomatic way of performance evaluation?](https://stackoverflow.com/q/60291987) re: the necessity of compiling with optimization, but also of having the work not optimize away. – Peter Cordes Feb 14 '23 at 05:50
  • Also, `clock` measures user-space CPU time for this process, not wall-clock time. That might be what you want if you don't want to time I/O waits. – Peter Cordes Feb 14 '23 at 05:51
  • This answer doesn't seem to add anything over [an existing answer](https://stackoverflow.com/a/48367129/224132) with a similar print macro. – Peter Cordes Feb 14 '23 at 06:02
  • Thank you Peter for your comments, but the answer is not 100% repeated, TICK TOCK method will define multiple variables if you need to measure multiple parts of code in the same function (like in main), then it's better to define one variable t, initiate it with clock() every time before measurement then call the macro, and so on. And Regarding the for loop of factorial, this is just a sample, feel free to replace it with a recursion version of the factorial function if you want. – Zakaria Feb 15 '23 at 06:03
  • The main reason I downvoted was for the bad example of how to use this macro, with `printf` inside the timed region. At least fix that or discuss in text that you're intentionally benchmarking I/O, if you think this answer adds value. Agreed that the macro details are a bit different, and declaring vars inside the TICK() macro isn't great for some use-cases. Multiple separate `clock_t` vars that aren't used at the same time is basically not a problem for optimizers, so it's just a matter of style. – Peter Cordes Feb 15 '23 at 06:08
  • Ok, that's an improvement; they're still untimeably short intervals for `clock()` (and a challenge even for a raw `rdtsc` on x86), but of course real use cases will want to time other things. – Peter Cordes Feb 15 '23 at 06:36
-2

Comparison of the execution times of bubble sort and selection sort: I have a program which compares the execution time of bubble sort and selection sort. To find the execution time of a block of code, compute the time before and after the block:

 clock_t start = clock();
 …
 clock_t end = clock();

CLOCKS_PER_SEC is a constant defined in the time.h library.

Example code:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main()
{
   int a[10000],i,j,min,temp;
   for(i=0;i<10000;i++)
   {
      a[i]=rand()%10000;
   }
   //The bubble Sort
   clock_t start,end;
   start=clock();
   for(i=0;i<10000;i++)
   {
     for(j=i+1;j<10000;j++)
     {
       if(a[i]>a[j])
       {
         int temp=a[i];
         a[i]=a[j];
         a[j]=temp;
       }
     }
   }
   end=clock();
   double extime=(double) (end-start)/CLOCKS_PER_SEC;
   printf("\n\tExecution time for the bubble sort is %f seconds\n ",extime);

   for(i=0;i<10000;i++)
   {
     a[i]=rand()%10000;
   }
   clock_t start1,end1;
   start1=clock();
   // The Selection Sort
   for(i=0;i<10000;i++)
   {
     min=i;
     for(j=i+1;j<10000;j++)
     {
       if(a[min]>a[j])
       {
         min=j;
       }
     }
     temp=a[min];
     a[min]=a[i];
     a[i]=temp;
   }
   end1=clock();
   double extime1=(double) (end1-start1)/CLOCKS_PER_SEC;
   printf("\n");
   printf("\tExecution time for the selection sort is %f seconds\n\n", extime1);
   if(extime1<extime)
     printf("\tSelection sort is faster than Bubble sort by %f seconds\n\n", extime - extime1);
   else if(extime1>extime)
     printf("\tBubble sort is faster than Selection sort by %f seconds\n\n", extime1 - extime);
   else
     printf("\tBoth algorithms have the same execution time\n\n");
}
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
  • 7
    This doesn't really add anything new compared with [adimoh](http://stackoverflow.com/users/1442452/adimoh)'s [answer](http://stackoverflow.com/a/17145331/15168), except that it fills in 'the executable code' block (or two of them) with some actual code. And that answer doesn't add anything that wasn't in [Alexandre C](http://stackoverflow.com/users/373025/alexandre-c)'s [answer](http://stackoverflow.com/a/5249129/15168) from two years earlier. – Jonathan Leffler Dec 05 '16 at 01:22