The clock() call doesn't give you the CPU time used by your program. It gives you the number of clock ticks elapsed since boot (or since the last wraparound, if you have a large uptime), which was the only way to get subsecond timing in old legacy Unix. But it is not process time, it is wall clock time.
To illustrate this, I'll cite a paragraph from the clock(3) man page in the FreeBSD 13.0-STABLE distribution:
The clock() function conforms to ISO/IEC 9899:1990 (“ISO C90”). However, Version 2 of the Single UNIX Specification (“SUSv2”) requires CLOCKS_PER_SEC to be defined as one million. FreeBSD does not conform to this requirement; changing the value would introduce binary incompatibility and one million is still inadequate on modern processors.
Today, you can use the widespread gettimeofday() system call (also wall clock), which gives you the time of day since the Unix epoch with microsecond resolution (far more resolution than the tick of the clock() call, and with no need to know the CLOCKS_PER_SEC constant). It is the most portable way to do this, as almost all available Unices implement it. Better still are the newer POSIX system calls, clock_gettime(2) and friends, which give you nanosecond resolution (if available) and let you select a process CPU-time clock (one that gives you your CPU time, not the wall clock time).
This last system call is not available everywhere, but if your system claims to be POSIX, you'll probably have a subset of the specified clocks.
int clock_gettime(clockid_t clk_id, struct timespec *tp);
where the struct timespec pointed to by tp is filled with the time of the clock selected by clk_id. The fields of the struct timespec structure are:
tv_sec
a time_t field holding the number of whole seconds. For CLOCK_REALTIME this is the number of seconds since the Unix epoch (1970-01-01 00:00:00 UTC); the other clocks count from their own starting point.
tv_nsec
a long value holding the number of nanoseconds since the last second tick. Its range goes from 0 to 999999999.
The descriptions of the different clock ids below have been taken from the Linux online manual (Ubuntu release). The ones not marked as Linux specific are POSIX, so their ids should be portable (although probably not as precise or as fast as the ones implemented by the Linux kernel):
CLOCK_REALTIME
wall clock.
CLOCK_REALTIME_COARSE
(Linux specific) faster than the previous, but less precise.
CLOCK_MONOTONIC
Wall clock, but guaranteed never to go backwards: successive calls always give you ascending times. As the clock_gettime(2) manpage says, it keeps growing even if the system clock is adjusted backwards to match a master clock.
CLOCK_MONOTONIC_COARSE
(Linux specific) faster than CLOCK_MONOTONIC, but less precise.
CLOCK_MONOTONIC_RAW
(Linux specific) like CLOCK_MONOTONIC, but giving raw hardware time, not subject to NTP adjustments.
CLOCK_BOOTTIME
(Linux specific) like CLOCK_MONOTONIC, but it also counts time the system spent suspended; similar to the clock you used, but in nanoseconds instead of clock ticks.
CLOCK_PROCESS_CPUTIME_ID
(POSIX, but optional) process time (not wall clock time) used by the whole process.
CLOCK_THREAD_CPUTIME_ID
(POSIX, but optional) thread time (not wall clock time): the time your thread has been using the CPU, and the clock I think you should read.
So, finally, your snippet could become:
struct timespec t0, t1;
int res = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t0);
// check errors from res.
your_fun();
res = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t1);
// we'll subtract t0 from t1, so we get the delay in t1.
if (t1.tv_nsec < t0.tv_nsec) { // carry to the seconds part.
    t1.tv_nsec += 1000000000L - t0.tv_nsec;
    t1.tv_sec--;
} else {
    t1.tv_nsec -= t0.tv_nsec;
}
t1.tv_sec -= t0.tv_sec;
// no need to convert to double or do floating point arithmetic.
printf("Seconds: %ld.%09ld\n", (long) t1.tv_sec, (long) t1.tv_nsec);
You can wrap those calls into a function that takes a callback and gives you the execution time as a result:
struct timespec *time_wrapper(
    void (*callback)(),     // function to be called.
    struct timespec *work)  // working space provided by the caller, so we don't need to allocate it.
{
    struct timespec t0;
    int res = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t0);
    if (res < 0) return NULL; // on error, the caller checks errno.
    callback();
    res = clock_gettime(CLOCK_THREAD_CPUTIME_ID, work);
    if (res < 0) return NULL; // on error, the caller checks errno.
    // we'll subtract t0 from *work, so we get the delay in *work.
    if (work->tv_nsec < t0.tv_nsec) { // carry to the seconds part.
        work->tv_nsec += 1000000000L - t0.tv_nsec;
        work->tv_sec--;
    } else {
        work->tv_nsec -= t0.tv_nsec;
    }
    work->tv_sec -= t0.tv_sec;
    return work;
}
and you can call it as:
#include <string.h>
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>    // struct timespec

void your_fun(void); // the function you want to time.

int main()
{
    struct timespec delay;
    if (time_wrapper(your_fun, &delay) == NULL) { // some error
        fprintf(stderr, "Error: %s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    printf("CPU Seconds: %lu.%09lu\n",
           (unsigned long) delay.tv_sec,
           (unsigned long) delay.tv_nsec);
    exit(EXIT_SUCCESS);
}
One final note: compiler optimization cannot reorder your code in a way that makes your statements do something different from what their order implies. The only way optimization can change the behaviour of your code is if you have invoked some undefined behaviour in your program, and that means your code is incorrect.
If your code is correct, then compiler optimization cannot affect it.