
Here's a low-level question: how CPU-intensive is getting the system time?

What is the source of the time? I know there is a hardware clock on the BIOS chip, but I'm thinking that getting data from outside the CPU and RAM will need some hardware synchronization, which may delay the read, so I'm guessing the CPU may have its own clock. Feel free to correct me if I'm wrong in any way.

Does getting the time incur a heavy system function call, or is it in any way dependent on the programming language used?

Klaus
  • Since getting the time is used to profile down to millisecond and even nanosecond resolution, I suppose the overhead is negligible. – bolov Aug 12 '16 at 08:48
  • Yup, the overhead is minuscule. Here is how getting the time is more or less traced: http://stackoverflow.com/a/38916054/357403 – Koshinae Aug 12 '16 at 10:53

1 Answer


I have just tested it using a C++ program:

#include <ctime>

clock_t started = clock();
clock_t endClock = started + CLOCKS_PER_SEC;  // run for one second
long itera = 0;

// busy loop: one clock() call per iteration
for (; clock() < endClock; itera++)
{
}

I get about 23 million iterations per second (Windows 7, 32-bit, Visual Studio 2015, 2.6 GHz CPU). In terms of your question, I would not call this intensive. In debug mode, I measured 18 million iterations per second.
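
For comparison, the same one-second loop can be driven by std::chrono instead of clock(); a minimal, self-contained sketch (timings will of course vary with platform and standard library):

#include <chrono>
#include <cstdio>

int main()
{
    using clk = std::chrono::high_resolution_clock;
    const auto endTime = clk::now() + std::chrono::seconds(1);
    long itera = 0;

    // one clk::now() call per iteration, analogous to the clock() loop above
    for (; clk::now() < endTime; itera++)
    {
    }

    std::printf("%ld iterations per second\n", itera);
    return 0;
}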

If the time is transformed into a localized timestamp, complicated calendar calculations (time zone, daylight saving time, ...) might significantly slow down the loop.
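
As a minimal sketch of what such a conversion involves (std::localtime does the time zone and DST arithmetic; note that it is not thread-safe):

#include <ctime>
#include <cstdio>

int main()
{
    std::time_t now = std::time(nullptr);   // seconds since the epoch
    std::tm* local = std::localtime(&now);  // calendar conversion: time zone, DST

    char buffer[64];
    std::strftime(buffer, sizeof buffer, "%Y-%m-%d %H:%M:%S", local);
    std::printf("local time: %s\n", buffer);
    return 0;
}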

It is not easy to tell what happens inside the clock() call. On my system, it calls QueryPerformanceCounter, which in turn relies on other system functions, as explained here.
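
On Windows, QueryPerformanceCounter can also be called directly; a minimal sketch (Windows-only):

#include <windows.h>
#include <cstdio>

int main()
{
    LARGE_INTEGER frequency, counter;
    QueryPerformanceFrequency(&frequency);  // ticks per second
    QueryPerformanceCounter(&counter);      // current tick count

    std::printf("frequency: %lld Hz, counter: %lld\n",
                frequency.QuadPart, counter.QuadPart);
    return 0;
}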


Tuning

To reduce the time measurement overhead even further, you can measure only every 10th, 100th, ... iteration.

The following measures once every 1024 iterations:

// clock() is only called when the low 10 bits of itera are all zero,
// i.e. once every 1024 iterations
for (; (itera & 0x03FF) || (clock() < endClock); itera++)
{
}

This brings the loops-per-second count up to some 500 million.


Tuning with Timer Thread

The following yields a further improvement of some 10%, paid for with additional complexity:

#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> processing{true};

// launch a timer thread to clear the processing flag after 1s
std::thread t([&processing]() {
    std::this_thread::sleep_for(std::chrono::seconds(1));
    processing = false;
});

// clock() is not called at all; the flag is checked
// once every 1024 iterations
for (; (itera & 0x03FF) || processing; itera++)
{
}

t.join();

An extra thread is started which sleeps for one second and then clears a control flag. The main thread executes the loop until the timer thread signals the end of processing.

Axel Kemper
  • If your measurement is correct, that's 43.5 nanoseconds per iteration. Did you run with optimizations enabled? – bolov Aug 12 '16 at 08:51
  • @bolov: I tried Release and Debug mode. As I printed the iteration count, the optimizer should not have eaten the loop. But you are right, the short time is surprising and might deserve further analysis. – Axel Kemper Aug 12 '16 at 09:06
  • It's not surprising. It's not short. What are the times you got in release and in debug? – bolov Aug 12 '16 at 09:09
  • How is that not intensive though? 43ns is enough for some 700 fast instructions on a modern Intel (assuming 4GHz and optimal instruction mix), you could do a lot with that. – harold Aug 12 '16 at 11:54
  • @harold: yes, you are right. Therefore, I have added a tuning section to get down to 2ns – Axel Kemper Aug 12 '16 at 12:04