
I've tested times() and clock_gettime(CLOCK_MONOTONIC) on several machines, and the results are confusing (each API was called 10M times in a single thread):

Thinkpad P50 Xeon E3-1505Mv5 [Skylake 14nm]:

times(NULL)                   : 450ms
clock_gettime(CLOCK_MONOTONIC): 325ms

You can see that on Skylake, clock_gettime is faster than times.

Here is the result on a Xeon E5-2430 [Sandy Bridge 32nm]:

times(NULL)                   : 600ms
clock_gettime(CLOCK_MONOTONIC): 1420ms

times(NULL) is faster now.

I also ran the same test on an old Thinkpad W510 with an i7-720QM [Clarksfield 45nm]:

times(NULL)                   : 1.73s
clock_gettime(CLOCK_MONOTONIC): 20.4s

It seems like there are some new features in newer hardware that boost clock_gettime's performance?

ASBai
  • If your point is to measure performance by getting a clock many times, I suggest you read http://stackoverflow.com/questions/34810392/assembler-instruction-rdtsc and also about `rdtscp`. – John Zwinck Aug 05 '16 at 01:31
  • No, I don't use this code to measure performance. I use it to drive some timers, but it may be called very frequently, so the efficiency of the API itself is very important. BTW: the Windows GetTickCount64 API is very quick, taking only 60ms on the same P50 machine. Why is Linux slower than Windows for this function? – ASBai Aug 05 '16 at 01:37
  • For example: to determine whether we need to send the next heartbeat packet, or to check whether a user session is still valid, etc. There are many "monotonic time" intensive usages. – ASBai Aug 05 '16 at 01:52
  • Why not use an actual timer provided by the OS then? On Linux, `timer_settime()` or `timerfd_settime()`.... – John Zwinck Aug 05 '16 at 02:21
  • Because many tasks that need timing are fundamentally not periodic tasks, or because using a timer would in some cases cause more performance degradation. For example: determining whether the user session of an incoming HTTP request has expired, and many other cases. – ASBai Aug 05 '16 at 02:44
  • 1
    The answer to the stated question does not matter. `clock_gettime(CLOCK_MONOTONIC, &ts)` is fast enough to never be a bottleneck in sane use. If you find you are using it too often, use a signal handler or dedicated thread to update a lockless/atomic time variable, and have the users examine that. (You might suffer a bit from cacheline ping-pong in multithreaded processes; there are many ways to mitigate that. You'll also need to use atomic built-ins, but all the major POSIXy C compilers do provide them, so it turns out not to be any issue for portability, either.) – Nominal Animal Aug 05 '16 at 12:19
  • We have already been using a dedicated timing thread for many years in some cases, and of course with atomics (we have our own assembly atomic implementation, and at times we also use the VC intrinsics, GCC built-ins, the Windows atomic API, the Linux API, etc.). But sometimes a timing thread is not suitable. For example: 1. atomics have their own cost (cache invalidation, unavoidable memory barriers on some archs like x86 ...); 2. using a thread is not suitable in some cases; 3. accurate timing is needed, etc. I don't want to talk about how to work around it; I just want to know what the result means and why it's slower than MS Windows. – ASBai Aug 05 '16 at 20:10

1 Answer


Don't times(2) and clock_gettime(2) measure different things? times() measures the CPU time used running your process, whereas clock_gettime() measures wall-clock time regardless of whether your process is "on-CPU" or not.

If the clock used by clock_gettime() is supported by the vDSO, then I would have thought the overhead of calling it would typically be comparable to times(), but obviously they could return different results depending on whether your process was descheduled at some point.

Anon