1

I am writing a stats server that counts visit data per day, so I have to clear the data in the db (memcached) every day.

Currently, I call gettimeofday to get the date and compare it with a cached date to check whether they fall on the same day. This check runs frequently.

Sample code is below:

void report_visits(...) {
   std::string date = CommonUtil::GetStringDate(); // through gettimeofday
   if (date != static_cached_date_) {
       flush_db_date();
       static_cached_date_ = date;
   }
}

The problem is that I have to call gettimeofday every time a client reports visit information, and gettimeofday is time-consuming.

Is there any solution to this problem?

Shamas S
oyjh
  • How frequent is this code executed? Or, what percentage of CPU is spent in `gettimeofday()`? – meaning-matters Jul 29 '15 at 09:08
  • What's the content of CommonUtil::GetStringDate() ? – Richard Dally Jul 29 '15 at 09:09
  • Why do you compare strings ? It's not smart and very time-consuming to format a date into a string when all you have to do is compare the number of seconds since the epoch modulo 60*60*24. But there's no way around gettimeofday, AFAICS. – Vincent Fourmond Jul 29 '15 at 09:15
  • @meaning-matters Very frequently, nearly thousands of times per second. – oyjh Jul 30 '15 at 01:18
  • @LeFlou Just use gettimeofday to get time and use snprintf to format it into string. – oyjh Jul 30 '15 at 01:19
  • @VincentFourmond Thanks for your suggestion, snprintf or ostringstream is time-consuming, I'll improve it. – oyjh Jul 30 '15 at 01:20
  • `gettimeofday`, at least on Linux, uses [vDSO](https://stackoverflow.com/questions/19938324/what-are-vdso-and-vsyscall), which makes it very fast, so it's unlikely to be the bottleneck. Profile, then ask! – el.pescado - нет войне Jul 12 '17 at 14:19

3 Answers

3

The gettimeofday system call (now obsolete in favor of clock_gettime) is among the cheapest system calls to execute. The last time I measured it, on an Intel i486, it took around 2 µs. The kernel-internal version is used to timestamp network packets and to update the timestamps in filesystem inodes on read, write, and chmod system calls, and the like. If you want to measure how much time you spend in the gettimeofday system call, make several pairs of calls (the more, the better), one immediately after the other, record the timestamp difference within each pair, and take the minimum of the samples as the result. That will be a good approximation to the ideal value.
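The measurement described above can be sketched as follows. This is a minimal illustration, not a rigorous benchmark: the function name `min_gettimeofday_gap_us` is hypothetical, and the sketch assumes a POSIX system where `gettimeofday` is available.

```cpp
#include <sys/time.h>  // gettimeofday, struct timeval (POSIX)
#include <cstdint>

// Hypothetical helper: returns the smallest observed gap, in microseconds,
// between two back-to-back gettimeofday() calls over `samples` pairs.
// Returns -1 if no valid sample was collected.
std::int64_t min_gettimeofday_gap_us(int samples) {
    std::int64_t best = -1;
    for (int i = 0; i < samples; ++i) {
        timeval a{}, b{};
        gettimeofday(&a, nullptr);
        gettimeofday(&b, nullptr);
        const std::int64_t delta =
            (static_cast<std::int64_t>(b.tv_sec) - a.tv_sec) * 1000000 +
            (static_cast<std::int64_t>(b.tv_usec) - a.tv_usec);
        if (delta < 0) continue;  // wall clock was stepped backwards; discard
        if (best < 0 || delta < best) best = delta;
    }
    return best;
}
```

Calling `min_gettimeofday_gap_us(100000)` gives an estimate of the per-call cost. Because `gettimeofday` reports whole microseconds, a result of 0 only means the call costs less than the clock's resolution.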

Consider that if the kernel uses it to timestamp every read you make on a file, you can freely use it to timestamp every service request without a serious penalty.

One more thing: don't use a routine to convert the gettimeofday result to a string (as other responses suggest), as that consumes far more resources. You can compare the timestamps directly (call them t1 and t2):

gettimeofday(&t2, NULL);
if (t2.tv_sec - t1.tv_sec > 86400) {  /* 86400 is one day in seconds */
    erase_cache();
    t1 = t2;
} 

or, if you want it to occur every day at the same time:

gettimeofday(&t2, NULL);
if (t2.tv_sec / 86400 > t1.tv_sec / 86400) {
    /* tv_sec / 86400 is the number of whole days since 1/1/1970, so
     * if it varies, a change of date has occurred */
    erase_cache();
}
t1 = t2; /* now, we made it outside, so we tie to the change of date */

You can even use the time() system call for this, as it has one-second resolution (so you don't need to deal with the microseconds or with the overhead of struct timeval).
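A minimal sketch of the same day-number comparison built on `time()` values might look like this. The names `day_number` and `day_changed` are hypothetical. Note that dividing seconds since the epoch by 86400 yields UTC days, so the rollover happens at UTC midnight rather than local midnight.

```cpp
#include <ctime>

constexpr std::time_t kSecondsPerDay = 86400;

// Hypothetical helper: whole days elapsed since 1970-01-01 00:00:00 UTC.
inline long long day_number(std::time_t t) {
    return static_cast<long long>(t / kSecondsPerDay);
}

// Returns true when `now` falls on a later UTC day than `last_seen`.
// `last_seen` is always updated, mirroring the `t1 = t2` assignment above.
bool day_changed(std::time_t now, std::time_t& last_seen) {
    const bool changed = day_number(now) > day_number(last_seen);
    last_seen = now;
    return changed;
}
```

In the request handler this would be used as `if (day_changed(std::time(nullptr), last_seen)) erase_cache();`, with `last_seen` initialised once at startup.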

Luis Colorado
1

(This is an old question, but there is an important answer missing:)

You need to define the TZ environment variable and export it to your program. If it is not set, you will incur a stat(2) call of /etc/localtime... for every single call to gettimeofday(2), localtime(3), etc.

Of course these will get answered without going to disk, but the frequency of the calls and the overhead of the syscall is enough to make an appreciable difference in some situations.
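If exporting TZ from the shell or service file is not an option, a process can also set it for itself at startup. The sketch below assumes a POSIX system. The function name `ensure_tz_is_set` is hypothetical, and the fallback value `:/etc/localtime` is an assumption about what suits a typical Linux host, so adjust it to your environment.

```cpp
#include <stdlib.h>  // setenv (POSIX)
#include <time.h>    // tzset (POSIX)

// Hypothetical helper: defines TZ only when it is not already set, then
// re-reads the time zone configuration once. Returns false if setenv fails.
bool ensure_tz_is_set(const char* fallback = ":/etc/localtime") {
    // The third argument 0 means "do not overwrite an existing value".
    if (setenv("TZ", fallback, 0) != 0) return false;
    tzset();
    return true;
}
```

Call it once early in `main`, before any threads start, since modifying the environment is not thread-safe.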

Supporting documentation:

How to avoid excessive stat(/etc/localtime) calls in strftime() on linux?

https://blog.packagecloud.io/eng/2017/02/21/set-environment-variable-save-thousands-of-system-calls/

Law29
  • While I see impending trouble with any call to get a localised time value, how does this apply to `time(2)`? – greybeard Jul 13 '17 at 02:54
  • @greybeard You are correct, `time(2)` does not check TZ / `/etc/localtime`. Corrected to `localtime(3)` – Law29 Jul 14 '17 at 05:45
0

To summarise:

  1. The check, as you say, is done up to a few thousand times per second.
  2. You're flushing a cache once every day.

Assuming that the exact time at which you flush is not critical and can be seconds (or even minutes perhaps) late, there is a very simple/practical solution:

void report_visits(...)
{
    static unsigned int counter;

    if ((counter++ % 1000) == 0)
    {
        std::string date = CommonUtil::GetStringDate();
        if (date != static_cached_date_)
        {
            flush_db_date();
            static_cached_date_ = date;
        }
    }
}

Just do the check once every N calls to report_visits(). In the above example N is 1000. With up to a few thousand calls per second, you'll be less than a second (about 0.001% of a day) late. If the call rate drops, the delay grows accordingly.

Don't worry about counter wrap-around: it happens only once every 20+ days (assuming a few thousand checks/s maximum, with a 32-bit int) and does no harm.
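As a rough sketch, the counter can be combined with an integer day comparison so that no string formatting happens even on the sampled calls. The class `ThrottledDayCheck` and its members are hypothetical names. The clock is passed in as a callable so that it is only invoked on every N-th call, and the day boundary here is UTC midnight.

```cpp
#include <ctime>

// Hypothetical helper: consults the clock only on every `every_n`-th call
// and reports whether the (UTC) day number has changed since the last check.
class ThrottledDayCheck {
public:
    ThrottledDayCheck(unsigned every_n, std::time_t start)
        : every_n_(every_n ? every_n : 1), last_day_(start / 86400) {}

    // `clock` is any callable returning std::time_t, e.g. a lambda
    // wrapping std::time(nullptr).
    template <typename Clock>
    bool should_flush(Clock&& clock) {
        if (counter_++ % every_n_ != 0) return false;  // skip the clock on most calls
        const std::time_t day = clock() / 86400;
        if (day == last_day_) return false;
        last_day_ = day;
        return true;
    }

private:
    unsigned every_n_;
    unsigned counter_ = 0;
    std::time_t last_day_;
};
```

Inside report_visits() this could be used as `if (check.should_flush([] { return std::time(nullptr); })) flush_db_date();`, where `check` is a long-lived instance such as `static ThrottledDayCheck check(1000, std::time(nullptr));`.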

meaning-matters
  • Converting a timestamp from `gettimeofday` to a string consumes far more resources than the call itself. Don't do that. There are other, more efficient ways to check whether the date has changed (like the one pointed out in my response). – Luis Colorado Jul 31 '15 at 12:35
  • I'm offering a different angle to solve this. And my `(counter++ % 1000) == 0` is more efficient than whatever you do. – meaning-matters Jul 31 '15 at 13:42
  • Yes, even in that case, probably it is.... One single use of GetStringDate can trigger reading locale files to properly format the string, forcing disk accesses that can delay for more than 1000 executions of the `gettimeofday` system call. – Luis Colorado Jul 31 '15 at 13:44
  • @LuisColorado Nonsense, a function call plus execution of its code surely takes many more CPU cycles than my inline `counter` code! Will this disk access happen every few seconds or more? If not, my solution is the fastest, which could then be further improved by using your code. There's nothing wrong with my idea/angle! – meaning-matters Jul 31 '15 at 13:57