Why is gmtime implemented this way?

Question

I happened across the source for Minix's gmtime function. I was interested in the bit that calculated the year number from days since epoch. Here are the guts of that bit:

http://www.raspberryginger.com/jbailey/minix/html/gmtime_8c-source.html

http://www.raspberryginger.com/jbailey/minix/html/loc__time_8h-source.html

#define EPOCH_YR 1970
#define LEAPYEAR(year) (!((year) % 4) && (((year) % 100) || !((year) % 400)))
#define YEARSIZE(year) (LEAPYEAR(year) ? 366 : 365)

int year = EPOCH_YR;

while (dayno >= YEARSIZE(year)) {
    dayno -= YEARSIZE(year);
    year++;
}

It looks like the algorithm is O(n), where n is the distance from the epoch. Additionally, it seems that LEAPYEAR must be calculated separately for each year – dozens of times for current dates and many more for dates far in the future. I had the following algorithm for doing the same thing (in this case from the ISO-9601 epoch (Year 0 = 1 BC) rather than UNIX epoch):

#define CYCLE_1   365
#define CYCLE_4   (CYCLE_1   *  4 + 1)
#define CYCLE_100 (CYCLE_4   * 25 - 1)
#define CYCLE_400 (CYCLE_100 *  4 + 1)

year += 400 * (dayno / CYCLE_400)
dayno = dayno % CYCLE_400

year += 100 * (dayno / CYCLE_100)
dayno = dayno % CYCLE_100

year +=   4 * (dayno / CYCLE_4)
dayno = dayno % CYCLE_4

year +=   1 * (dayno / CYCLE_1)
dayno = dayno % CYCLE_1

This runs in O(1) for any date, and looks like it should be faster even for dates reasonably close to 1970.

So, assuming that the Minix developers are Smart People who did it their way for a Reason, and probably know a bit more about C than I do, why?

It *looks* like it should be faster. You need to think about your architecture and how fast certain instructions like multiply are and how good a branch predictor you have is (most are very good). @jim mcnamara has some interesting results. — BobbyShaftoe, Jun 28 '10 at 23:46

score 7 · Answer 1 · answered Jun 28 '10 at 22:00

Ran your code as y2 minix code as y1 Solaris 9 v245 & got this profiler data:

 %Time Seconds Cumsecs  #Calls   msec/call  Name
  79.1    0.34    0.34   36966      0.0092  _write
   7.0    0.03    0.37 1125566      0.0000  .rem
   7.0    0.03    0.40   36966      0.0008  _doprnt
   4.7    0.02    0.42 1817938      0.0000  _mcount
   2.3    0.01    0.43   36966      0.0003  y2
   0.0    0.00    0.43       4      0.      atexit
   0.0    0.00    0.43       1      0.      _exithandle
   0.0    0.00    0.43       1      0.      main
   0.0    0.00    0.43       1      0.      _fpsetsticky
   0.0    0.00    0.43       1      0.      _profil
   0.0    0.00    0.43   36966      0.0000  printf
   0.0    0.00    0.43  147864      0.0000  .div
   0.0    0.00    0.43   73932      0.0000  _ferror_unlocked
   0.0    0.00    0.43   36966      0.0000  memchr
   0.0    0.00    0.43       1      0.      _findbuf
   0.0    0.00    0.43       1      0.      _ioctl
   0.0    0.00    0.43       1      0.      _isatty
   0.0    0.00    0.43   73932      0.0000  _realbufend
   0.0    0.00    0.43   36966      0.0000  _xflsbuf
   0.0    0.00    0.43       1      0.      _setbufend
   0.0    0.00    0.43       1      0.      _setorientation
   0.0    0.00    0.43  137864      0.0000  _memcpy
   0.0    0.00    0.43       3      0.      ___errno
   0.0    0.00    0.43       1      0.      _fstat64
   0.0    0.00    0.43       1      0.      exit
   0.0    0.00    0.43   36966      0.0000  y1

Maybe that is an answer

it seems the looped implementation runs in 0secs where as the modulo implementation in 0.01 sec. — Lefteris E, Apr 11 '13 at 12:12
Definitely a case for doing benchmarking. I would have bet on the asker's code being faster before seeing these results. — ryyker, Sep 12 '13 at 00:50

score 2 · Accepted Answer · answered Jun 29 '10 at 06:07

2

This is pure speculation, but perhaps MINIX had requirements that were more important than execution speed, such as simplicity, ease of understanding, and conciseness? Some of the code was printed in a textbook, after all.

answered Jun 29 '10 at 06:07

bk1e

23,871
6
54
65

score 1 · Answer 3 · answered Jun 28 '10 at 21:51

1

Your method seems sound, but it's a little more difficult to get it to work for EPOCH_YR = 1970 because you are now mid-cycle on several cycles.

Can you see if you have an equivalent for that case and see whether it's still better?

You're certainly right that it's debatable whether that gmtime() implementation should be used in any high-performance code. That's a lot of busy work to be doing in any tight loops.

answered Jun 28 '10 at 21:51

Cade Roux

88,164
40
182
265

1

Just subtracting the number of days from years 0..1970 would work. This would be constant, could be stored as another #define. – Adam Shiemke Jun 28 '10 at 21:56
1

As Adam said, simply subtracting the offset would fix that, although it would overflow a 32-bit timestamp. Setting the epoch to 2000 AD would fix both problems, requiring one extra operation before and after the main computation. If 64-bit timestamps are available, ISO year zero is probably better as it a) saves an operation, b) is more obvious, and c) easy calendar-agnostic day-of-week. On the other hand, if you're stuck with a 32-bit timestamp with a modern-day epoch, you might as well forget the 100- and 400-year cycles, since 2000 is a leap year anyways and 1900 and 2100 are out of range. – Thom Smith Jun 28 '10 at 22:06
@Thom Smith The # of days in 2000 years (~730k) is well below 2^32. Neither code snip is concerned with seconds. – Adam Shiemke Jun 28 '10 at 22:12
I cut out the other bits; they're in the links. The gmtime function takes in a standard UNIX timestamp from which dayno is derived. – Thom Smith Jun 28 '10 at 22:23
@Thom Smith: You may argue that "ISO year zero is probably better", but existing practice on POSIX systems is to redefine `time_t` as a 64 bit type with the same epoch, as this allows old programs to continue to work. – caf Jun 29 '10 at 02:04
If you have 64-bit timestamps, then both epochs have virtually identical performance anyway, so it shouldn't affect the code in question. – Thom Smith Jun 30 '10 at 16:32

score 0 · Answer 4 · answered Sep 12 '13 at 00:14

0

Correct approach. You definitely want to go for an O(1) algo. Would work in Mayan calendar without ado. Check the last line: dayno is limited to 0..364, although in leap years it needs to range 0..365 . The line before has a similar flaw.

answered Sep 12 '13 at 00:14

frankencode

51
2

Why is gmtime implemented this way?

4 Answers4

Linked