An answer for those strongly interested in higher performance. Ideally this bit twiddling would be implemented by widely used libraries (e.g., C/C++/Java/Python standard libraries) and we would simply call higher level APIs. I hope this post inspires library writers to implement the code below which I came up with:
unsigned char last_day_of_month(int year, unsigned char month) {
return month != 2 ? ((month ^ (month >> 3))) | 30 :
is_leap_year(year) ? 29 : 28;
}
As everybody knows February is special and we need to check whether year
is leap or not. I assume we already have a very efficient implementation of is_leap_year
(this is another surprisingly whole story). For month == 2
the code should be clear.
Let's focus on the cryptic part ((month ^ (month >> 3))) | 30
which is evaluated when month != 2
. In this case, the result is either 30
or 31
, or in another words, b + 30
, where b
is either 0
or 1
. We further split the reasoning into two sub-cases depending on whether month < 8
or not:
- If
month
is in {1, 3, 4, 5, 6, 7};
Months with 31
days (b == 1
) in this set are 1
(January), 3
(March), 5
(May) and 7
(June), that is, odd numbers. Conversely, even months have 30
days (b == 0
).
- If
month
is in {8, 9, 10, 11, 12};
Now, it's the other way around: months in this set with 31
days (b == 1
) are 8
(August), 10
(October) and 12
(December) thus even, whereas odd months have 30
days (b == 0
).
We conclude that for any month
(but 2
), its length depends on two things: its parity and whether it's >= 8
or not. These properties are determined by two bits of month
: the 0-th for parity and the 3-rd for >= 8
. We just need to check whether those two bits are different (and get b == 1
) or not (and get b == 0
). This can be achieved through a XOR (^
) between these bits, that is, b = ((month >> 0) & 1) ^ ((month >> 3) & 1)
which simplifies to b = (month ^ (month >> 3)) & 1
. Since 30
is even we have b + 30 = b | 30
. (Performance-wise choosing between +
or |
doesn't seem to matter.)
The expression ((month ^ (month >> 3)) & 1) | 30
can be further simplified. (Thanks to Dr. Matthias Kretz for giving me the following suggestion.) Indeed, 30
in binary is 0b11110
and, assuming month <= 12
, for x = (month ^ (month >> 3))
, the only bit of x | 30
that depends on x
is the rightmost so that (x & 1) | 30 == x | 30
meaning that & 1
can be dropped. This yields the implementation above.
I benchmarked the above against an implementation which uses a look-up array. Mine seems to be ~10% faster. This doesn't seem much but notice that my implementation has no memory footprint (everything should be done in registers) and therefore, does not compete for L1 cache lines with other parts of the program (reducing cache thrashing). This is probably not reflected in "benchmark lab conditions" where the only thing the tight loop does is calculating the last day of the month. Production conditions are likely to be different and the performance boost to be greater.
Note:
As some have mentioned there are some historical quirks that make years prior to 1582 meaningless anywhere in the world. Worse, different countries adopted the Gregorian calendar at different times so they were out of sync for centuries. Therefore, from the historical perspective, most computer implementations (including the above) are wrong. However, pragmatically, most of them simply extrapolate the Gregorian calendar in its current form as if it has always existed which yields the so called proleptic Gregorian calendar. I emphasize: the proleptic Gregorian calendar isn't historically accurate but it serves perfectly well most applications.