I'm working with Drupal 7, PHP, and MySQL, and trying to get my head around timezone issues when querying from the MySQL database.
Drupal handles dates pretty well out of the gate: all dates are stored as UTC timestamp int's in the db, and timezone conversion happens on a per-user basis via a timezone setting in each user profile, using PHP 5's built-in timezone capabilities (so each time a PHP script is run the script's timezone is set to the current user's timezone).
So this is all fine and dandy and fairly painless as long as we stick to PHP.
Things start to get tricky when we bring in MySQL, since there doesn't appear to be a way to perfectly synchronize the current PHP script timezone with a given MySQL query. It seems like best practice dictates handling ALL timezone conversion in PHP: only ever querying the database for the raw timestamps, and then converting in PHP as necessary.
This seems reasonable in most cases (even if a bit slower at times), but what am I supposed to do with MySQL GROUP BY [date] queries? For instance, I'm building a module to handle analytics, and frequently want to do things like:
GROUP BY YEAR(FROM_UNIXTIME(u.created)), MONTH(FROM_UNIXTIME(u.created))
So we run into the timezone issue...
Possible solutions that have come to mind:
Hard-code a timezone: Use date_default_timezone_set() within my module to insure that the PHP timezone is always set to the system timezone (so MySQL timezone = PHP timezone). In other words, the analytics timezone will be hard-coded, rather than respecting the timezone of the user viewing the analytics. This really isn't ideal, since we want users in multiple timezones to be able to access the analytics using their timezones. Also, date_default_timezone_set() seems to mess Drupal up, as it sets the timezone for the entire script, instead of just within a particular function...
Forget about using GROUP BY in the queries: just fetch all the raw data from the db (be there tens or hundreds of thousands of rows), then group the results by date in php using for loops... this solution seems like it would be significantly more resource intensive, slower, and somewhat ridiculous.
So I guess what I'm asking is, have I missed something? Is there a best practice here that I'm not aware of?
Thanks a ton for any help!