5

We have a system that serves many distributions centers. A distribution center is a physical space anywhere in the country. One client may have more than one distribution center. Now that we are expanding to more places, we will have to face the problem of different timezones. The same client may have, also, centers in different timezones.

A lot of events can be created and saved (with date and hour) using our system in the client centers. The ideal behavior for different timezones in the same client is the following:

Considering an event that happens in timezone A, at noon. If a supervisor of another distribution center in timezone B goes check this event, he should see the date and hour of that event respecting the timezone where the event was originally created (including DST changes, if existing). This is because what matters to know is if the event was created at noon in the event timezone. For the supervisor does not matter if when the event was created it was 2PM in his timezone.

We use PostgreSQL as our database and I see that exists two different types to save timestamps. TIMESTAMP and TIMESTAMPTZ. All of our database uses only the type TIMESTAMP.

Another scenario that might (or not) happen is the case where a distribution center changes geographically. This may impact in a change of its timezone.

I have done some research and found that the more “correct” approach (at least is what it seems) is to save the timezone of every center in the DISTRIBUTION_CENTER table. Change all types from TIMESTAMP to TIMESTAMPTZ in our database and in every insert of an event that saves the timestamp, we should use the timezone of the center where the event was created to save the offset of the timezone of the event (the TIMESTAMPTZ only saves the offset of the timezone, not the timezone itself).

I’m not fully convinced that this is the right (or best) approach to deal with different timezones. As I never implemented nothing like this, I can’t say.

If I have to follow this approach, I will have to change all column types from TIMESTAMP to TIMESTAMPTZ in our database. All views that depends on that columns in some way in also have to be recreated, because we will be changing the column type. I will also have to change all queries that deals with that columns to apply the center timezone using AT TIMEZONE.

The database has now the America/Sao_Paulo timezone setted up and my fear is ended up doing something wrong when changing the column type for TIMESTAMPTZ. Does this change can potentially break with data consistency? Should I first change the column type or change the database timezone for UTC?

The solution that I described is the best way of doing this?

Does that approach also deals correctly with the DST changes?

Extra information: our server uses java (jersey).

Luiz
  • 51
  • 1
  • 2

2 Answers2

13

These issues have been covered many times on Stack Overflow & the sister site https://dba.stackexchange.com/. Please, next time, search thoroughly before posting. So, just a recap here.

First you must understand the difference between offset-from-UTC and time zone. An offset is merely a number of hours, minutes, and seconds displacement from UTC. A time zone is the history of past, present, and future changes in the offset used by the people of a particular region. So using a time zone is always preferable to a mere offset.

exists two different types to save timestamps. TIMESTAMP and TIMESTAMPTZ

Not precisely. The actual types as defined by the SQL standard are TIMESTAMP WITH TIME ZONE and TIMESTAMP WITHOUT TIME ZONE. The other names are Postgres-specific synonyms. I suggest sticking with the standard names for clarity. Date-time handling is confusing enough without the ambiguity of reading/remembering the z on the end.

The SQL spec barely touches on the topic of date-time handling. So behavior varies widely between database implementations.

The way Postgres works is really quite simple.

  • For a column of type TIMESTAMP WITH TIME ZONE, any input passed with an offset-from-UTC or a time zone is automatically adjusted into UTC. After adjusting, the original value’s offset/zone info is discarded. The “with time zone” really means “with respect for the incoming data’s time zone” rather than “stored with time zone”. If you must know the original offset/zone, you must store that yourself in a separate column. I suggest doing so as text using the ISO 8601 standard formats for offset, or the proper name of timezone. If an input lacks any indicator of zone/offset, the session’s current default time zone is applied and then the adjustment is made to UTC — as I vaguely recall; you should never pass an input lacking zone/offset!
  • For a column of TIMESTAMP WITHOUT TIME ZONE, any zone/offset info with an input (if any) is ignored. Similarly, when retrieved this value has no zone/offset. This type has no concept of zone/offset. Do not use this type when your intention is to store a moment, a point on the timeline. This type is for a vague idea about potential moments along a range of about 26-27 hours, such as “Christmas Day starts after midnight on December 25, 2018”. Such a sentence has no real meaning until you append "in Japan", "in India", or "in France" (thereby creating a value in the other type, TIMESTAMP WITH TIME ZONE). This type is also used for future appointments more than several weeks out, when politicians may potentially change their region’s offset (which they surprisingly do often and with little forewarning).

BEWARE: Confusingly, some tools or drivers may apply the session’s current default time zone to values of either type. This includes pgAdmin. A terrible anti-feature in my opinion. Well-intentioned, but such a tool/driver sitting between you and Postgres should not be injecting its “opinion” about the data in transit. Doing so creates the illusion that the data retrieved carries that zone inside the database when precisely the opposite is true (actually carries either either UTC or no zone/offset). If your tool makes such an adjustment, it is likely controlled by a zone/offset setting in your Postgres session, as discussed here.

Best practice in date-time handling is to think, work, store, log, and exchange data using UTC. Think of other zones as mere variations on that theme. Adjust into time zones only when required by business logic or for presentation to users. Forget all about your own parochial time zone. Get a second clock on your desk set to UTC – seriously.

The database has now the America/Sao_Paulo timezone setted up

The default time zone of your server OS should be irrelevant. Never depend on such a default as a programmer as it is well out of your control, and is so easily changed.

In Java, the JVM has is own current default time zone separate from the host OS. The JVM’s current default may be changed during runtime at any moment by any code in any thread of any app within that JVM. So never depend on the current default. Always specify your desired/expected time zone explicitly by passing the optional argument.

If a supervisor of another distribution center in timezone B goes check this event, he should see

As discussed above, on the database side you should be working in UTC. Adjusting into a time zone expected by the user is a user-interface task, not a database task. Just like with internationalization, where you would store some kind of key-lookup value in the database, to be localized to some human language on the user-interface side.

the TIMESTAMPTZ only saves the offset of the timezone, not the timezone itself

No, incorrect. As discussed above, the TIMESTAMP WITH TIME ZONE type discards the offset/zone info after adjusting into UTC for storage. No offset, no zone, just a UTC moment is stored in the column — basically, a count of microseconds since an epoch reference.

Change all types from TIMESTAMP to TIMESTAMPTZ in our database and in every insert of an event that saves the timestamp, we should use the timezone of the center where the event was created to save the offset of the timezone of the event

If you are saying that you already have recorded date-time values from various time zones into a column of TIMESTAMP WITHOUT TIME ZONE, then you have an awful mess. You cannot reliably clean it up, not with full certainty of accuracy, as you do not really know what zone/offset was originally intended for the inputs passed to the database. You can guess the original intent of the stored data, but you can never be sure.

Explain to your boss & stakeholders that this is not a mess of your making. Explain that whoever set up this database & app did the equivalent of storing money amounts in various currencies such as Yen, Canadian dollars, British pounds, and Euros without bothering to record which currency on each amount.

If you want to guess, you would need to know the name of time zones that were likely used.

In Java, use only the java.time classes built into Java 8 and later. The older date-time classes are a bloody awful mess, now legacy, supplanted by java.time as defined in JSR 310.

Identify your possible zones.

ZoneId zoneSaoPaulo = ZoneId.of( "America/Sao_Paulo" ) ;
ZoneId zoneLisbon = ZoneId.of( "Europe/Lisbon" ) ;
ZoneId zoneKolkata = ZoneId.of( "Asia/Kolkata" ) ;

Extract the date-time value as a LocalDateTime, the Java class for a date-time value lacking any concept of zone/offset. With JDBC 4.2 and later, you may directly exchange java.time objects with the database.

LocalDateTime ldt = myResultSet.getObject( … , LocalDateTime.class ) ;

Perhaps an enum would be appropriate way to represent your distribution centers. This assumes the list need not change during runtime.

public enum DistributionCenter {
    // List the constants to be constructed automatically when this class loads.
    SAOPAULO( ZoneId.of( "America/Sao_Paulo" ) ) ,
    LISBON( ZoneId.of( "Europe/Lisbon" ) ) ,
    KOLKATA( ZoneId.of( "Asia/Kolkata" ) ) 

    final public ZoneId zoneId ;  // Make public, or add a getter method to access private member.

    // Add constructor taking the passed `ZoneId` and storing in the variable.
}

Apply the zone, to generate a ZonedDateTime object. Now we have an actual moment, a specific point on the timeline.

DistributionCenter dc = … ;
ZonedDateTime zdt = ldt.atZone( dc.zoneId ) ;

Adjust that value into a UTC value. Same moment, same point on the timeline, different wall-clock time. Do not proceed with your project until you understand that concept clearly.

The Instant class represents a moment on the timeline in UTC with a resolution of nanoseconds (up to nine (9) digits of a decimal fraction).

Instant instant = zdt.toInstant() ;

You should be able to pass your ZonedDateTime object to your JDBC driver for adjustment into UTC. I just want to drive home the point that we are ending up with a UTC value in Postgres storage. Plus, I do convert to Instant myself for easy debugging – remember: UTC is The One True Time.

Now that we have determined an actual moment, we can store it in the database.

myPreparedStatement.setObject( … , instant ) ;

Note how none of this code depends on the current default time zone of your server host OS, your Postgres cluster, or your JVM.

I will have to change all column types from TIMESTAMP to TIMESTAMPTZ in our database

Yes. Data recording an actual moment, a piece of history, should never have been stored in TIMESTAMP WITHOUT TIME ZONE. Some naïve programmers/DBAs hope that using this data type may somehow exempt them from dealing with time zone issues. But actually this is a “pay now, or pay later” situation. Unfortunately, you are the one stuck paying for their poor choice.

You likely could do this same kind of work within a Postgres procedure. Postgres does have much better support for date-time work than most databases. However, nothing beats the java.time classes for date-time handling. And, personally I would rather debug and practice this particular chore within Java.

distribution center changes geographically

That is confusing and unwise. The business really should identify the new location as a new center, not the same. If you cannot convince management to do so, I would do so within your database and apps behind-the-scenes.


About java.time

The java.time framework is built into Java 8 and later. These classes supplant the troublesome old legacy date-time classes such as java.util.Date, Calendar, & SimpleDateFormat.

To learn more, see the Oracle Tutorial. And search Stack Overflow for many examples and explanations. Specification is JSR 310.

The Joda-Time project, now in maintenance mode, advises migration to the java.time classes.

You may exchange java.time objects directly with your database. Use a JDBC driver compliant with JDBC 4.2 or later. No need for strings, no need for java.sql.* classes. Hibernate 5 & JPA 2.2 support java.time.

Where to obtain the java.time classes?

Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
  • Impressive writeup. The `TimeZone` configuration parameter is only *initialized* from the server time zone, later on the server time zone plays no role. `TimeZone` is not the database time zone, it is the default value for the session time zone. PostgreSQL has no concept of a database time zone. – Laurenz Albe Feb 24 '18 at 03:40
  • Thanks for your answer! Until now, we only have timestamps from one timezone saved in columns `TIMESTAMP WITHOUT TIMEZONE` (luckly). If a DC changes geographically and also its timezone, the future created events should take into account the new timezone, for the ones already stored: nothing changes. I’m not able to create a new DC if this happens because of the previous data. A supervisor must be able to see the data created before the DC migrate for the new place as well as the new data (every data respecting its own different timezone, supposing that the DC change also changes its tz). – Luiz Feb 24 '18 at 17:27
  • Unfortunately, because of a deadline I’m not able to change to the new java 8 API. Our servers will still stick with the old java.util.Date and java.sql.Date. Our website and Android App both receives timestamps using serialized java.util.Date objects without any information about TZ. In theory, the data should already be ready to display (only format required). As this is a requirement (do not trust in a client tz to apply changes), I’m not allowed to change. So, where and how should I apply TZ changes to DB timestamps (taking into account DST changes)? – Luiz Feb 24 '18 at 17:34
2

Read Basil's impressive answer to understand the concepts.

You should definitely switch to timestamp with time zone, which in reality is an absolute timestamp (in slight violation of the SQL standars's intention).

One thing you want to conaider is if the time zone of an event should change if the time zone of the center that recorded it changes. If not, you will either have to keep a history of time zones for a center or (better) store the time zone with each event as it is created.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • Thanks for your answer! As I comment in the Basil's answer, if the DC changes its timezone, for the already stored events nothing changes. But for the new ones, we should consider the new DC timezone when saving data. When consulting new and old data at the same time, the timezone of the event must be the timezone of the DC when the event was created. – Luiz Feb 24 '18 at 17:40
  • Then I'd store the time zone with the event. – Laurenz Albe Feb 24 '18 at 17:42
  • I see. So for every `timestamp with time zone` column in the DB I would also need a extra column to save the timezone. And when querying that data I would be able to use `AT TIME ZONE`, right!? – Luiz Feb 24 '18 at 18:08
  • Yes. Of course, if you can easily get the time zone from a related table, you shouldn't store it twice. – Laurenz Albe Feb 24 '18 at 18:12
  • Another thing that is bothering me: if an event is saved when DST is applied and then a supervisor look at this event when DST is off (supposing we query the data using `AT TIME ZONE`), he will see the wrong time? – Luiz Mar 02 '18 at 21:24
  • If you store the time zone along with the event, you will get the result in that time zone. That value won't change no matter when you watch it, unless there are historical changes in the time zone laws (that happens). – Laurenz Albe Mar 02 '18 at 21:31
  • Right, I get it. But if an event is saved in Brazil in December at 2 PM (when DST is on) and a supervisor opens that event in March (when DST is off) the time will show 1 PM? One hour earlier compared with the time when the event really happened in December? Sorry, but all that timezone thing will certainly drives me crazy. :s – Luiz Mar 02 '18 at 21:48
  • No, it will still display 2 p.m. even if you use `AT TIME ZONE 'America/Sao_Paulo'` in March. Because it *was* 2 p.m. at that time in that time zone. Don't worry. – Laurenz Albe Mar 02 '18 at 21:54