
I'm currently trying to write a suite of time zone validation programs to see whether various platforms interpret the IANA time zone data correctly.

The output format I'm targeting includes the abbreviation in effect for a particular time - such as "BST" for "British Summer Time", or "PST" for "Pacific Standard Time".

On most platforms, this is easy - but ICU4J seems not to be working, oddly. According to the SimpleDateFormat documentation I should be able to use a pattern of "zzz" to get what I'm looking for, but this seems to fall back to the "O" pattern (GMT+X) much of the time. For some time zones, there are no abbreviations at all.

Short example using New York:

import java.util.Date;
import java.util.Locale;
import com.ibm.icu.util.TimeZone;
import com.ibm.icu.text.SimpleDateFormat;

public class Test {
    public static void main(String[] args) {
        TimeZone zone = TimeZone.getTimeZone("America/New_York");
        SimpleDateFormat format = new SimpleDateFormat("zzz", Locale.US);
        format.setTimeZone(zone);

        // One month before the unix epoch
        System.out.println(format.format(new Date(-2678400000L))); // GMT-5

        // At the unix epoch
        System.out.println(format.format(new Date(0L))); // EST
    }
}

(I'm running using ICU4J 55.1, both the stock download and after updating it with the 2015e data release.)
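As a sanity check on the two timestamps used in the example above, this minimal sketch prints them as UTC instants using only the JDK's java.time (no ICU4J involved). The class and method names are made up for illustration.

```java
import java.time.Instant;

final class EpochMillisCheck {
    // Renders a millisecond offset from the Unix epoch as an ISO-8601 UTC instant.
    static String describe(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // -2678400000 ms is exactly 31 days before the epoch.
        System.out.println(describe(-2678400000L)); // 1969-12-01T00:00:00Z
        System.out.println(describe(0L));           // 1970-01-01T00:00:00Z
    }
}
```

So the first sample is midnight UTC on December 1st 1969, which is still standard time in New York, just like the epoch itself.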

It's not clear to me whether ICU4J is getting its abbreviations from the tz data or from CLDR - I suspect it's the latter, given that there's nothing in the tz data to suggest a difference here.

It also seems to be affected by locale, which I suppose is reasonable - using the US locale I can see EST/EDT for America/New_York, but nothing for Europe/London; with the UK locale I see GMT/BST for Europe/London, but nothing for America/New_York :(

Is there a way to persuade ICU4J to fall back to tz abbreviations? In my very specific case, that's all I'm looking for.

Update

Thanks to RealSkeptic's comments, it looks like TimeZoneNames is a cleaner way of getting this data without formatting. It all sounds so promising - there's even TimeZoneNames.getTZDBInstance:

Returns an instance of TimeZoneNames containing only short specific zone names (TimeZoneNames.NameType.SHORT_STANDARD and TimeZoneNames.NameType.SHORT_DAYLIGHT), compatible with the IANA tz database's zone abbreviations (not localized).

That's pretty much exactly what I want - but that doesn't go earlier than 1970 either in most cases, nor does it include all the relevant data:

import static com.ibm.icu.text.TimeZoneNames.NameType.SHORT_STANDARD;

import com.ibm.icu.text.TimeZoneNames;
import com.ibm.icu.util.ULocale;

public class Test {
    public static void main(String[] args) {
        TimeZoneNames names = TimeZoneNames.getTZDBInstance(ULocale.ROOT);

        // One month before the Unix epoch...
        long december1969 = -2678400000L;
        // 24 hours into the Unix epoch...
        long january1970 = 86400000L;

        // null
        System.out.println(
            names.getDisplayName("America/New_York", SHORT_STANDARD, december1969));
        // EST
        System.out.println(
            names.getDisplayName("America/New_York", SHORT_STANDARD, january1970));

        // null
        System.out.println(
            names.getDisplayName("Europe/London", SHORT_STANDARD, december1969));
        // null
        System.out.println(
            names.getDisplayName("Europe/London", SHORT_STANDARD, january1970));
    }
}

Given that there's really very little indirection at this point - I'm telling ICU4J exactly what I want - my suspicion is that the information just isn't available :(

Jon Skeet
  • @RealSkeptic: Not sure what you mean - I'm specifying the time zone for the format, so that should be okay... what are you linking the calendar to, in your reading of the docs? – Jon Skeet Jul 25 '15 at 12:35
  • Sorry, I misinterpreted what you asked there. Have you tried using `format.setTimeZoneFormat(format.getTimeZoneFormat().setTimeZoneNames(TimeZoneNames.getTZDBInstance(ULocale.US)))`? – RealSkeptic Jul 25 '15 at 13:48
  • @RealSkeptic: Nope - will give that a go in a minute! – Jon Skeet Jul 25 '15 at 13:49
  • @RealSkeptic: Interesting - that gives *more* names, but still not all of them, and still only from 1970 onwards. More importantly, your comment revealed the `TimeZoneNames` type which makes the code cleaner. It doesn't help me before 1970, but it's still nicer... – Jon Skeet Jul 25 '15 at 14:02
  • I think you have a lost cause here. To get the display name, it first calls `getMetaZoneID(String,long)`. This, in turn, calls `com.ibm.icu.impl.TimeZoneNamesImpl._getMetaZoneID(String,long)` which in turn calls [this thing](http://grepcode.com/file/repo1.maven.org/maven2/com.ibm.icu/icu4j/54.1.1/com/ibm/icu/impl/TimeZoneNamesImpl.java?av=h#TimeZoneNamesImpl.TZ2MZsCache) where you can see January 1, 1970 hard-coded as the `from` point. – RealSkeptic Jul 25 '15 at 17:37
  • @RealSkeptic: Humbug. Thanks a lot for looking! – Jon Skeet Jul 25 '15 at 17:38
  • @RealSkeptic: It sounds like you've basically got the answer there - feel free to add it as such, and I'll accept it... with regret, obviously :( – Jon Skeet Jul 25 '15 at 19:25

1 Answer


Tracing through the sources shows how this works: to find the display name, ICU4J first derives the meta zone ID from the zone ID and the date, and then looks up the display name from the meta zone and the name type.

com.ibm.icu.impl.TZDBTimeZoneNames, which is the class returned from TimeZoneNames.getTZDBInstance(ULocale), implements getMetaZoneID(String,long) by calling com.ibm.icu.impl.TimeZoneNamesImpl._getMetaZoneID(String,long), which retrieves the mappings from the given time zone name to meta zone names, and then checks whether the date falls between the from and to values of any of those mappings.

The mapping is read by a nested class, like this:

for (int idx = 0; idx < zoneBundle.getSize(); idx++) {
    UResourceBundle mz = zoneBundle.get(idx);
    String mzid = mz.getString(0);
    String fromStr = "1970-01-01 00:00";
    String toStr = "9999-12-31 23:59";
    if (mz.getSize() == 3) {
        fromStr = mz.getString(1);
        toStr = mz.getString(2);
    }
    long from, to;
    from = parseDate(fromStr);
    to = parseDate(toStr);
    mzMaps.add(new MZMapEntry(mzid, from, to));
}

(source)

As you can see, the from and to values default to hard-coded constants. The loop does read them from the resource bundle itself when a meta zone entry has three items, but most entries don't (as can be seen in the actual meta zone file from which the bundle is built), and those that do have no 'from' dates before January 1970 either.

Thus, the meta zone ID will be null for any date before January 1970, and in turn, so will the display name.
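To make the consequence concrete, here is a minimal, self-contained sketch of that lookup using only the JDK. It is not ICU4J's code: the Entry class, the entry and metaZoneId helpers, and the half-open range check are assumptions modelled on the loop quoted above, and "America_Eastern" is used purely as an illustrative meta zone ID.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

final class MetaZoneLookupSketch {
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm");

    /** Hypothetical stand-in for ICU4J's internal MZMapEntry. */
    static final class Entry {
        final String mzid;
        final long from;
        final long to;

        Entry(String mzid, long from, long to) {
            this.mzid = mzid;
            this.from = from;
            this.to = to;
        }
    }

    /** Parses "yyyy-MM-dd HH:mm" as UTC and returns epoch milliseconds (assumed behaviour). */
    static long parseDate(String text) {
        return LocalDateTime.parse(text, FORMAT).toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    /** Builds an entry; missing bounds get the same hard-coded defaults as the quoted loop. */
    static Entry entry(String mzid, String fromStr, String toStr) {
        String from = fromStr != null ? fromStr : "1970-01-01 00:00";
        String to = toStr != null ? toStr : "9999-12-31 23:59";
        return new Entry(mzid, parseDate(from), parseDate(to));
    }

    /** Returns the first meta zone whose [from, to) range contains the date, or null. */
    static String metaZoneId(List<Entry> entries, long date) {
        for (Entry e : entries) {
            if (date >= e.from && date < e.to) {
                return e.mzid;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Entry> newYork = new ArrayList<>();
        // A one-item bundle entry: no explicit from/to, so the defaults apply.
        newYork.add(entry("America_Eastern", null, null));

        System.out.println(metaZoneId(newYork, -2678400000L)); // null
        System.out.println(metaZoneId(newYork, 86400000L));    // America_Eastern
    }
}
```

With the default lower bound at the epoch, no mapping can match a December 1969 timestamp, which mirrors the null results seen in the question.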

A.L
RealSkeptic