My previous question was closed as a duplicate of "Which three-letter time zone IDs are not deprecated?", but I believe this is something completely different.
The other question is about the java.util.TimeZone class.
My question is about the java.text.SimpleDateFormat class.

As I already mentioned in the original question, a simple test revealed that these 2 classes do not support the same timezone abbreviations. For example:

  • TimeZone supports "CTT" but SimpleDateFormat does not
  • SimpleDateFormat supports "CEST" but TimeZone does not

Where (or how) can I find a full list of abbreviations that SimpleDateFormat is able to parse when using "z" in the parse string, and what each abbreviation is considered to mean?


Original question for reference:

[Context: I'm not a developer but need to document existing code]

I understand that there are no standards for timezone abbreviations and so CST can mean Central Standard Time, China Standard Time, Cuba Standard Time, ...

But if I have code like this:

        String time = "12:00:00.000 CST Tue Dec 17 2019";
        SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss.SSS zzz EEE MMM dd yyyy");
        Date utcTime = sdf.parse(time);
        System.out.println(utcTime.toGMTString());

Then the result is:

        17 Dec 2019 18:00:00 GMT

So this means that SimpleDateFormat.parse() interprets CST as Central Standard Time / UTC-6.

How/where can I get a full list of timezone abbreviations that SimpleDateFormat.parse() supports AND their meaning (i.e. just knowing that CST is supported is not enough, I need to know that this is interpreted as Central Standard Time, not Cuba Standard Time).


I would expect to find the answer on the Javadocs page for SimpleDateFormat but it only gives PST as example, not a full list.


The Javadocs page for TimeZone says :

For compatibility with JDK 1.1.x, some other three-letter time zone IDs (such as "PST", "CTT", "AST") are also supported.

But doesn't say which ones.
It also only mentions three-letter IDs, while the code above will also work fine with CEST for example (Central European Summer Time).

If I check the list returned by TimeZone.getAvailableIDs(); it does not contain CEST either, but it does contain items like "Cuba" and "Eire", which do not work in the above code snippet (even when I change zzz to zzzz in the parse string).

So I conclude that java.text.SimpleDateFormat does not support the same abbreviations as java.util.TimeZone.

Where can I find a list of abbreviations that SimpleDateFormat is able to parse?

Anderson Choi
hertitu
  • If you can, get rid of `SimpleDateFormat`. That class is notoriously troublesome and long outdated. Instead use `DateTimeFormatter` and other classes from [java.time, the modern Java date and time API](https://docs.oracle.com/javase/tutorial/datetime/). – Ole V.V. Dec 18 '19 at 05:10
  • You *really* need to let go of `SimpleDateFormat` and related classes (`Date`, `Calendar`, `TimeZone`). These legacy classes are *terrible*, built by people who did not understand the complexities and subtleties of date-time handling. They were replaced for good reasons with the adoption of [JSR 310](https://jcp.org/en/jsr/detail?id=310). Use only classes from the *java.time* packages. Sun, Oracle, and the JCP community all gave up on those legacy classes, and so should you. – Basil Bourque Dec 18 '19 at 08:35
  • @BasilBourque and Ole: thank you for the tips. FWIW I agree, and that is certainly something I will discuss with the development team for the next version of the product. But right now, I have to create a document for users of the current version to tell them which abbreviations will work. – hertitu Dec 18 '19 at 12:25
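
For readers who can move to java.time, here is a minimal sketch of how an ambiguous abbreviation can be steered towards one zone while parsing. It relies on `DateTimeFormatterBuilder.appendZoneText(TextStyle, Set<ZoneId>)`, which accepts a set of preferred zones. The class name `PreferredZoneParse` and the choice of `Locale.US` are assumptions made for illustration only; the input string is the one from the question.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.TextStyle;
import java.util.Collections;
import java.util.Locale;

public class PreferredZoneParse {

    /**
     * Parses text such as "12:00:00.000 CST Tue Dec 17 2019". When the
     * abbreviation is ambiguous, the preferred zone wins.
     */
    static ZonedDateTime parse(String text, ZoneId preferredZone) {
        DateTimeFormatter formatter = new DateTimeFormatterBuilder()
                .appendPattern("HH:mm:ss.SSS ")
                // Short zone name, with a hint about which zone to prefer.
                .appendZoneText(TextStyle.SHORT, Collections.singleton(preferredZone))
                .appendPattern(" EEE MMM dd yyyy")
                .toFormatter(Locale.US); // assumption: English abbreviations
        return ZonedDateTime.parse(text, formatter);
    }

    public static void main(String[] args) {
        String time = "12:00:00.000 CST Tue Dec 17 2019";
        ZonedDateTime parsed = parse(time, ZoneId.of("America/Chicago"));
        System.out.println(parsed + " = " + parsed.toInstant());
    }
}
```

Passing a different preferred zone that uses the same abbreviation should select that zone instead, but which abbreviations exist still depends on the locale data of the Java version in use.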

3 Answers

4

It sounds like a simple question, but the answer is complicated. The time zone abbreviations supported by Java (whether we are talking about the modern DateTimeFormatter or the troublesome and outdated SimpleDateFormat) are not defined by Java itself but by the locale data that Java uses. Complications include:

  • Java can be configured to take locale data from different sources. This means that setting the system property java.locale.providers when you run your program will change the set of abbreviations supported.
  • The default locale providers were changed from Java 8 to Java 9. So running your program on a different Java version will give you a different set of supported time zone abbreviations.
  • CLDR, the locale data that are the default from Java 9, come in versions. So every new Java version may come with a different set of supported abbreviations.
  • Time zone abbreviations come in languages. So a formatter with French locale will support other abbreviations than a formatter with German locale, for example.
  • Time zone abbreviations are ambiguous. So even if PST is a supported abbreviation, you don’t know whether it parses into Pitcairn Standard Time, Pacific Standard Time or Philippine Standard Time.
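
To see the locale and provider dependence on your own installation, a small sketch like the following may help. It formats the same instant in the same zone with three locales and prints whatever the installed locale data yields. The class name `AbbreviationByLocale` and the sample zone and date are assumptions chosen for illustration; no particular output is claimed, since it varies with the Java version and with `java.locale.providers`.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class AbbreviationByLocale {

    /** Formats the short zone name of zoneId at the given instant in the given locale. */
    static String abbreviation(String zoneId, Instant when, Locale locale) {
        return DateTimeFormatter.ofPattern("zzz", locale)
                .format(when.atZone(ZoneId.of(zoneId)));
    }

    public static void main(String[] args) {
        System.out.println("java.version: " + System.getProperty("java.version"));
        System.out.println("java.locale.providers: " + System.getProperty("java.locale.providers"));

        Instant summer = Instant.parse("2019-07-01T00:00:00Z");
        for (Locale locale : new Locale[] { Locale.ENGLISH, Locale.FRENCH, Locale.GERMAN }) {
            // The text may differ per locale and per locale data provider.
            System.out.println(locale + ": " + abbreviation("Europe/Paris", summer, locale));
        }
    }
}
```

Running it once normally and once with, for example, `-Djava.locale.providers=COMPAT` shows whether the provider setting changes the abbreviations on your Java version.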

One way to get an idea: Run a program formatting dates at different times of year into a time zone abbreviation. Since many time zones use summer time (DST) and use a different abbreviation during summer time, you will want to try to hit a date in the summer and a date in the standard time of year for those zones.

    System.out.println("java.locale.providers: " + System.getProperty("java.locale.providers"));
    DateTimeFormatter zoneAbbreviationFormatter = DateTimeFormatter.ofPattern("zzz", Locale.FRENCH);
    Instant northernSummer = Instant.parse("2019-07-01T00:00:00Z");
    Instant southernSummer = Instant.parse("2020-01-01T00:00:00Z");

    Set<String> supportedAbbreviations = new TreeSet<>();
    for (String zid : ZoneId.getAvailableZoneIds()) {
        ZoneId zone = ZoneId.of(zid);
        supportedAbbreviations.add(northernSummer.atZone(zone).format(zoneAbbreviationFormatter));
        supportedAbbreviations.add(southernSummer.atZone(zone).format(zoneAbbreviationFormatter));
    }

    System.out.println("" + supportedAbbreviations.size() + " abbreviations");
    System.out.println(supportedAbbreviations);

Output on my Java 9 was (scroll right to see it all):

java.locale.providers: null
205 abbreviations
[ACDT, ACST, ACT, ACWST, ADT, AEDT, AEST, AFT, AKDT, AKST, ALMT, AMST, AMT, AQTT, ART, AST, AWST, AZOST, AZOT, AZT, America/Punta_Arenas, Asia/Atyrau, Asia/Barnaul, Asia/Famagusta, Asia/Tomsk, BDT, BNT, BOT, BRST, BRT, BST, BTT, CAT, CCT, CDT, CEST, CET, CHADT, CHAST, CHOT, CHUT, CKT, CLST, CLT, COT, CST, CVT, CXT, ChST, DAVT, DDUT, EASST, EAST, EAT, ECT, EDT, EEST, EET, EGST, EGT, EST, Etc/GMT+1, Etc/GMT+10, Etc/GMT+11, Etc/GMT+12, Etc/GMT+2, Etc/GMT+3, Etc/GMT+4, Etc/GMT+5, Etc/GMT+6, Etc/GMT+7, Etc/GMT+8, Etc/GMT+9, Etc/GMT-1, Etc/GMT-10, Etc/GMT-11, Etc/GMT-12, Etc/GMT-13, Etc/GMT-14, Etc/GMT-2, Etc/GMT-3, Etc/GMT-4, Etc/GMT-5, Etc/GMT-6, Etc/GMT-7, Etc/GMT-8, Etc/GMT-9, Europe/Astrakhan, Europe/Kirov, Europe/Saratov, Europe/Ulyanovsk, FJST, FJT, FKT, FNT, GALT, GAMT, GET, GFT, GILT, GMT, GST, GYT, HADT, HAST, HDT, HKT, HOVT, HST, ICT, IDT, IOT, IRDT, IRKT, IRST, IST, JST, KGT, KOST, KRAT, KST, LHDT, LHST, LINT, MAGT, MART, MAWT, MDT, MEST, MET, MHT, MIST, MMT, MSK, MST, MUT, MVT, MYT, NCT, NDT, NFT, NOVT, NPT, NRT, NST, NUT, NZDT, NZST, OMST, PDT, PET, PETT, PGT, PHOT, PHT, PKT, PMDT, PMST, PONT, PST, PWT, PYST, PYT, RET, ROTT, SAKT, SAMT, SAST, SBT, SCT, SGT, SRET, SRT, SST, SYOT, TAHT, TFT, TJT, TKT, TLT, TMT, TOT, TVT, ULAT, UTC, UYT, UZT, VET, VLAT, VOST, VUT, WAKT, WEST, WET, WFT, WGST, WGT, WIB, WIT, WITA, WSDT, WSST, XJT, YAKT, YEKT]

Edit:

You may modify the code snippet to produce the time zones that each abbreviation may be parsed to too.

I would certainly expect that the formatter can parse the same time zone abbreviations that it can produce by formatting.
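
That expectation can be checked rather than assumed. The following is a rough sketch that formats every available zone at two sample instants and then feeds each resulting text back into the same formatter, collecting the texts that fail to parse. The class name `RoundTripCheck` and the sample instants are my own choices for illustration.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

public class RoundTripCheck {

    /** Returns the formatted zone texts that the same formatter cannot parse back. */
    static Set<String> unparsable(Locale locale) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("zzz", locale);
        Instant[] samples = {
            Instant.parse("2019-07-01T00:00:00Z"), // northern summer
            Instant.parse("2020-01-01T00:00:00Z")  // southern summer
        };
        Set<String> failures = new TreeSet<>();
        for (String zid : ZoneId.getAvailableZoneIds()) {
            for (Instant sample : samples) {
                String text = sample.atZone(ZoneId.of(zid)).format(formatter);
                try {
                    formatter.parse(text); // throws if the text is not accepted
                } catch (DateTimeParseException e) {
                    failures.add(text);
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Set<String> failures = unparsable(Locale.FRENCH);
        System.out.println(failures.size() + " formatted texts could not be parsed back: " + failures);
    }
}
```

Note that this only checks that parsing succeeds, not that the text parses back into the same zone it was formatted from.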

Supplement: Code for getting each abbreviation with the time zones that it may parse into:

    Set<String> zids = ZoneId.getAvailableZoneIds();
    // Abbreviation -> every zone ID that formats to it at either sample instant.
    // Adding to one shared set per abbreviation keeps the zones from both seasons;
    // a second putAll() would replace the first season's zones for the same key.
    Map<String, Set<String>> supportedAbbreviations = new TreeMap<>();
    for (Instant season : Arrays.asList(northernSummer, southernSummer)) {
        for (String zid : zids) {
            String abbreviation = season.atZone(ZoneId.of(zid)).format(zoneAbbreviationFormatter);
            supportedAbbreviations.computeIfAbsent(abbreviation, a -> new LinkedHashSet<>()).add(zid);
        }
    }

    System.out.println("" + supportedAbbreviations.size() + " abbreviations");
    supportedAbbreviations.forEach((a, zs) -> System.out.format("%-5s %s%n", a, zs));

Excerpt from the output (still on my Java 9):

205 abbreviations
ACDT  [Australia/Yancowinna, Australia/Adelaide, Australia/Broken_Hill, Australia/South]
ACST  [Australia/North, Australia/Darwin]
ACT   [America/Eirunepe, America/Porto_Acre, Brazil/Acre, America/Rio_Branco]
ACWST [Australia/Eucla]
ADT   [Canada/Atlantic, America/Grand_Turk, America/Moncton, Atlantic/Bermuda, America/Halifax, America/Glace_Bay, America/Thule, America/Goose_Bay, SystemV/AST4ADT]
AEDT  [Australia/Hobart, Australia/Tasmania, Australia/ACT, Australia/Victoria, Australia/Canberra, Australia/Currie, Australia/NSW, Australia/Sydney, Australia/Melbourne]
AEST  [Australia/Queensland, Australia/Brisbane, Australia/Lindeman]
…
WSDT  [Pacific/Apia]
WSST  [Pacific/Apia]
XJT   [Asia/Kashgar, Asia/Urumqi]
YAKT  [Asia/Chita, Asia/Yakutsk, Asia/Khandyga]
YEKT  [Asia/Yekaterinburg]
Ole V.V.
  • Indeed not the kind of answer I was hoping for, but nevertheless very useful, thank you for taking the time to explain in detail, I do appreciate that. Note that my question is about parsing, not formatting, but I guess I could try and write something similar for SimpleDateFormat.parse(). I suppose I could just use brute force and try parsing everything from AAA to ZZZ and AAAA to ZZZZ and list which ones don't return null. – hertitu Dec 18 '19 at 12:38
  • @hertitu I would certainly expect the formatter to parse the same time zone abbreviations that it can produce by formatting, and no others. But do go ahead and verify if you need to be sure. – Ole V.V. Dec 18 '19 at 16:48
  • `You may modify the code snippet to produce the time zones that each abbreviation may be parsed to too.` is easier said than done (for me at least). – hertitu Dec 18 '19 at 23:39
  • @hertitu I understand. I have added a code example as a supplement at the bottom of my answer. – Ole V.V. Dec 19 '19 at 19:14
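
The brute-force idea from the comments could be sketched roughly as below, limited to three-letter codes to keep the run time short. The class name `BruteForceAbbreviations`, the date pattern and the sample date are assumptions for illustration. One detail worth hedging: as far as I can tell from the JDK 8 source, a successful zone parse can change the formatter's own time zone, which would influence later parses, so the sketch resets the zone before every attempt.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;
import java.util.TreeMap;

public class BruteForceAbbreviations {

    /** Maps every parsable code AAA..ZZZ to the instant that "2019-12-17 12:00 <code>" parses to. */
    static Map<String, Instant> parsableThreeLetterCodes(Locale locale, TimeZone formatterZone) {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm zzz", locale);
        Map<String, Instant> result = new TreeMap<>();
        for (char a = 'A'; a <= 'Z'; a++) {
            for (char b = 'A'; b <= 'Z'; b++) {
                for (char c = 'A'; c <= 'Z'; c++) {
                    String text = "2019-12-17 12:00 " + a + b + c;
                    // Reset the zone so that earlier parses cannot influence this one.
                    sdf.setTimeZone(formatterZone);
                    ParsePosition position = new ParsePosition(0);
                    Date parsed = sdf.parse(text, position);
                    // Require the whole code to be consumed, not just a prefix of it.
                    if (parsed != null && position.getIndex() == text.length()) {
                        result.put(text.substring(text.length() - 3), parsed.toInstant());
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Instant> codes = parsableThreeLetterCodes(Locale.US, TimeZone.getDefault());
        System.out.println(codes.size() + " three-letter codes parsed");
        // 12:00 local wall time; the printed UTC instant reveals the offset that was applied.
        codes.forEach((code, instant) -> System.out.println(code + "\t" + instant));
    }
}
```

The same loop with a fourth letter covers codes such as CEST, at the cost of a much longer run. The result is only valid for the locale, default time zone and Java version it was run on.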
1

Reading the source code of the SimpleDateFormat class shows that the main method for parsing the zone is subParseZoneString, shown below (the source comes from jdk1.8.0_151, with some parts omitted):

/**
 * find time zone 'text' matched zoneStrings and set to internal
 * calendar.
 */
private int subParseZoneString(String text, int start, CalendarBuilder calb) {
    boolean useSameName = false; // true if standard and daylight time use the same abbreviation.
    TimeZone currentTimeZone = getTimeZone();

    // At this point, check for named time zones by looking through
    // the locale data from the TimeZoneNames strings.
    // Want to be able to parse both short and long forms.
    int zoneIndex = formatData.getZoneIndex(currentTimeZone.getID());
    TimeZone tz = null;
    String[][] zoneStrings = formatData.getZoneStringsWrapper();
    String[] zoneNames = null;
    int nameIndex = 0;
    if (zoneIndex != -1) {
        zoneNames = zoneStrings[zoneIndex];
        if ((nameIndex = matchZoneString(text, start, zoneNames)) > 0) {
            if (nameIndex <= 2) {
                // Check if the standard name (abbr) and the daylight name are the same.
                useSameName = zoneNames[nameIndex].equalsIgnoreCase(zoneNames[nameIndex + 2]);
            }
            tz = TimeZone.getTimeZone(zoneNames[0]);
        }
    }
    ...
    // If fail, try zoneIndex by TimeZone.getDefault().getId(), omitted
    ...
    // If fail, try all elements in zoneStrings, omitted
    ...
    // Handle Daylight Saving, omitted
    ...
}  

From the code, we see that zoneStrings is the main source for matching against the input text. Each element of zoneStrings is a string array describing one time zone. The matching priority is:

  1. The zoneString of the SimpleDateFormat time zone.
  2. The zoneString of the default time zone (TimeZone.getDefault()).
  3. The zoneString in the array order.
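
A small experiment can make this priority visible: parse the same text with the formatter set to different time zones and compare the resulting instants. This is only a sketch; the class name `CstByFormatterZone` and the use of `Locale.US` are assumptions, and what the second and third zone produce depends on the locale data of your Java version.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Locale;
import java.util.TimeZone;

public class CstByFormatterZone {

    /** Parses the text with a formatter whose own time zone is set to zoneId. */
    static Instant parseWithFormatterZone(String text, String zoneId) {
        SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss.SSS zzz EEE MMM dd yyyy", Locale.US);
        // The formatter's zone is tried first when matching the abbreviation.
        sdf.setTimeZone(TimeZone.getTimeZone(zoneId));
        try {
            return sdf.parse(text).toInstant();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        String time = "12:00:00.000 CST Tue Dec 17 2019";
        for (String zoneId : new String[] { "America/Chicago", "Asia/Shanghai", "America/Havana" }) {
            System.out.println(zoneId + " -> " + parseWithFormatterZone(time, zoneId));
        }
    }
}
```

If the printed instants differ between the zones, the abbreviation was resolved through the formatter's own zone rather than through the array order.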

Tracing the method call into the DateFormatSymbols class:

/**
 * Wrapper method to the getZoneStrings(), which is called from inside
 * the java.text package and not to mutate the returned arrays, so that
 * it does not need to create a defensive copy.
 */
final String[][] getZoneStringsWrapper() {
    if (isSubclassObject()) {
        return getZoneStrings();
    } else {
        return getZoneStringsImpl(false);
    }
}

private String[][] getZoneStringsImpl(boolean needsCopy) {
    if (zoneStrings == null) {
        zoneStrings = TimeZoneNameUtility.getZoneStrings(locale);
    }
    ...
    // do array copying, omitted   
    ...
}  

The zoneStrings array comes from a method of the TimeZoneNameUtility class, which is an internal API. Debugging shows that each element of zoneStrings is a String array of length 7:

  • Index 0: Zone ID (TimeZone.getID())
  • Index 1: Zone long name
  • Index 2: Zone abbreviation
  • Index 3: Zone long name (daylight saving)
  • Index 4: Zone abbreviation (daylight saving)
  • Index 5, 6: Don't know
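
The first five entries can be inspected through the public `DateFormatSymbols.getZoneStrings()` method, without touching internal classes. A minimal sketch, where the class name `ZoneStringsRow` and the sample zone are chosen for illustration:

```java
import java.text.DateFormatSymbols;
import java.util.Arrays;
import java.util.Locale;

public class ZoneStringsRow {

    /** Returns the zoneStrings row whose ID (index 0) equals zoneId, or null if there is none. */
    static String[] rowFor(String zoneId, Locale locale) {
        for (String[] row : new DateFormatSymbols(locale).getZoneStrings()) {
            if (row[0].equals(zoneId)) {
                return row;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Prints the ID followed by the long and short names for that zone.
        System.out.println(Arrays.toString(rowFor("America/Chicago", Locale.US)));
    }
}
```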

From the matchZoneString method of SimpleDateFormat class,

private int matchZoneString(String text, int start, String[] zoneNames) {
    for (int i = 1; i <= 4; ++i) {
        // Checking long and short zones [1 & 2],
        // and long and short daylight [3 & 4].
        String zoneName = zoneNames[i];
        if (text.regionMatches(true, start,
                               zoneName, 0, zoneName.length())) {
            return i;
        }
    }
    return -1;
}

The input text is only matched against indices 1 to 4 of each zoneStrings element. The zone ID (index 0) is not involved in the matching.
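
To illustrate what that loop accepts, here is a standalone copy of the matching logic run against a hand-written row. The row contents and the class name `MatchZoneStringDemo` are hypothetical sample data, not taken from the JDK's locale data.

```java
public class MatchZoneStringDemo {

    // Hypothetical row in the layout described above: ID, long, short, long DST, short DST.
    static final String[] SAMPLE_ROW = {
        "America/Chicago", "Central Standard Time", "CST", "Central Daylight Time", "CDT"
    };

    /** Same logic as SimpleDateFormat.matchZoneString: a case-insensitive prefix match on indices 1 to 4. */
    static int matchZoneString(String text, int start, String[] zoneNames) {
        for (int i = 1; i <= 4; ++i) {
            String zoneName = zoneNames[i];
            if (text.regionMatches(true, start, zoneName, 0, zoneName.length())) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(matchZoneString("cst Tue Dec 17", 0, SAMPLE_ROW));        // 2: case is ignored
        System.out.println(matchZoneString("Central Daylight Time", 0, SAMPLE_ROW)); // 3: long names match too
        System.out.println(matchZoneString("America/Chicago", 0, SAMPLE_ROW));       // -1: the ID is never compared
    }
}
```

Because the comparison is a prefix match, trailing characters after the abbreviation do not prevent a match at this stage.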


TL;DR

So now we can explain why

TimeZone supports "CTT" but SimpleDateFormat does not
SimpleDateFormat supports "CEST" but TimeZone does not

"CTT" is a Zone ID but not an abbreviation.
"CEST" is an abbreviation but not a Zone ID.

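The zone ID half of that explanation can be checked with the public TimeZone API. A minimal sketch follows; the class name `ZoneIdCheck` is an assumption for illustration. `TimeZone.getTimeZone` does not fail on an unknown ID but silently falls back to GMT, which is why the sketch also looks the ID up in `getAvailableIDs()`.

```java
import java.util.Arrays;
import java.util.TimeZone;

public class ZoneIdCheck {

    /** True if the string is one of the IDs that TimeZone knows about. */
    static boolean isTimeZoneId(String id) {
        return Arrays.asList(TimeZone.getAvailableIDs()).contains(id);
    }

    public static void main(String[] args) {
        for (String id : new String[] { "CTT", "CEST" }) {
            System.out.println(id + ": known ID = " + isTimeZoneId(id)
                    + ", getTimeZone() returns ID " + TimeZone.getTimeZone(id).getID());
        }
    }
}
```
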
How/where can I get a full list of timezone abbreviations that SimpleDateFormat.parse() supports AND their meaning (i.e. just knowing that CST is supported is not enough, I need to know that this is interpreted as Central Standard Time, not Cuba Standard Time).

We can try to retrieve all zone abbreviations that can be parsed by SimpleDateFormat and their corresponding Zone ID and Long Name, using the below program by following the matching priority of SimpleDateFormat.

import java.text.DateFormatSymbols;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TimeZone;

public class GetParsableZone {

    public static void main(String[] args) throws ParseException {
        Locale locale = Locale.getDefault(Locale.Category.FORMAT);
        System.out.println("Default Locale for getZoneStrings " + locale.toString());
        DateFormatSymbols dateFormatSymbols = new DateFormatSymbols(locale);
        String[][] zoneStrings = dateFormatSymbols.getZoneStrings();
        Set<String> possibleZoneAbbrs = new HashSet<String>();
        for (String[] zoneString : zoneStrings) {
            possibleZoneAbbrs.add(zoneString[2]);
            possibleZoneAbbrs.add(zoneString[4]);
        }
        System.out.println("Try to parse all possibleZoneAbbrs");
        SimpleDateFormat dateFormatWithZone = new SimpleDateFormat("dd/MM/yyyy zzz");
        String template = "12/12/2012 %s";
        for (String possibleZoneAbbr : possibleZoneAbbrs) {
            dateFormatWithZone.parse(String.format(template, possibleZoneAbbr));
        }
        System.out.println("All possibleZoneAbbrs can be parsed!");
        List<String> orderedPossibleZoneAbbrs = new ArrayList<String>(possibleZoneAbbrs);
        Collections.sort(orderedPossibleZoneAbbrs);
        String simpleDateFormatZoneId = new SimpleDateFormat().getTimeZone().getID();
        String defaultZoneId = TimeZone.getDefault().getID();
        Integer simpleDateFormatZoneIndex = getZoneIndex(zoneStrings, simpleDateFormatZoneId);
        Integer defaultZoneIndex = getZoneIndex(zoneStrings, defaultZoneId);
        System.out.println("Default SimpleDateFormat Time Zone " + new SimpleDateFormat().getTimeZone().getID());
        System.out.println("Default Time Zone " + TimeZone.getDefault().getID());
        System.out.println("Abbreviation\tZoneId\tLong Name");
        for (String orderedPossibleZoneAbbr : orderedPossibleZoneAbbrs) {
            // Do matching as SimpleDateFormat
            int matchIndex = getZoneIndexMatchAbbreviation(zoneStrings, simpleDateFormatZoneIndex, defaultZoneIndex,
                    orderedPossibleZoneAbbr);
            printIdAndFullName(zoneStrings[matchIndex], orderedPossibleZoneAbbr);
        }
    }

    public static void printIdAndFullName(String[] zoneString, String abbreviation) {
        String longName = "";
        String id = zoneString[0];
        if (zoneString[2].equals(abbreviation)) {
            longName = zoneString[1];
        } else {
            longName = zoneString[3];
        }
        System.out.println(String.format("%s\t%s\t%s", abbreviation, id, longName));
    }

    public static final int getZoneIndex(String[][] zoneStrings, String ID) {
        for (int index = 0; index < zoneStrings.length; index++) {
            if (ID.equals(zoneStrings[index][0])) {
                return index;
            }
        }
        return -1;
    }

    public static boolean isAbbreviationMatchZoneString(String[] zoneString, String abbreviation) {
        return zoneString[2].equals(abbreviation) || zoneString[4].equals(abbreviation);
    }

    public static int getZoneIndexMatchAbbreviation(String[][] zoneStrings, int simpleDateFormatZoneIndex,
            int defaultZoneIndex, String abbreviation) {
        // 1. The formatter's own zone (the index is -1 when that zone has no row in zoneStrings).
        if (simpleDateFormatZoneIndex != -1
                && isAbbreviationMatchZoneString(zoneStrings[simpleDateFormatZoneIndex], abbreviation)) {
            return simpleDateFormatZoneIndex;
        }
        // 2. The JVM default zone, with the same guard against a missing row.
        if (defaultZoneIndex != -1
                && isAbbreviationMatchZoneString(zoneStrings[defaultZoneIndex], abbreviation)) {
            return defaultZoneIndex;
        }
        // 3. Otherwise the first row, in array order, that carries the abbreviation.
        for (int i = 0; i < zoneStrings.length; i++) {
            if (isAbbreviationMatchZoneString(zoneStrings[i], abbreviation)) {
                return i;
            }
        }
        return -1;
    }
}

In my environment, the SimpleDateFormat zone and TimeZone.getDefault() are both Asia/Shanghai, so if the input is CST, it is interpreted as China Standard Time rather than Central Standard Time or Cuba Standard Time.

samabcde
  • Ok this sounds promising, but... when I try to run it I get `error: package sun.util.locale.provider is not visible`. I did mention that I'm not a developer so I might be doing something stupid. Or maybe it's because I'm using OpenJDK? – hertitu Dec 18 '19 at 16:49
  • # java --version Picked up JAVA_TOOL_OPTIONS: -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:CICompilerCount=4 -Xmx512M -Dio.netty.eventLoopThreads=4 openjdk 11.0.2 2019-01-15 LTS OpenJDK Runtime Environment Zulu11.29+10-SA (build 11.0.2+7-LTS) OpenJDK 64-Bit Server VM Zulu11.29+10-SA (build 11.0.2+7-LTS, mixed mode) – hertitu Dec 18 '19 at 16:51
  • Updated the program, check if you can run. – samabcde Dec 18 '19 at 17:07
  • Thank you! This runs and produces a list of abbreviations - so this answers the first part of the question ("How/where can I get a full list of timezone abbreviations that SimpleDateFormat.parse() supports AND their meaning (i.e. just knowing that CST is supported is not enough, I need to know that this is interpreted as Central Standard Time, not Cuba Standard Time)). Any suggestion for the second part? – hertitu Dec 18 '19 at 23:27
  • Updated for second part. – samabcde Dec 19 '19 at 08:29
0

It can be done through zoneId.getDisplayName(TextStyle.SHORT, Locale.ENGLISH), where zoneId is a java.time.ZoneId.
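
A minimal sketch of that call, with the class name `ShortZoneNames` and the sample zones chosen for illustration. Note that `getDisplayName` takes no date, so as far as I understand it cannot distinguish between the standard and the summer-time abbreviation of a zone, and the text it returns depends on the locale data of the Java version.

```java
import java.time.ZoneId;
import java.time.format.TextStyle;
import java.util.Locale;

public class ShortZoneNames {

    /** Returns the short English display name of the given zone ID. */
    static String shortName(String zoneId) {
        return ZoneId.of(zoneId).getDisplayName(TextStyle.SHORT, Locale.ENGLISH);
    }

    public static void main(String[] args) {
        for (String zoneId : new String[] { "America/Chicago", "Asia/Shanghai", "Europe/Paris" }) {
            System.out.println(zoneId + " -> " + shortName(zoneId));
        }
    }
}
```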