51

I have code and a test-case in a legacy application, which can be summarized as follows:

    @Test
    public void testParseDate() throws ParseException {
        String toParse = "Mo Aug 18 11:25:26 MESZ +0200 2014";
        String pattern = "EEE MMM dd HH:mm:ss z Z yyyy";

        DateFormat dateFormatter = new SimpleDateFormat(pattern, Locale.GERMANY);
        Date date = dateFormatter.parse(toParse);

        //skipped assumptions
    }

This test passes on Java 8 and below. From Java 10 upwards, however, it fails with a java.text.ParseException: Unparseable date: "Mo Aug 18 11:25:26 MESZ +0200 2014".

For the record: Besides de_DE, the exception is also thrown for the locales de_CH, de_AT, de_LU.

I am aware that date formatting changed with JDK 9 (JEP 252). However, I consider this a disruptive change that breaks backwards compatibility. Excerpted:

In JDK 9, the Unicode Consortium's Common Locale Data Repository (CLDR) data is enabled as the default locale data, so that you can use standard locale data without any further action.

In JDK 8, although CLDR locale data is bundled with the JRE, it isn’t enabled by default.

Code that uses locale-sensitive services such as date, time, and number formatting may produce different results with the CLDR locale data.

Adding a . after the day of the week (Mo.) compensates for this, and the test then passes. However, this is not a real solution for old data (in serialized form such as XML).

Judging by this Stack Overflow post, the behaviour seems to be intentional for the German locale and can be mitigated by setting java.locale.providers to COMPAT mode. However, I do not like the idea of relying on a system property value, for two reasons. It might:

  1. change in the next releases of the JDK.
  2. be forgotten in different environments.

My question is:

  • How can I maintain backwards compatibility of legacy code with this particular date pattern without rewriting or modifying existing serialized data, and without adding or changing system properties (like java.locale.providers) that may be forgotten in different environments (application servers, standalone jars, ...)?
rzo1
  • 1
    With an ugly workaround perhaps: intercept the call and check/modify the data before passing it on ? – Marged May 18 '18 at 12:29
  • Would be an option - but I do not like ugly workarounds. I prefer clean code ;) - consider the fact, that the write / read / parse process of the old data is done via 3rd party library, so we would have to modify this code. – rzo1 May 18 '18 at 12:34
  • 8
    You can set the system property from within Java: `System.setProperty("java.locale.providers", "COMPAT,CLDR");`. This will prevent it being forgotten in any environment. It still won’t guarantee anything for Java 11 and beyond, of course. You may want to consider a project that converts all your old date-time data to [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) (that appears to be fairly future-proof). – Ole V.V. May 18 '18 at 12:43
  • 2
    One could change EEE into EE, but that could turn ugly for other locales. And probably you want some leniency, both Mo and Mon. – Joop Eggen May 18 '18 at 12:55
  • @OleV.V. setting the property as you suggested does not solve the issue for the test case found in the OP (in Java10). – MWiesner May 18 '18 at 13:02
  • 2
    @JoopEggen `EE MMM dd HH:mm:ss z Z yyyy` does not work. It leads to `java.text.ParseException: Unparseable date: "Mo Aug 18 11:25:26 MESZ +0200 2014"` – rzo1 May 18 '18 at 13:07
  • @MWiesner With `System.setProperty("java.locale.providers", "COMPAT,CLDR");` the code from the question runs nicely both on my JDK 9.0.4 and on my JDK 10.0.1. – Ole V.V. May 18 '18 at 13:10
  • @rzo then sorry for the extra work. – Joop Eggen May 18 '18 at 13:12
  • 1
    @OleV.V. Which version of the JDK do you use for this? Oracle's JDK or the plain(er) OpenJDK? To be precise: I used OpenJDK, Mac OS 10.13. – MWiesner May 18 '18 at 13:13
  • @MWiesner Oracle. On MacOS Sierra 10.12.6. – Ole V.V. May 18 '18 at 13:17
  • @OleV.V. Hmm... I tried `System.setProperty("java.locale.providers", "COMPAT,CLDR");` with OpenJDK 10 on Windows 10, which did not work... – rzo1 May 18 '18 at 13:19
  • @OleV.V. Could you try/verify this with an OpenJDK instead? It seems, there is a fundamental difference of the effect of this property in both environments. – MWiesner May 18 '18 at 13:28
  • @MWiesner rzo has already verified. I trust the two of you. – Ole V.V. May 18 '18 at 14:59
  • 2
    @rzo I am confused as to why you are suffering from a compatibility problem yet refuse to use the [compatibility solution](https://docs.oracle.com/javase/9/intl/internationalization-enhancements-jdk-9.htm#JSINT-GUID-9DCDB41C-A989-4220-8140-DBFB844A0FCA) provided by Oracle expressly as a solution: `java.locale.providers` with `COMPAT` or the [`java.util.spi.LocaleServiceProvider`](https://docs.oracle.com/javase/9/docs/api/java/util/spi/LocaleServiceProvider.html) API? – Basil Bourque May 19 '18 at 00:02
  • 1
    Sane programmers don’t use localized formatted strings as persistent storage form. The localized format can change [for](https://en.wikipedia.org/wiki/Capital_%E1%BA%9E) [various](https://en.wikipedia.org/wiki/Spelling_reform#German) [reasons](https://en.wikipedia.org/wiki/Calendar_reform) and the switch to CLDR is only one of them. Maintaining compatibility to legacy code is easy, when you know the format it uses, the question is whether you want to keep basing your persistence on a format that can change again tomorrow… – Holger Jun 08 '18 at 08:28
  • @Holger Of course, I can go and change the (legacy) format in every resource of the last > 7 years, which will cost us a lot of work. New generated artifacts will be compliant with the last changes and will not rely on this (legacy) way to persist dates - that is not the problem. But you are right - maybe we just stick to the workaround to comply with the old format and for the new generated resources it will not be a problem at all as it will not rely on localized formatted strings. – rzo1 Jun 08 '18 at 09:04
  • 1
    Well, I didn’t mean to change the files (now), but to ensure that no new files relying on localized formats are created. The import (and even export, if legacy systems are still in use) works via nasty workarounds with fixed formats, you only have to ensure that the list of legacy formats to maintain doesn’t grow. Which is what step one is all about. Once all legacy systems phased out, you may consider converting all remaining files; that process can be automated and use the same import routines you use now. – Holger Jun 08 '18 at 09:14
  • I like this suggestion @Holger - thx. – rzo1 Jun 08 '18 at 09:22
  • 1
    You *must* set the `java.locale.providers` property on the command line. According to the [Javadocs](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/spi/LocaleServiceProvider.html): "The search order of locale sensitive services can be configured by using the "`java.locale.providers`" system property. This system property declares the user's preferred order for looking up the locale sensitive services separated by a comma. It is only read at the Java runtime startup, so the later call to `System.setProperty()` won't affect the order." – Joep Weijers Jan 15 '19 at 15:16
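
The migration idea raised in the comments above (keep importing the legacy format, but write new data only as ISO 8601) might look roughly like this on the writing side. It is a minimal sketch: the class name IsoTimestamps and its method names are made up for illustration and are not part of any code in the question.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the names are illustrative only.
final class IsoTimestamps {

    private IsoTimestamps() { }

    /** Writes a date-time in ISO 8601 form, which does not depend on locale data. */
    static String toIso(OffsetDateTime value) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(value);
    }

    /** Reads back a string produced by toIso. */
    static OffsetDateTime fromIso(String text) {
        return OffsetDateTime.parse(text, DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }
}
```

Because the ISO formatter contains no day or month names, its output should not change when the locale data provider changes.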

4 Answers

22

I don’t say it’s a nice solution, but it seems to be a way through.

    import java.time.OffsetDateTime;
    import java.time.ZonedDateTime;
    import java.time.format.DateTimeFormatter;
    import java.time.format.DateTimeFormatterBuilder;
    import java.time.temporal.ChronoField;
    import java.util.Locale;
    import java.util.Map;

    // Legacy (COMPAT/JRE) German abbreviations without trailing dots
    Map<Long, String> dayOfWeekTexts = Map.of(1L, "Mo", 2L, "Di",
            3L, "Mi", 4L, "Do", 5L, "Fr", 6L, "Sa", 7L, "So");
    Map<Long, String> monthTexts = Map.ofEntries(Map.entry(1L, "Jan"),
            Map.entry(2L, "Feb"), Map.entry(3L, "Mär"), Map.entry(4L, "Apr"),
            Map.entry(5L, "Mai"), Map.entry(6L, "Jun"), Map.entry(7L, "Jul"),
            Map.entry(8L, "Aug"), Map.entry(9L, "Sep"), Map.entry(10L, "Okt"),
            Map.entry(11L, "Nov"), Map.entry(12L, "Dez"));

    // Day of week and month use the maps above; the rest uses a normal pattern
    DateTimeFormatter formatter = new DateTimeFormatterBuilder()
            .appendText(ChronoField.DAY_OF_WEEK, dayOfWeekTexts)
            .appendLiteral(' ')
            .appendText(ChronoField.MONTH_OF_YEAR, monthTexts)
            .appendPattern(" dd HH:mm:ss z Z yyyy")
            .toFormatter(Locale.GERMANY);

    String toParse = "Mo Aug 18 11:25:26 MESZ +0200 2014";
    OffsetDateTime odt = OffsetDateTime.parse(toParse, formatter);
    System.out.println(odt);
    ZonedDateTime zdt = ZonedDateTime.parse(toParse, formatter);
    System.out.println(zdt);

Output running on my Oracle JDK 10.0.1:

2014-08-18T11:25:26+02:00
2014-08-18T11:25:26+02:00[Europe/Berlin]

Then again, no nice solution may exist.

java.time, the modern Java date and time API, allows us to specify the texts to use for fields, for both formatting and parsing. I exploit that for both day of week and month, specifying the abbreviations without a dot that were used with the old COMPAT or JRE locale data. I have used the Java 9 Map.of and Map.ofEntries for building the maps we need. If this is to work in Java 8 too, you must find some other way to populate the two maps; I trust you to do that.
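
For Java 8, where Map.of and Map.ofEntries are not available, the two maps could be populated along these lines. This is only a sketch: the class LegacyGermanTexts and the helper numbered are hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; names are illustrative only.
final class LegacyGermanTexts {

    private LegacyGermanTexts() { }

    /** Maps the 1-based position of each text to that text (1 -> first, 2 -> second, ...). */
    static Map<Long, String> numbered(String... texts) {
        Map<Long, String> map = new LinkedHashMap<>();
        for (int i = 0; i < texts.length; i++) {
            map.put((long) (i + 1), texts[i]);
        }
        return Collections.unmodifiableMap(map);
    }

    /** Day-of-week texts without dots; ISO numbering, Monday = 1. */
    static Map<Long, String> dayOfWeekTexts() {
        return numbered("Mo", "Di", "Mi", "Do", "Fr", "Sa", "So");
    }

    /** Month texts without dots, January = 1. */
    static Map<Long, String> monthTexts() {
        return numbered("Jan", "Feb", "Mär", "Apr", "Mai", "Jun",
                "Jul", "Aug", "Sep", "Okt", "Nov", "Dez");
    }
}
```

The resulting maps can be passed to appendText in place of the Map.of and Map.ofEntries versions.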

If you do need an old-fashioned java.util.Date (likely in a legacy code base), convert like this:

    Date date = Date.from(odt.toInstant());
    System.out.println("As legacy Date: " + date);

Output in my time zone (Europe/Copenhagen, probably roughly agrees with yours):

As legacy Date: Mon Aug 18 11:25:26 CEST 2014

Suggestion for a strategy

I am thinking that if that were me, I’d consider proceeding this way:

  1. Wait. Set the relevant system property from within Java: System.setProperty("java.locale.providers", "COMPAT,CLDR"); so it won’t be forgotten in any environment. Be aware that, according to the LocaleServiceProvider documentation, the property is only read at Java runtime startup, so a later call to System.setProperty() may have no effect; passing -Djava.locale.providers=COMPAT,CLDR on the command line is the more reliable way. The COMPAT locale data have been around since 1.0 (I believe, at least close), so a lot of code out there depends on it (not only yours). The name was changed from JRE to COMPAT in Java 9. To me this may sound like a plan to keep the data around for quite a while still. According to the early access documentation it will still be available in Java 11 (the next “long term support” Java version) and with no deprecation warning or the like. And should it be removed in some future Java version, you will probably be able to find out early enough to deal with the problem before upgrading.
  2. Use my solution above.
  3. Use the locale service provider interface that Basil Bourque linked to. There is no doubt that this is the nice solution in case the COMPAT data should be removed some unknown time in the future. You may even be able to copy the COMPAT locale data into your own files so they can’t take them away from you, only check if there are copyright issues before you do so. The reason why I mention the nice solution last is you said you aren’t happy with having to set a system property in every possible environment where your program may run. As far as I can tell, using your own locale data through the locale service provider interface will still require you to set the same system property (only to a different value).
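
Whichever way the system property is set, a small check at application startup can reveal whether the legacy abbreviations are actually in effect, which addresses the worry that the setting is forgotten in some environment. The following is a minimal sketch; the class LocaleDataCheck and its methods are hypothetical names, not an existing API.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical startup check; names are illustrative only.
final class LocaleDataCheck {

    private LocaleDataCheck() { }

    /** Formats a known Monday (18 August 2014) with pattern EEE for Locale.GERMANY. */
    static String germanMondayAbbreviation() {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        Calendar monday = new GregorianCalendar(utc, Locale.GERMANY);
        monday.clear();
        monday.set(2014, Calendar.AUGUST, 18, 12, 0, 0);

        SimpleDateFormat weekday = new SimpleDateFormat("EEE", Locale.GERMANY);
        weekday.setTimeZone(utc);
        return weekday.format(monday.getTime());
    }

    /** True if the running JVM abbreviates Monday as "Mo" without a dot. */
    static boolean usesLegacyAbbreviations() {
        return "Mo".equals(germanMondayAbbreviation());
    }
}
```

An application could call usesLegacyAbbreviations() once at startup and log a warning or fail fast if it returns false, instead of failing later while parsing old data.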
Ole V.V.
  • 3
    "Mar" would be rather written as "Mär" (or sometimes even as "Mrz"). – Meno Hochschild May 18 '18 at 16:01
  • 1
    Danke so sehr, @Meno. You can believe me or not, but I corrected before reading your comment (and also deleted the “please check my German spelling”; you can still do that though, one never knows if there are more mistakes). – Ole V.V. May 18 '18 at 17:54
3

Just to mention: SimpleDateFormat is the old way to format dates and, by the way, is not thread-safe. Since Java 8 there are the new packages java.time and java.time.format, and you should use those to work with dates. For your purposes you should use the DateTimeFormatter class.
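
As a rough illustration of the thread-safety point: a DateTimeFormatter is immutable, so one instance can be kept in a constant and shared between threads, which is not safe with SimpleDateFormat. The class name SharedFormatter and the purely numeric pattern below are assumptions chosen so that the example does not depend on locale data. Note that, as a comment below points out, switching the class alone does not change which day and month names the locale data provides.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical example class; names and pattern are illustrative only.
final class SharedFormatter {

    // Immutable and thread-safe, so a single shared instance is fine.
    static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss", Locale.ROOT);

    private SharedFormatter() { }

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, TIMESTAMP);
    }

    static String format(LocalDateTime value) {
        return TIMESTAMP.format(value);
    }
}
```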

Michael Gantman
  • This is true, but this is a huge legacy code-base. However, the question is about the differences in formatting between Java 8 and Java 10 using the same legacy class. – rzo1 Jun 14 '18 at 16:10
  • 1
    Yes, I understood this, and I understand your problem with legacy code and backwards compatibility. I just thought I would mention this in case it could be an option. – Michael Gantman Jun 14 '18 at 17:00
  • This seems to be an option for my case. Thx for the hint. – mondjunge Jan 16 '19 at 16:28
  • DateTimeFormatter has the same issue. Tested it today with Java 19.0.2 – ChrLipp Feb 05 '23 at 19:08
0

In Java 8 the formatted value was Fr Juni 15 00:20:21 MESZ +0900 2018, but in newer versions it is Fr. Juni 15 00:20:21 MESZ +0900 2018: EEE now includes the dot. This is a compatibility issue, and it is not unusual for older code to stop working on newer versions. (Sorry for the translator.) If you produce the date string yourself, you should add the dot for users of newer versions, or require users to run your software on Java 8.

Alternatively, although it may make the software slightly slower, you can patch the string with the substring method before parsing:

    String toParse = "Mo Aug 18 11:25:26 MESZ +0200 2014";
    // Insert the dot after the two-letter weekday: "Mo" becomes "Mo."
    String str = toParse.substring(0, 2) + "." + toParse.substring(2);
    String pattern = "EEE MMM dd HH:mm:ss z Z yyyy";

    DateFormat dateFormatter = new SimpleDateFormat(pattern, Locale.GERMANY);
    // Print the current time to see which abbreviations this JDK produces
    System.out.println(dateFormatter.format(System.currentTimeMillis()));
    Date date = dateFormatter.parse(str);

Sorry again for my bad English.
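
The substring idea above can be made a little more defensive by adding the dot only when it is missing, and by trying the unmodified string first. This is a sketch under assumptions: the class LegacyWeekdayPatch and its methods are hypothetical, it only handles two-letter weekday abbreviations, and whether either attempt parses still depends on the locale data of the running JDK.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper; names are illustrative only.
final class LegacyWeekdayPatch {

    private LegacyWeekdayPatch() { }

    /** Inserts a dot after a leading two-letter weekday abbreviation if it is missing. */
    static String withDottedWeekday(String text) {
        if (text.length() > 2
                && Character.isLetter(text.charAt(0))
                && Character.isLetter(text.charAt(1))
                && text.charAt(2) == ' ') {
            return text.substring(0, 2) + "." + text.substring(2);
        }
        return text; // already dotted, or not in the expected shape
    }

    /** Tries the text as-is first, then retries once with the dotted weekday. */
    static Date parseLenient(String text, String pattern, Locale locale) throws ParseException {
        DateFormat format = new SimpleDateFormat(pattern, locale);
        try {
            return format.parse(text);
        } catch (ParseException first) {
            return format.parse(withDottedWeekday(text));
        }
    }
}
```

Trying the original text first keeps the behaviour unchanged on runtimes that still use the old locale data.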

dhkim0800
0

Here is a working but ugly workaround. It is ugly because you have to redefine all the words in your own maps, but you still get all the benefits of the efficient and flexible default parser.

    String dateString = "Mi Mai 09 09:17:24 2018";

    Map<Long, String> dayOfWeekTexts =
        Map.of(1L, "Mo", 2L, "Di", 3L, "Mi", 4L, "Do", 5L, "Fr", 6L, "Sa", 7L, "So");
    Map<Long, String> monthTexts =
        Map.ofEntries(
            Map.entry(1L, "Jan"),
            Map.entry(2L, "Feb"),
            Map.entry(3L, "Mär"),
            Map.entry(4L, "Apr"),
            Map.entry(5L, "Mai"),
            Map.entry(6L, "Jun"),
            Map.entry(7L, "Jul"),
            Map.entry(8L, "Aug"),
            Map.entry(9L, "Sep"),
            Map.entry(10L, "Okt"),
            Map.entry(11L, "Nov"),
            Map.entry(12L, "Dez"));

    DateTimeFormatter dtf =
        new DateTimeFormatterBuilder()
            .appendText(ChronoField.DAY_OF_WEEK, dayOfWeekTexts)
            .appendLiteral(' ')
            .appendText(ChronoField.MONTH_OF_YEAR, monthTexts)
            .appendPattern(" dd HH:mm:ss yyyy")
            .toFormatter(Locale.GERMAN);

    LocalDateTime dateTime = LocalDateTime.parse(dateString, dtf);

This is only a slightly modified answer from https://stackoverflow.com/a/50412644/1353930

Daniel Alder