I have the code below:

private void convertStringDatePosted() {
    String stringDatePosted = "20 Janeiro 2021 00:26";
    Locale locale = new Locale("pt", "MZ");
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern("d MMMM yyyy HH:mm").withLocale(locale);
    LocalDateTime aa = LocalDateTime.parse(stringDatePosted, formatter);
    System.out.println(aa);
}

When I run this code with Java 8, it works. With Java 11, however, it throws this exception: java.time.format.DateTimeParseException: Text '20 Janeiro 2021 00:26' could not be parsed at index 3.

I have a similar problem when the string I want to convert contains a zone. The code is:

protected LocalDateTime getDatePosted() {
    String dateScraped = "2021-08-15 09:00:28 (UTC+01:00)";
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss (z)");
    return LocalDateTime.parse(dateScraped, formatter);
}

With Java 8 this also works, but with Java 11 it does not. The exception is: java.time.format.DateTimeParseException: Text '2021-08-15 09:00:28 (UTC+01:00)' could not be parsed at index 24.

The error occurs on macOS; I have not tried other operating systems yet.

I don't know whether the error is due to the Java version or to something else. Looking forward to your reply. Thanks a lot!

  • From a quick google, you'll find that in Portuguese, the names of the months are spelled in lowercase. See [this](https://ielanguages.com/portuguese-months.html) for example. So the expected way to write the date is "20 janeiro 2021 00:26" – Sweeper Aug 17 '21 at 04:20
  • Possible duplicate: https://stackoverflow.com/questions/10797808/how-to-parse-case-insensitive-strings-with-jsr310-datetimeformatter – Sweeper Aug 17 '21 at 04:26
  • Thanks a lot for your support, my code worked. But I don't know the difference between java 11 and java 8, why does it work in one version and not the other? – Nguyên Quang Lê Aug 17 '21 at 04:37
  • The reason why it worked in Java 8 was probably a bug. By [searching for "portuguese" in the bugs database](https://www.oracle.com/search/results/_/N-8d?No=0&Nr=106&Nrpp=10&Ntk=S3&Ntt=portuguese&cat=bugs), there are quite a lot of similar bugs related to not outputting the month name in the correct case. This specific bug is not reported though. I would imagine it got fixed as the result of fixing another bug, or someone just fixed it directly without reporting it. – Sweeper Aug 17 '21 at 04:38
  • Oh, that's useful information. Really thank you very much. Have a nice day. – Nguyên Quang Lê Aug 17 '21 at 04:48
  • Default locale data (including month names in different languages) are different in Java 8 and 11. See for example [JDK dateformatter parsing DayOfWeek in German locale, java8 vs java9](https://stackoverflow.com/questions/46244724/jdk-dateformatter-parsing-dayofweek-in-german-locale-java8-vs-java9) – Ole V.V. Aug 17 '21 at 04:59
  • I cannot reproduce the second of your examples. Your `getDatePosted()` runs fine on my Oracle jdk-11.0.3 on Mac and returns `2021-08-15T09:00:28`. – Ole V.V. Aug 17 '21 at 11:03
  • Oh, this is so weird. I will try to find out the cause on my laptop. Thanks for the information you provided. – Nguyên Quang Lê Aug 18 '21 at 04:14

1 Answer

Capital or small J in janeiro?

It seems to me that there are two bugs: one in your string, which should have a small j in janeiro, and one in Java 8, which accepts the capital J that you had. I say this having learned just a few words of Portuguese when I was in Portugal some decades ago.

Month names in different languages are a part of the locale data. Java can get its locale data from up to four sources. The default in Java 8 is Java’s built-in locale data, which has been there since early versions of Java. From Java 9 the default is CLDR, the Unicode Common Locale Data Repository. So in Java 8, Java’s own locale data have the upper case J. CLDR, which is also available in Java 8 through a system property, has a small j. An interesting observation is that when I instruct Java 11 to use Java’s own locale data, it also shows the small j. So the error from Java 8 has been fixed.
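To see which spelling your own JVM uses, a minimal sketch like the following prints the name of January for the locale from the question. The class and method names are made up for illustration; java.locale.providers is the standard system property for choosing the locale data sources.

```java
import java.time.Month;
import java.time.format.TextStyle;
import java.util.Locale;

class MonthNameDemo {

    // Returns the full name of January as the active locale data spells it.
    static String januaryName(Locale locale) {
        return Month.JANUARY.getDisplayName(TextStyle.FULL, locale);
    }

    public static void main(String[] args) {
        Locale locale = new Locale("pt", "MZ");
        // null means no provider order was given, so the JVM default applies.
        System.out.println("java.locale.providers = "
                + System.getProperty("java.locale.providers"));
        // The casing of the result depends on the Java version and provider order.
        System.out.println(januaryName(locale));
    }
}
```

To compare the sources, you can run it with -Djava.locale.providers=CLDR,JRE on Java 8 or with -Djava.locale.providers=COMPAT,CLDR on Java 9 and later, where COMPAT is the name for Java’s own legacy data.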

There are many possible solutions. One is to instruct your formatter not to care about case when parsing:

    String stringDatePosted = "20 Janeiro 2021 00:26";
    Locale locale = new Locale("pt", "MZ");
    DateTimeFormatter formatter = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern("d MMMM yyyy HH:mm")
            .toFormatter(locale);
    LocalDateTime aa = LocalDateTime.parse(stringDatePosted, formatter);
    System.out.println(aa);

Output is:

2021-01-20T00:26
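As a quick check that the casing no longer matters, here is a self-contained sketch that feeds the same formatter three spellings of the month. The class name CaseInsensitiveParseDemo and the helper parsePosted are hypothetical; the formatter itself is the one from above.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Locale;

class CaseInsensitiveParseDemo {

    // Same formatter as above: case-insensitive, Portuguese (Mozambique) month names.
    private static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern("d MMMM yyyy HH:mm")
            .toFormatter(new Locale("pt", "MZ"));

    static LocalDateTime parsePosted(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        String[] samples = {
            "20 Janeiro 2021 00:26",
            "20 janeiro 2021 00:26",
            "20 JANEIRO 2021 00:26"
        };
        for (String sample : samples) {
            // Each line should end in 2021-01-20T00:26
            System.out.println(sample + " -> " + parsePosted(sample));
        }
    }
}
```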

Parsing the UTC offset

I have not been able to reproduce your other example. Your getDatePosted method runs fine on my Oracle jdk-11.0.3 on Mac and returns 2021-08-15T09:00:28. I don’t know why it did not on your computer.

There’s another problem with your code, though: You are parsing into a LocalDateTime, thus throwing the time zone information away. Don’t do that. For example, the strings 2021-08-15 09:00:28 (UTC-12:00) and 2021-08-15 09:00:28 (UTC+14:00) denote very different points in time, 26 hours apart, but will be parsed into the same LocalDateTime. Parse into a ZonedDateTime to retain the time zone. If you need the time in a different time zone (such as your own), convert to a ZonedDateTime in that time zone. Do not use LocalDateTime if by any means you can avoid it.

Workaround: If you cannot get parsing of UTC+01:00 as a time zone to work, the following hack does it:

protected OffsetDateTime getDatePosted() {
    String dateScraped = "2021-08-15 09:00:28 (UTC+01:00)";
    DateTimeFormatter formatter = DateTimeFormatter
            .ofPattern("yyyy-MM-dd HH:mm:ss ('UTC'xxx)", Locale.ROOT);
    return OffsetDateTime.parse(dateScraped, formatter);
}

Now the method returns 2021-08-15T09:00:28+01:00. Notice also that the UTC offset has been retained.
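To illustrate why keeping the offset matters, the following sketch parses the two extreme strings mentioned earlier with the same workaround pattern and then converts the original value to another time zone. The class name, the helper methods and the choice of Africa/Maputo as the target zone are assumptions made only for this example.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class OffsetRetainedDemo {

    // Workaround pattern from above: literal "UTC" followed by an offset such as +01:00.
    private static final DateTimeFormatter FORMATTER = DateTimeFormatter
            .ofPattern("yyyy-MM-dd HH:mm:ss ('UTC'xxx)", Locale.ROOT);

    static OffsetDateTime parse(String text) {
        return OffsetDateTime.parse(text, FORMATTER);
    }

    // Same instant, expressed in the given zone.
    static ZonedDateTime inZone(OffsetDateTime dateTime, ZoneId zone) {
        return dateTime.atZoneSameInstant(zone);
    }

    public static void main(String[] args) {
        OffsetDateTime west = parse("2021-08-15 09:00:28 (UTC-12:00)");
        OffsetDateTime east = parse("2021-08-15 09:00:28 (UTC+14:00)");

        // The wall-clock part is identical ...
        System.out.println(west.toLocalDateTime().equals(east.toLocalDateTime())); // true
        // ... but the instants are 26 hours apart.
        System.out.println(Duration.between(east, west)); // PT26H

        OffsetDateTime posted = parse("2021-08-15 09:00:28 (UTC+01:00)");
        // Africa/Maputo is an assumed target zone, chosen only for illustration.
        System.out.println(inZone(posted, ZoneId.of("Africa/Maputo")));
        // 2021-08-15T10:00:28+02:00[Africa/Maputo]
    }
}
```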

Link

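Related question with more information on locale data in different Java versions: [JDK dateformatter parsing DayOfWeek in German locale, java8 vs java9](https://stackoverflow.com/questions/46244724/jdk-dateformatter-parsing-dayofweek-in-german-locale-java8-vs-java9)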

Ole V.V.