I'm trying to create a DateTimeFormatter to match the following example (it's actually slightly more complex than this but that shouldn't matter).

20180302-17:45:21

I've written the following but it results in an exception:

new DateTimeFormatterBuilder()
    .append(DateTimeFormatter.BASIC_ISO_DATE)
    .appendLiteral('-')
    .append(DateTimeFormatter.ISO_LOCAL_TIME)
    .toFormatter()
    .parse("20180302-17:45:21");

The exception is:

Exception in thread "main" java.time.format.DateTimeParseException: Text '20180302-17:45:21' could not be parsed at index 11
    at java.base/java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1988)
    at java.base/java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1816)

It appears to be failing on the colon in 17:45 (index 11), and DateTimeFormatterBuilder.appendLiteral doesn't give any clues.

If I change the literal to another character, let's say m, then it works fine:

new DateTimeFormatterBuilder()
    .append(DateTimeFormatter.BASIC_ISO_DATE)
    .appendLiteral('m')
    .append(DateTimeFormatter.ISO_LOCAL_TIME)
    .toFormatter()
    .parse("20180302m17:45:21");

What's going on here? How can I fix it, assuming I can't change the format?

Comments suggest this might be version dependent. I'm using JDK 9.0.1 and it's been reproduced on 9.0.4.

Michael
  • 2
    umm, using your code, i get the output `{},ISO resolved to 2018-03-02T17:45:21` and not an error – XtremeBaumer Mar 05 '18 at 11:36
  • 1
    yeah that might be it. still using java 8 – XtremeBaumer Mar 05 '18 at 11:38
  • Are you sure the character you're appending is the same one as in the example snippet ([hyphen-minus](http://www.fileformat.info/info/unicode/char/2d/index.htm)), and not for example [en dash](http://www.fileformat.info/info/unicode/char/2013/index.htm)? – Mick Mnemonic Mar 05 '18 at 11:47
  • 1
    Can confirm, it works with Java 8 and fails with Java 9 (without recompiling). – Jorn Vernee Mar 05 '18 at 11:48
  • @MickMnemonic I'm certain. And anyway, the given index of the failure doesn't correspond to the position of the dash. – Michael Mar 05 '18 at 11:49
  • 1
    I have reproduced the exception on Java 9.0.4. On Java 1.8.0_131 I don’t see it. – Ole V.V. Mar 05 '18 at 12:01
  • 3
    `.append(DateTimeFormatter.BASIC_ISO_DATE)` seems to be part of the problem. If I replace it by `.appendPattern("uuuuMMdd")`, parsing works also on Java 9.0.4. – Ole V.V. Mar 05 '18 at 12:03
  • 4
    Maybe it has something to do with the [offset ID](https://docs.oracle.com/javase/8/docs/api/java/time/ZoneOffset.html#getId--), which expects a zone offset id in the format `+hhmm` or `-hhmm`. Therefore, an exception is thrown because BASIC_ISO_DATE tries to parse `-17:` as a zone offset id. – MC Emperor Mar 05 '18 at 12:10
  • Agreed, but the offset ID does have a colon. It is not the colon that is causing the problem, but the '4' after the colon. There is no time zone in the world which is offset from UTC by hh:4x. – DodgyCodeException Mar 05 '18 at 12:15
  • @DodgyCodeException I think there are countries in the world that have an offset of *n* hours and 45 minutes from UTC. – MC Emperor Mar 05 '18 at 12:18
  • You're right. The documentation for [BASIC_ISO_DATE](https://docs.oracle.com/javase/9/docs/api/java/time/format/DateTimeFormatter.html#BASIC_ISO_DATE) says that it takes an offset and explicitly states "without colons". – DodgyCodeException Mar 05 '18 at 12:23
  • 3
    I get a strong feeling about this also being related to the [CLDR date-time patterns changes in JDK9](https://stackoverflow.com/a/46245412/1746118) – Naman Mar 05 '18 at 12:24
  • 1
    @Michael What if you replace `BASIC_ISO_DATE` with `ISO_LOCAL_DATE`? – MC Emperor Mar 05 '18 at 12:25
  • @MCEmperor That works but I'm parsing text I have no control over the format of. – Michael Mar 05 '18 at 12:47
  • @Michael So you only have control over the string passed to `parse` (e.g. `parse("20180302-17:45:21")`)? In that case, your only option is to add the zone offset to the date, such that the pattern is still valid: `20180302Z-17:45:21` – MC Emperor Mar 05 '18 at 13:05
  • 1
    @MCEmperor The other way around. I can't control the input. I've worked around the issue for now by using a similar solution to the `appendPattern("uuuuMMdd")` suggested by Ole – Michael Mar 05 '18 at 13:10
  • Thanks for your valuable comments elsewhere! – GhostCat Sep 19 '18 at 13:22

2 Answers

This has to do with the fact that DateTimeFormatter.BASIC_ISO_DATE includes an optional offset ID. Apparently your formatter parses -17 as an offset and then objects because there is a colon (at index 11) where the format requires the literal hyphen.

When you use m instead, this cannot be parsed as an offset and therefore matches the literal m in the format, and everything works.
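One way to check this reading is to ask BASIC_ISO_DATE on its own how many characters it consumes. The following is only a sketch: the class and method names are made up for illustration, and it relies on DateTimeFormatter.parseUnresolved, which tolerates trailing text and reports its progress through a ParsePosition. The result for the text from the question is not asserted in a comment because it appears to depend on the Java version.

```java
import java.text.ParsePosition;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

// Hypothetical helper class, not part of java.time.
final class BasicIsoDateReach {

    // Returns the index at which BASIC_ISO_DATE stops reading,
    // or -1 if it cannot parse the start of the text at all.
    static int consumedChars(String text) {
        ParsePosition position = new ParsePosition(0);
        // parseUnresolved does not throw on trailing text; it updates the position instead.
        TemporalAccessor parsed =
                DateTimeFormatter.BASIC_ISO_DATE.parseUnresolved(text, position);
        return parsed == null ? -1 : position.getIndex();
    }

    public static void main(String[] args) {
        // 'm' cannot start an offset, so only the eight date digits are read.
        System.out.println(consumedChars("20180302m17:45:21")); // 8
        // A well-formed offset without colons is read as part of the date.
        System.out.println(consumedChars("20180302-1700"));     // 13
        // The text from the question: the result depends on how leniently
        // the running JDK parses the optional offset.
        System.out.println(consumedChars("20180302-17:45:21"));
    }
}
```

If the last line prints 11 rather than 8, the formatter has swallowed -17 as an offset, which would leave the literal hyphen to be matched against the colon.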

I tried using uppercase Z. Z can be an offset ID too.

new DateTimeFormatterBuilder()
    .append(DateTimeFormatter.BASIC_ISO_DATE)
    .appendLiteral('Z')
    .append(DateTimeFormatter.ISO_LOCAL_TIME)
    .toFormatter()
    .parse("20180302Z17:45:21");

Now I got java.time.format.DateTimeParseException: Text '20180302Z17:45:21' could not be parsed at index 9. Index 9 is right after the Z, so it seems the formatter parses the Z as the offset and then tries to find the literal Z where the 17 is.

EDIT: And the solution? Instead of using BASIC_ISO_DATE append a pattern:

.appendPattern("uuuuMMdd")

Now parsing works also on Java 9.0.4.
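Put together, the workaround might look like the sketch below. The class name, the constant and the parse method are placeholders of mine; only the builder calls come from the question and from the pattern above. It assumes the input never carries an offset after the date, since the pattern "uuuuMMdd" does not accept one.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;

// Hypothetical wrapper class for illustration.
final class CompactDateTimeParser {

    // Same structure as in the question, but the date part is a plain pattern
    // without the optional offset that BASIC_ISO_DATE brings along.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("uuuuMMdd")
            .appendLiteral('-')
            .append(DateTimeFormatter.ISO_LOCAL_TIME)
            .toFormatter();

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parse("20180302-17:45:21")); // 2018-03-02T17:45:21
    }
}
```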

EDIT: To further illustrate the optionality of the offset:

System.out.println(
    LocalDate.now().format(DateTimeFormatter.BASIC_ISO_DATE)
);
System.out.println(
    OffsetDateTime.now().format(DateTimeFormatter.BASIC_ISO_DATE)
);

This printed

20180305
20180305+0100

So in the first case, where no offset is available, it just leaves it out. In the second case, where one is available, it is also printed (without colon).
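The same optionality applies when parsing. As a rough sketch (the class and method names are invented for the example), the offset that BASIC_ISO_DATE picked up, if any, can be read back with TemporalQueries.offset(), which yields null when the text contained no offset:

```java
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;

// Hypothetical demo class, not part of java.time.
final class BasicIsoDateOffsetDemo {

    // Returns the offset parsed by BASIC_ISO_DATE, or null if the text had none.
    static ZoneOffset parsedOffset(String text) {
        TemporalAccessor parsed = DateTimeFormatter.BASIC_ISO_DATE.parse(text);
        return parsed.query(TemporalQueries.offset());
    }

    public static void main(String[] args) {
        System.out.println(parsedOffset("20180302"));      // null
        System.out.println(parsedOffset("20180302+0100")); // +01:00
        System.out.println(parsedOffset("20180302-1700")); // -17:00
    }
}
```

The last call shows that -17 hours is an acceptable offset as far as the formatter is concerned, which is why a hyphen followed by the hour 17 is so easily mistaken for one.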

Open question: Why does it work in Java 8? Is this really a bug?

Quote:

  • If the offset is not available to format or parse then the format is complete.
  • The offset ID without colons. If the offset has seconds then they will be handled even though this is not part of the ISO-8601 standard. The offset parsing is lenient, which allows the minutes and seconds to be optional. Parsing is case insensitive.

From the documentation of BASIC_ISO_DATE

Michael
Ole V.V.
  • So why does BASIC_ISO_DATE, which specifically says it is "without an offset", even include an offset ID? – Michael Mar 05 '18 at 12:18
  • 1
    Regarding your second open question: `BASIC_ISO_DATE` states that it will try to parse "*The offset ID without colons*" and it must be in the format `±hhmm` or `±hhmmss`. – MC Emperor Mar 05 '18 at 12:20
  • 1
    You are right, @Michael, that is funny. Strictly speaking it is correct since the offset ID is optional, but I might take it as misleading still. – Ole V.V. Mar 05 '18 at 12:20
  • Thanks, @MCEmperor. Next time I will read the quote I am pasting into the answer. :-) – Ole V.V. Mar 05 '18 at 12:24
  • @OleV.V. No problem. This is very interesting (and also difficult) matter, and without your question, I probably wouldn't have taken a further look at the documentation. – MC Emperor Mar 05 '18 at 12:28
  • 4
    Probably related to https://bugs.openjdk.java.net/browse/JDK-8066806 where more offset formats are handled in Java 9 as opposed to Java 8. – JodaStephen Mar 05 '18 at 12:41
I raised this as a bug and it's been confirmed in JDK-8199412.

Michael