What's the best way in Java to parse a String, which may be in any valid ISO 8601 format or in Unix epoch milliseconds, into a Date? For example, it needs to be able to parse the following (all of which are either valid ISO 8601 or Unix epoch milliseconds):

  • 1534251817666
  • 2017-01-01
  • 2017-01-01T00
  • 2017-01-01T00:03
  • 2017-01-01T00:03,5
  • 2017-01-01T00:03.5
  • 2017-01-01T03:03:00+00:00
  • 2017-01-01T03:03:00-05:00
  • 2017-01-01T03:03:00+0500
  • 2017-01-01T03:03:00Z
  • 20170101T030300Z
  • 2017-W01-1
  • 2017W011
  • 2017-001
  • 2017001

I've found that the following code handles most of these cases, but not all of them, since none of the DateTimeFormatters provided by java.time can handle every ISO 8601 case:

try {
    return Date.from(Instant.ofEpochMilli(Long.parseLong(time)));
} catch (NumberFormatException e) {
    return Date.from(Instant.parse(time));
}
VGR
gbear605
  • Figure out which cases it cannot resolve and use string manipulation to rewrite them into a form it can resolve before feeding them to the try/catch. – Rob Aug 14 '18 at 15:40
  • Wouldn't that treat 2017001 as epoch milliseconds? A long value would therefore have to be range-checked. Formatters are easy to make. – Joop Eggen Aug 14 '18 at 15:42
  • Possible duplicate of [How to parse dates in multiple formats using SimpleDateFormat](https://stackoverflow.com/questions/4024544/how-to-parse-dates-in-multiple-formats-using-simpledateformat). Search for more similar questions. – Ole V.V. Aug 14 '18 at 15:43
  • @OleV.V. though SimpleDateFormat belongs to the old time classes. – Joop Eggen Aug 14 '18 at 15:45
  • It certainly does, @JoopEggen, I should of course have warned against it. However the linked question also has two java.time answers, [here](https://stackoverflow.com/a/39754641/5772882) and [here](https://stackoverflow.com/a/45315872/5772882). I recommend looking into those. – Ole V.V. Aug 14 '18 at 15:47
  • `2017-01-01T00:03,5` is not a valid ISO 8601 string. ISO decimal point is `.`, not `,`. --- `2017-01-01T00:03.5` is not a valid ISO 8601 string. Decimal point is for fractional seconds, not fractional minutes. – Andreas Aug 14 '18 at 16:57
  • @Andreas According to [Wikipedia](https://en.wikipedia.org/wiki/ISO_8601#Times) “A decimal mark, either a comma or a dot (without any preference as stated in resolution 10 of the 22nd General Conference CGPM in 2003, but with a preference for a comma according to ISO 8601:2004) is used as a separator between the time element and its fraction. To denote "14 hours, 30 and one half minutes", … Represent it as "14:30,5", "1430,5", "14:30.5", or "1430.5".” – Ole V.V. Aug 14 '18 at 18:21
  • @OleV.V. I stand corrected. Thank you. – Andreas Aug 14 '18 at 18:24
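
The ambiguity raised in the comments (an all-digit ISO 8601 basic date such as 2017001 or 20170101 also parses as a long) could be handled with a length check before calling Long.parseLong. The following is only a sketch: the class and method names are hypothetical, and the 9-digit threshold is an assumption that misclassifies epoch values from the first 28 hours or so of 1970.

```java
class EpochCheck {
    // Assumption: a run of 9 or more digits is epoch milliseconds;
    // shorter digit-only strings are left for ISO 8601 basic-format parsing.
    static boolean looksLikeEpochMillis(String text) {
        return text.matches("\\d{9,}");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEpochMillis("1534251817666")); // true
        System.out.println(looksLikeEpochMillis("2017001"));       // false
        System.out.println(looksLikeEpochMillis("20170101"));      // false
    }
}
```

Only when the check returns true would the string go to Instant.ofEpochMilli; everything else would go to the ISO 8601 parsing path.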

2 Answers

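One way to parse all those formats is to write a regex, then create the appropriate Temporal object from the parsed values. (The regex below accepts a fraction only on the seconds field and only with a dot, so the test replaces the two fractional-minute inputs with 2017-01-01T00:03:00.5.)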

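import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.Temporal;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static Temporal parse(String text) {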
    String regex = "(?:" +
                      "(\\d{9,})" +        // 1: millis
                   "|" +
                      "(\\d{4})" +         // 2: year
                      "(?:" +
                         "-?(\\d{3})" +    // 3: day-of-year
                      "|" +
                         "(-?)W(\\d{2})" + // 5: week-of-year
                         "(?:\\4(\\d))?" + // 6: day-of-week (optional)
                      "|" +
                         "(-?)(\\d{2})" +  // 8: month-of-year
                         "\\7(\\d{2})" +   // 9: day-of-month
                      ")" +
                      "(?:T(\\d{2})" +             // 10: hour (optional)
                        "(?:(:?)(\\d{2})" +        // 12: minute (optional)
                          "(?:\\11(\\d{2})" +      // 13: second (optional)
                            "(?:\\.(\\d{1,9}))?" + // 14: fractional (optional)
                          ")?" +
                        ")?" +
                        "(?:" +
                          "(Z)" +          // 15: Zulu
                        "|" +
                          "([+-]\\d{2})" + // 16: Offset hour (signed)
                          ":?(\\d{2})" +   // 17: Offset minute
                        ")?" +
                      ")?" +
                   ")";
    Matcher m = Pattern.compile(regex).matcher(text);
    if (! m.matches())
        throw new DateTimeParseException("Invalid date string", text, 0);

    // Handle millis
    if (m.start(1) != -1)
        return Instant.ofEpochMilli(Long.parseLong(m.group(1)));

    // Parse local date
    LocalDate localDate;
    if (m.start(3) != -1)
        localDate = LocalDate.ofYearDay(Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3)));
    else if (m.start(5) != -1)
        localDate = LocalDate.parse(m.group(2) + "-W" + m.group(5) + "-" + (m.start(6) == -1 ? "1" : m.group(6)),
                                    DateTimeFormatter.ISO_WEEK_DATE);
    else
        localDate = LocalDate.of(Integer.parseInt(m.group(2)), Integer.parseInt(m.group(8)), Integer.parseInt(m.group(9)));
    if (m.start(10) == -1)
        return localDate;

    // Parse local time
    int hour   = Integer.parseInt(m.group(10));
    int minute = (m.start(12) == -1 ? 0 : Integer.parseInt(m.group(12)));
    int second = (m.start(13) == -1 ? 0 : Integer.parseInt(m.group(13)));
    int nano   = (m.start(14) == -1 ? 0 : Integer.parseInt((m.group(14) + "00000000").substring(0, 9)));
    LocalTime localTime = LocalTime.of(hour, minute, second, nano);

    // Return date/time
    if (m.start(15) != -1)
        return ZonedDateTime.of(localDate, localTime, ZoneOffset.UTC);
    if (m.start(16) == -1)
        return LocalDateTime.of(localDate, localTime);
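    // ofHoursMinutes requires both arguments to carry the same sign,
    // so negate the minutes when the offset is negative (e.g. -05:30)
    int offsetHour   = Integer.parseInt(m.group(16));
    int offsetMinute = Integer.parseInt(m.group(17));
    if (m.group(16).charAt(0) == '-')
        offsetMinute = -offsetMinute;
    ZoneOffset zone = ZoneOffset.ofHoursMinutes(offsetHour, offsetMinute);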
    return ZonedDateTime.of(localDate, localTime, zone);
}

Test

public static void main(String[] args) {
    test("1534251817666");
    test("2017-01-01");
    test("2017-01-01T00");
    test("2017-01-01T00:03");
    test("2017-01-01T00:03:00.5"); // modified
    test("2017-01-01T03:03:00+00:00");
    test("2017-01-01T03:03:00-05:00");
    test("2017-01-01T03:03:00+0500");
    test("2017-01-01T03:03:00Z");
    test("20170101T030300Z");
    test("2017-W01-1");
    test("2017W011");
    test("2017-001");
    test("2017001");
}
private static void test(String text) {
    Temporal parsed = parse(text);
    System.out.printf("%-25s -> %-25s %s%n", text, parsed, parsed.getClass().getSimpleName());
}

Output

1534251817666             -> 2018-08-14T13:03:37.666Z  Instant
2017-01-01                -> 2017-01-01                LocalDate
2017-01-01T00             -> 2017-01-01T00:00          LocalDateTime
2017-01-01T00:03          -> 2017-01-01T00:03          LocalDateTime
2017-01-01T00:03:00.5     -> 2017-01-01T00:03:00.500   LocalDateTime
2017-01-01T03:03:00+00:00 -> 2017-01-01T03:03Z         ZonedDateTime
2017-01-01T03:03:00-05:00 -> 2017-01-01T03:03-05:00    ZonedDateTime
2017-01-01T03:03:00+0500  -> 2017-01-01T03:03+05:00    ZonedDateTime
2017-01-01T03:03:00Z      -> 2017-01-01T03:03Z         ZonedDateTime
20170101T030300Z          -> 2017-01-01T03:03Z         ZonedDateTime
2017-W01-1                -> 2017-01-02                LocalDate
2017W011                  -> 2017-01-02                LocalDate
2017-001                  -> 2017-01-01                LocalDate
2017001                   -> 2017-01-01                LocalDate

You can of course choose to always return a ZonedDateTime, using JVM default time zone when zone is not given, replacing statements as follows:

// Replace
private static Temporal parse(String text) {
// with
private static ZonedDateTime parse(String text) {

// Replace
return Instant.ofEpochMilli(Long.parseLong(m.group(1)));
// with
return Instant.ofEpochMilli(Long.parseLong(m.group(1))).atZone(ZoneOffset.UTC);

// Replace
return localDate;
// with
return ZonedDateTime.of(localDate, LocalTime.MIDNIGHT, ZoneId.systemDefault());

// Replace
return LocalDateTime.of(localDate, localTime);
// with
return ZonedDateTime.of(localDate, localTime, ZoneId.systemDefault());

Test

private static void test(String text) {
    System.out.printf("%-25s -> %s%n", text, parse(text));
}

Output

1534251817666             -> 2018-08-14T13:03:37.666Z
2017-01-01                -> 2017-01-01T00:00-05:00[America/New_York]
2017-01-01T00             -> 2017-01-01T00:00-05:00[America/New_York]
2017-01-01T00:03          -> 2017-01-01T00:03-05:00[America/New_York]
2017-01-01T00:03:00.5     -> 2017-01-01T00:03:00.500-05:00[America/New_York]
2017-01-01T03:03:00+00:00 -> 2017-01-01T03:03Z
2017-01-01T03:03:00-05:00 -> 2017-01-01T03:03-05:00
2017-01-01T03:03:00+0500  -> 2017-01-01T03:03+05:00
2017-01-01T03:03:00Z      -> 2017-01-01T03:03Z
20170101T030300Z          -> 2017-01-01T03:03Z
2017-W01-1                -> 2017-01-02T00:00-05:00[America/New_York]
2017W011                  -> 2017-01-02T00:00-05:00[America/New_York]
2017-001                  -> 2017-01-01T00:00-05:00[America/New_York]
2017001                   -> 2017-01-01T00:00-05:00[America/New_York]
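
Since the question asks for a java.util.Date, the ZonedDateTime returned by this variant still needs one conversion step. A minimal sketch, assuming the caller only needs the instant and not the zone (Date carries no zone information); the class and method names are hypothetical:

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.Date;

class ToLegacyDate {
    // Converts via Instant; the zone or offset of the input is dropped in the process.
    static Date toDate(ZonedDateTime zdt) {
        return Date.from(zdt.toInstant());
    }

    public static void main(String[] args) {
        ZonedDateTime zdt = ZonedDateTime.of(2017, 1, 1, 3, 3, 0, 0, ZoneOffset.ofHours(-5));
        System.out.println(toDate(zdt).toInstant()); // 2017-01-01T08:03:00Z
    }
}
```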
Andreas

I had a similar task and wrote a utility to deal with it. Unfortunately I no longer have the utility itself, but I wrote an article that describes the idea behind the solution: Java 8 java.time package: parsing any string to date. Despite the title, the idea could be implemented in versions earlier than 8 as well. Basically, the idea is to place all possible formats in a configuration file and try each format in turn until one parses your String successfully. The order of the formats is important, since a String can sometimes be parsed successfully by several formats, each producing a different Date value. So place the more important formats first. Read the article for details.
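
The try-each-format idea could look roughly like the following. This is a sketch and not the utility from the article: the class name is hypothetical, the formats are hard-coded in a list instead of being read from a configuration file, and only built-in java.time formatters are used, so inputs such as epoch milliseconds or 2017001 are not covered.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.TemporalAccessor;
import java.util.List;

class MultiFormatParser {
    // Hypothetical stand-in for the configuration file; earlier entries win.
    private static final List<DateTimeFormatter> FORMATS = List.of(
            DateTimeFormatter.ISO_OFFSET_DATE_TIME,
            DateTimeFormatter.ISO_LOCAL_DATE_TIME,
            DateTimeFormatter.ISO_LOCAL_DATE,
            DateTimeFormatter.BASIC_ISO_DATE,
            DateTimeFormatter.ISO_ORDINAL_DATE,
            DateTimeFormatter.ISO_WEEK_DATE);

    static TemporalAccessor parse(String text) {
        for (DateTimeFormatter format : FORMATS) {
            try {
                // parseBest returns the richest type the parsed fields support
                return format.parseBest(text,
                        OffsetDateTime::from, LocalDateTime::from, LocalDate::from);
            } catch (DateTimeParseException e) {
                // this format did not match; fall through to the next one
            }
        }
        throw new DateTimeParseException("No configured format matches", text, 0);
    }

    public static void main(String[] args) {
        System.out.println(parse("2017-01-01T03:03:00-05:00"));
        System.out.println(parse("2017-01-01"));
        System.out.println(parse("2017-W01-1"));
    }
}
```

Because the first matching format wins, reordering the list changes the result for any input that more than one format accepts.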

Michael Gantman