
I'm reading from a few RSS sites that don't send dates in one of the typical formats:

  • ISO 8601 representation
    • 2019-06-12T07:17:47Z - Instant.parse() can be used
  • RFC1123
    • Wed, 12 Jun 2019 03:17:47 -0400 - DateTimeFormatter.RFC_1123_DATE_TIME.parse() can be used

Instead I'm getting these strings:

  • Tue, 25 May 2021 00:00:00 EDT
  • 03 Jun 2021 18:35:00 HKT

I've already experimented with some custom patterns and the parse() methods of ZonedDateTime and OffsetDateTime, but I haven't found a way to get a date-time representation that I can convert into an Instant. I don't control the source, so I can't fix the output format.

How can I be more lenient and parse these date times?

Niklas
  • `Tue, 25 May 2021 00:00:00 EDT` really conforms to RFC 1123 too. Only unfortunately Java’s implementation is limited: *North American zone names … are not handled.* ([Doc](https://docs.oracle.com/javase/10/docs/api/java/time/format/DateTimeFormatter.html#RFC_1123_DATE_TIME)) – Ole V.V. Jun 06 '21 at 17:09
  • @OleV.V. Alternative: My lib [Time4J](http://time4j.net/javadoc-en/net/time4j/format/expert/ChronoFormatter.html#RFC_1123) handles North American zone names. – Meno Hochschild Jun 07 '21 at 10:50

1 Answer


You can create a DateTimeFormatter with a custom pattern in which the day-of-week at the beginning is optional (the part in square brackets). Then call the formatter's parse method with a query such as Instant::from, which lets you specify the desired type of the parsed date-time directly (as suggested in a comment by Ole V.V.). Another approach is to parse the string as a ZonedDateTime first and then convert it to an Instant.

DateTimeFormatter formatter =
        DateTimeFormatter.ofPattern("[EEE, ]dd MMM yyyy HH:mm:ss zzz", Locale.ENGLISH);

String input1 = "Tue, 25 May 2021 00:00:00 EDT";
Instant instant1 = formatter.parse(input1, Instant::from);
// Instant instant1 = ZonedDateTime.parse(input1, formatter).toInstant();
System.out.println(instant1);

String input2 = "03 Jun 2021 18:35:00 HKT";
Instant instant2 = formatter.parse(input2, Instant::from);
// Instant instant2 = ZonedDateTime.parse(input2, formatter).toInstant();
System.out.println(instant2);

Output:

2021-05-25T04:00:00Z
2021-06-03T10:35:00Z
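
If the feeds mix these formats, one option is to try several formatters in turn. The following is only a minimal sketch of that idea, not part of the original answer: the class name `LenientInstantParser` and its `parse` method are hypothetical, and it assumes that ISO 8601, RFC 1123 with a numeric offset, and the custom zone-abbreviation pattern above are the only formats that occur. How abbreviations such as EDT or HKT resolve depends on the JDK's zone name data.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper: tries each known formatter until one succeeds.
final class LenientInstantParser {

    // Custom pattern from the answer: optional day-of-week, zone abbreviation at the end.
    private static final DateTimeFormatter ZONE_TEXT =
            DateTimeFormatter.ofPattern("[EEE, ]dd MMM yyyy HH:mm:ss zzz", Locale.ENGLISH);

    // Order matters only for speed here; each formatter rejects the other formats.
    private static final DateTimeFormatter[] FORMATTERS = {
            DateTimeFormatter.ISO_INSTANT,        // 2019-06-12T07:17:47Z
            DateTimeFormatter.RFC_1123_DATE_TIME, // Wed, 12 Jun 2019 03:17:47 -0400
            ZONE_TEXT                             // Tue, 25 May 2021 00:00:00 EDT
    };

    static Instant parse(String text) {
        DateTimeParseException last = null;
        for (DateTimeFormatter formatter : FORMATTERS) {
            try {
                return formatter.parse(text, Instant::from);
            } catch (DateTimeParseException e) {
                last = e; // remember the failure and try the next formatter
            }
        }
        throw last; // no formatter matched: rethrow the last failure
    }

    public static void main(String[] args) {
        System.out.println(parse("2019-06-12T07:17:47Z"));
        System.out.println(parse("Wed, 12 Jun 2019 03:17:47 -0400"));
        System.out.println(parse("Tue, 25 May 2021 00:00:00 EDT"));
        System.out.println(parse("03 Jun 2021 18:35:00 HKT"));
    }
}
```

A new format then only needs another entry in the array; strings that match none of the formatters still fail with a DateTimeParseException instead of being silently misread.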
Matt
  • [Never use SimpleDateFormat or DateTimeFormatter without a Locale](https://stackoverflow.com/a/65544056/10819573) – Arvind Kumar Avinash Jun 06 '21 at 15:49
  • Good answer. Possible alternative: `Instant instant1 = formatter.parse(input1, Instant::from);` (same for `input2`). Feels a bit simpler to me, but it’s a matter of taste. Nice and clever use of optional part for the day of week (the square brackets)! – Ole V.V. Jun 06 '21 at 17:13
  • @OleV.V. Thanks for sharing your alternative which I personally find better since no additional conversion from `ZonedDateTime` to `Instant` is necessary. I updated my answer accordingly. – Matt Jun 06 '21 at 18:55
  • @Matt thanks a lot. Learned also about []. I've extended it to: `[EEE, ]dd MMM yyyy HH:mm:ss zzz[Z]` to be able to also parse dates such as `Mon, 22 Mar 2021 18:39:18 GMT+0100`. Thanks a lot! – Niklas Jun 07 '21 at 10:14