13

I need to parse (German) dates that come in the following form:

10. Jan. 18:14
8. Feb. 19:02
1. Mär. 19:40
4. Apr. 18:55
2. Mai 21:55
5. Juni 08:25
5. Juli 20:09
1. Aug. 13:42
[...]

As you can see, month names longer than four characters are abbreviated. Even weirder (don't ask me why), March is shortened to Mär. although the full name is März. How can I parse this with java.time? (The dates are formatted according to the locale of the Android device that creates the list of dates. However, I'm not parsing them on Android.)

My approach was to create a DateTimeFormatter like this:

DateTimeFormatter.ofPattern("d. MMMM HH:mm").withLocale(Locale.GERMAN);
// or
DateTimeFormatter.ofPattern("d. MMMMM HH:mm").withLocale(Locale.GERMAN);

But neither the MMMM nor the MMMMM pattern matches the shortened dates. I can, of course, use the pattern d. MMM. HH:mm to match the shortened months, but then I can't match the three- and four-character months. I am aware that I can use two formatters (MMM. and MMMMM), but I would rather have a solution with only one formatter and possibly a custom locale or something similar.

assylias
rob
  • If you have control over that Android app, it would be a lot better to have it send the dates in a standard format. Localized formats should really only be used for user interaction, not for data exchange. If not, I think you probably should just remove the character before the space before parsing. – RealSkeptic Jun 13 '15 at 07:30
  • Unfortunately I don't have control over the Android app, otherwise I'd transmit the data in a structured form and use a Unix timestamp for dates :-). Just removing the character before the space, i.e. the dot, does not help because it still leaves me with a mix of shortened and full month names. – rob Jun 13 '15 at 07:36
  • No, if you remove the character before the space it also shortens Juli and Juni to Jul and Jun. – RealSkeptic Jun 13 '15 at 07:41
  • And Mai becomes Ma. The correct short month for März is Mrz. Because there will be other localizations than German I'd prefer an approach that does not require changes in the code for special cases. – rob Jun 13 '15 at 07:46
  • Ah, I missed that. In that case, I think your best option is to create a method that will simply replace the non-standard month name with the standard one before parsing. – RealSkeptic Jun 13 '15 at 07:55

4 Answers

12

The answer to the problem is the DateTimeFormatterBuilder class and the appendText(TemporalField, Map) method. It allows any text to be associated with a value when formatting or parsing, which solves the problem effectively and elegantly:

Map<Long, String> monthNameMap = new HashMap<>();
monthNameMap.put(1L, "Jan.");
monthNameMap.put(2L, "Feb.");
monthNameMap.put(3L, "Mär.");
// ... add the remaining months here; all 12 are required
DateTimeFormatter fmt = new DateTimeFormatterBuilder()
    .appendPattern("d. ")
    .appendText(ChronoField.MONTH_OF_YEAR, monthNameMap)
    .appendPattern(" HH:mm")
    .parseDefaulting(ChronoField.YEAR, 2016)
    .toFormatter();

System.out.println(LocalDateTime.parse("10. Jan. 18:14", fmt));
System.out.println(LocalDateTime.parse("8. Feb. 19:02", fmt));

Some notes:

  • The monthNameMap must be populated with all 12 months
  • The formatter should normally be assigned to a static final constant, rather than being created all the time
  • The parseDefaulting(YEAR, 2016) has been added so that LocalDateTime.parse(String, DateTimeFormatter) can be used directly. Without it, there would be no year, and thus nothing more than a TemporalAccessor could be parsed (the year must be a leap year, in case 29th Feb is being parsed)
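
The notes above can be put together in a minimal sketch with a fully populated map and a reusable constant. The class name `GermanMonthParser` and the method `parse` are made up for illustration. The spellings for January to August come from the question; those for September to December are assumptions, because the question elides them. The a-umlaut in "Mär." is written as a Unicode escape only to keep the source file ASCII-safe.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Map;

public class GermanMonthParser {

    // Built once and reused; DateTimeFormatter is immutable and thread-safe.
    private static final DateTimeFormatter FORMATTER = buildFormatter();

    private static DateTimeFormatter buildFormatter() {
        Map<Long, String> months = new HashMap<>();
        months.put(1L, "Jan.");
        months.put(2L, "Feb.");
        months.put(3L, "M\u00e4r."); // March, spelled with a-umlaut as in the question
        months.put(4L, "Apr.");
        months.put(5L, "Mai");
        months.put(6L, "Juni");
        months.put(7L, "Juli");
        months.put(8L, "Aug.");
        // Assumed spellings: the question elides September to December.
        months.put(9L, "Sep.");
        months.put(10L, "Okt.");
        months.put(11L, "Nov.");
        months.put(12L, "Dez.");

        return new DateTimeFormatterBuilder()
                .appendPattern("d. ")
                .appendText(ChronoField.MONTH_OF_YEAR, months)
                .appendPattern(" HH:mm")
                // 2016 is a leap year, so "29. Feb." can be resolved too.
                .parseDefaulting(ChronoField.YEAR, 2016)
                .toFormatter();
    }

    // Parses one line such as "5. Juni 08:25" into a LocalDateTime in 2016.
    public static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }
}
```

Because the month texts come from the map rather than from locale data, the result should not depend on which month abbreviations the running JDK ships for German.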
Nand
JodaStephen
  • This is exactly what I was looking for, thank you. This way I can store the content of the map somewhere else and can easily add new cases that deviate from standard locales. – rob Jun 13 '15 at 15:46
6

You could use a DateTimeFormatterBuilder:

private static final DateTimeFormatter formatter = new DateTimeFormatterBuilder()
            .appendOptional(DateTimeFormatter.ofPattern("d. MMM. HH:ss"))
            .appendOptional(DateTimeFormatter.ofPattern("d. MMMM HH:ss"))
            .toFormatter(Locale.GERMAN);

Running it on this:

Stream.of(("10. Jan. 18:14\n" +
           "8. Feb. 19:02\n" +
           "1. Mär. 19:40\n" +
           "4. Apr. 18:55\n" +
           "2. Mai 21:55\n" +
           "5. Juni 08:25\n" +
           "5. Juli 20:09\n" +
           "1. Aug. 13:42").split("\n"))
       .map(formatter::parse)
       .forEach(System.out::println);

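you get the following. Note that the pattern letters ss denote seconds, which is why the minutes show up as SecondOfMinute; use HH:mm to have them parsed as minutes: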

{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=10, MonthOfYear=1, MilliOfSecond=0, SecondOfMinute=14, HourOfDay=18},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=8, MonthOfYear=2, MilliOfSecond=0, SecondOfMinute=2, HourOfDay=19},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=1, MonthOfYear=3, MilliOfSecond=0, SecondOfMinute=40, HourOfDay=19},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=4, MonthOfYear=4, MilliOfSecond=0, SecondOfMinute=55, HourOfDay=18},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=2, MonthOfYear=5, MilliOfSecond=0, SecondOfMinute=55, HourOfDay=21},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=5, MonthOfYear=6, MilliOfSecond=0, SecondOfMinute=25, HourOfDay=8},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=5, MonthOfYear=7, MilliOfSecond=0, SecondOfMinute=9, HourOfDay=20},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=1, MonthOfYear=8, MilliOfSecond=0, SecondOfMinute=42, HourOfDay=13},ISO
user2336315
  • 2
    Elegant. I believe that `DateTimeFormatter.ofPattern("d. [MMM.][MMMM] HH:ss", Locale.GERMAN)` would do. The square brackets denote optional parts. – Ole V.V. Dec 12 '17 at 16:28
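
The optional-section syntax mentioned in the comment above can be sketched as follows. This is an illustration under stated assumptions, not the commenter's code: it uses `HH:mm` instead of `HH:ss`, adds a default year so that a `LocalDateTime` can be produced, and uses `Locale.ENGLISH` with English month names, because the German abbreviations available to `MMM` may differ between JDK versions. The class name `OptionalSectionDemo` is hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

public class OptionalSectionDemo {

    // [MMM.] accepts an abbreviated month followed by a dot,
    // [MMMM] accepts a full month name; each section is optional.
    private static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("d. [MMM.][MMMM] HH:mm")
            .parseDefaulting(ChronoField.YEAR, 2016)
            .toFormatter(Locale.ENGLISH);

    public static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }
}
```

With this formatter, both `OptionalSectionDemo.parse("10. Jan. 18:14")` and `OptionalSectionDemo.parse("5. June 08:25")` should succeed: when the first optional section fails partway through, the parser rolls back and tries the next one.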
2

As pointed out it would be easier to use a standard and consistent format - here you are mixing long and short month names.

One option (short of using a DateTimeFormatterBuilder) is to handle both cases separately:

private static final DateTimeFormatter SHORT_MONTH = DateTimeFormatter.ofPattern("d. MMM. HH:mm", Locale.GERMAN);
private static final DateTimeFormatter LONG_MONTH = DateTimeFormatter.ofPattern("d. MMMM HH:mm", Locale.GERMAN);
private static TemporalAccessor parse(String s) {
  try {
    return SHORT_MONTH.parse(s);
  } catch (DateTimeParseException e) {
    return LONG_MONTH.parse(s);
  }
}
assylias
  • I'm aware of this solution as I wrote in the question. I'm looking at the `DateTimeFormatterBuilder` right now. How would you use it to achieve this? – rob Jun 13 '15 at 07:56
1

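You can use a regex replacement to trim the month portion to at most three characters before parsing it with "d. MMM HH:mm":

text = text.replaceFirst("^(\\S+\\s\\S{3})\\S", "$1");

Explanation of the regex: starting at the beginning of the string (^), find one or more non-whitespace characters (\S+), followed by one whitespace character (\s), followed by three non-whitespace characters (\S{3}), followed by one more non-whitespace character, and replace the match with the contents of the first group ($1). The ^ anchor keeps the pattern from matching later in the string; without it, an input such as 2. Mai 21:55 would match at "Mai 21:5" and lose the last digit of the time.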

10. Jan. 18:14 will become 10. Jan 18:14 and 5. Juni 08:25 will become 5. Jun 08:25
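
The trimming step above can be sketched as a small helper. The class name `MonthTrimmer` and the method `trimMonth` are hypothetical, and the pattern is anchored with `^` so that only the leading "day month" part is considered; it covers only the normalization, not the subsequent parsing.

```java
import java.util.regex.Pattern;

public class MonthTrimmer {

    // Day token, one whitespace, three month characters, then one extra
    // character (a dot or a fourth letter) that gets dropped.
    private static final Pattern LONG_MONTH = Pattern.compile("^(\\S+\\s\\S{3})\\S");

    // Returns the text with the fourth month character removed, if there is one.
    public static String trimMonth(String text) {
        return LONG_MONTH.matcher(text).replaceFirst("$1");
    }
}
```

For example, `MonthTrimmer.trimMonth("5. Juni 08:25")` yields `5. Jun 08:25`, while `2. Mai 21:55` is returned unchanged because the month is followed directly by a space.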

gerrytan