7

I have a requirement where dates can arrive in any of the following formats before being indexed into Solr. Here are examples of the dates being passed:

String dateStr = "2012-05-23T00:00:00-0400";
String dateStr1 = "May 24, 2012 04:57:40 GMT";
String dateStr2 = "2011-06-21";
    
The standard Solr format is "yyyy-MM-dd'T'HH:mm:ss'Z'".

I've tried SimpleDateFormat but have not been able to write a generic program that supports the various formats; it ends up throwing parse exceptions.

I also tried Joda-Time, but have not been successful so far with the UTC conversion.

// ISO_PARSE_FORMAT, ZONE_UTC and ISO_PRINT_FORMAT are constants defined elsewhere (not shown).
public static String toUtcDate(final String iso8601) {
    DateTime dt = ISO_PARSE_FORMAT.parseDateTime(iso8601);
    DateTime utcDt = dt.withZone(ZONE_UTC);
    return utcDt.toString(ISO_PRINT_FORMAT);
}

Is there a standard library to achieve this?

Any pointers will be appreciated.

Thanks

Shamik

2 Answers

15

I just try the various formats until I get a hit:

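import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public static String toUtcDate(String dateStr) {
    TimeZone utc = TimeZone.getTimeZone("UTC");
    SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
    out.setTimeZone(utc); // the literal 'Z' is only truthful if the output is rendered in UTC
    // Add other parsing formats to try as you like. Keep the most specific first:
    // parse(String) ignores trailing text, so "yyyy-MM-dd" would also match a full timestamp.
    String[] dateFormats = {"yyyy-MM-dd'T'HH:mm:ssZ", "MMM dd, yyyy HH:mm:ss z", "yyyy-MM-dd"};
    for (String dateFormat : dateFormats) {
        try {
            SimpleDateFormat parser = new SimpleDateFormat(dateFormat, Locale.ENGLISH);
            parser.setTimeZone(utc); // assumed zone for inputs that carry none
            return out.format(parser.parse(dateStr));
        } catch (ParseException ignore) { }
    }
    throw new IllegalArgumentException("Invalid date: " + dateStr);
}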

I'm not aware of a library that does this.
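
For readers on Java 8 or later, here is a minimal sketch of the same try-each-format idea using java.time instead of SimpleDateFormat. The class name SolrDates, the formatter constant names, and the decision to treat a date without a zone as midnight UTC are all assumptions made for illustration; the patterns cover only the three sample inputs from the question.

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

public class SolrDates {
    // Hypothetical names; each formatter matches one sample input from the question.
    private static final DateTimeFormatter OFFSET_NO_COLON =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssZ", Locale.ENGLISH);
    private static final DateTimeFormatter LONG_WITH_ZONE =
            DateTimeFormatter.ofPattern("MMM d, yyyy HH:mm:ss z", Locale.ENGLISH);
    // Output in the Solr layout; the zone override makes the literal 'Z' correct.
    private static final DateTimeFormatter SOLR_OUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'").withZone(ZoneOffset.UTC);

    public static String toUtcDate(String dateStr) {
        try {
            // e.g. 2012-05-23T00:00:00-0400
            return SOLR_OUT.format(OffsetDateTime.parse(dateStr, OFFSET_NO_COLON).toInstant());
        } catch (DateTimeParseException ignore) { }
        try {
            // e.g. May 24, 2012 04:57:40 GMT
            return SOLR_OUT.format(ZonedDateTime.parse(dateStr, LONG_WITH_ZONE).toInstant());
        } catch (DateTimeParseException ignore) { }
        try {
            // e.g. 2011-06-21; no zone in the input, so midnight UTC is assumed
            return SOLR_OUT.format(LocalDate.parse(dateStr).atStartOfDay(ZoneOffset.UTC).toInstant());
        } catch (DateTimeParseException ignore) { }
        throw new IllegalArgumentException("Invalid date: " + dateStr);
    }
}
```

Unlike SimpleDateFormat.parse(String), these parse calls reject input with unparsed trailing text, so the order of the attempts does not matter in this sketch.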

Bohemian
  • Thanks for your reply. I was trying to avoid this solution and was looking for a generic one that can cater to any date format, as long as it's valid. The problem with this approach is that every time there's a new format, the code breaks unless we add it to the array. I was hoping there was an available tool which already caters to this. – Shamik May 26 '12 at 00:05
  • I once wrote something like this, and I could only think of about 1 dozen or so - people don't use that many different date formats actually. The main problem was deciding if `01-02-2012` is Jan or Feb, because it can be parsed OK with both ISO `dd-MM-yyyy` *and* the stupid american format `MM-dd-yyyy`. – Bohemian May 26 '12 at 00:08
  • The American format is easy to visually parse. And calling it "stupid" is no way to answer for a lack of proper i18n. – Joe Coder Jan 11 '13 at 02:17
  • @JoeCoder I think "stupid" is appropriate, because it makes no sense. Either go most-to-least significant, or vice versa, but not some jumble like the US format. This opinion is further reinforced by noting that **The USA is the only country IN THE WORLD that uses "US" format month-day-year**. I assume you are american - saying it is "easy to visually parse" could only be the opinion of someone who is used to using it... the other 5 billion non-US people on the planet would disagree, because it is in fact anything *but* easy to parse. – Bohemian Jan 11 '13 at 04:40
  • @Bohemian Pretty insulting, you suggest that my context precludes any rational analysis, but not you, and you defend it with an absurd ad populum fallacy, composed entirely of people outside the culture that derived the syntax. What is the purpose of language? Why do we bother with i18n? Why don't you take foreign syntax for granted? If it's useful, fast to parse, and semantically equivalent, wouldn't it be "stupid" NOT to use it? – Joe Coder Jan 11 '13 at 06:09
  • As a native of the culture of the language of this particular lexical expression, it's VERY FAST for me to parse. But you don't believe that, hah. That's the really snotty part about your paragraph, is that you are implying that American culture is some sort of subset of yours. I think vegemite tastes like shit, but you don't see me saying "vegemite is stupid". – Joe Coder Jan 11 '13 at 06:17
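
The ambiguity raised in the comments above (whether `01-02-2012` is January or February) can be reproduced with a small sketch. The class and method names here are hypothetical, and java.time (Java 8 or later) is assumed; the point is only that the same text parses without error under both patterns, so trying formats in sequence cannot tell them apart.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class AmbiguousDate {
    // Parses the text with the given pattern; throws DateTimeParseException on mismatch.
    public static LocalDate parseWith(String pattern, String text) {
        return LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        String text = "01-02-2012";
        // Both calls succeed, but they yield different calendar dates.
        System.out.println(parseWith("dd-MM-yyyy", text)); // day first
        System.out.println(parseWith("MM-dd-yyyy", text)); // month first
    }
}
```

Whichever pattern is listed first in a try-each-format loop silently wins for such inputs, so the order has to come from knowledge of the data source.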
0

Here is the answer: Converting ISO 8601-compliant String to java.util.Date

Once you have your Date, you know how to get your UTC time.
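
As a minimal sketch of that last step, assuming the Solr layout quoted in the question: a java.util.Date has no zone of its own, so the UTC conversion happens in the formatter. The class and method names are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class UtcPrinter {
    // Renders a Date in the Solr layout. The formatter's zone is set to UTC
    // so that the literal 'Z' suffix is correct.
    public static String toSolrUtc(Date date) {
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        return out.format(date);
    }
}
```

A new SimpleDateFormat is created per call because the class is not thread-safe.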


Edit: The accepted answer there doesn't use Joda-Time but JAXB.

By the way, where do these formats come from?

String dateStr = "2012-05-23T00:00:00-0400";
String dateStr1 = "May 24, 2012 04:57:40 GMT";
String dateStr2 = "2011-06-21";

If they differ from one locale to another, they may have been generated by DateFormat.getDateTimeInstance(...,...), so perhaps try to figure out which one was used.

Sebastien Lorber
  • Thanks for your reply, but I don't think Joda time is able to handle this format "May 24, 2012 04:57:40 GMT" unless you explicitly format it. – Shamik May 26 '12 at 00:18
  • I don't have control over the dates, they are being crawled from different source and are being indexed. – Shamik May 26 '12 at 16:22