10

I'd love to know if there is a module to parse "human formatted" dates in Perl. I mean things like "tomorrow", "Tuesday", "next week", "1 hour ago".

My research on CPAN suggests that there is no such module, so how would you go about creating one? NLP is way over the top for this.

brian d foy
andymurd

4 Answers

23

Date::Manip does exactly this.

Here is an example program:

#!/usr/bin/perl

use strict;
use Date::Manip;

while (<DATA>)
{
  chomp;
  print UnixDate($_, "%Y-%m-%d %H:%M:%S"),  " ($_)\n";
}

__DATA__
today
yesterday
tomorrow
last Tuesday
next Tuesday
1 hour ago
next week

Which results in the following output:

2008-11-17 15:21:04 (today)
2008-11-16 15:21:04 (yesterday)
2008-11-18 15:21:04 (tomorrow)
2008-11-11 00:00:00 (last Tuesday)
2008-11-18 00:00:00 (next Tuesday)
2008-11-17 14:21:04 (1 hour ago)
2008-11-24 00:00:00 (next week)

UnixDate is one of the functions provided by Date::Manip. Its first argument is a date/time in any format that the module supports, and its second argument describes how to format the result. Other functions (ParseDate, for example) just parse these "human" dates without formatting them, so the result can be used in delta calculations and the like.
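As a rough sketch of the parse-then-calculate route mentioned above, the following uses ParseDate and DateCalc from Date::Manip's functional interface. The phrases "next Tuesday" and "+ 3 days" are just illustrative inputs, and the exact result depends on the day you run it and on your Date::Manip version and configuration.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use Date::Manip;

# Parse a "human" date without formatting it yet.
# ParseDate returns a false value when it cannot make sense of the input.
my $date = ParseDate("next Tuesday");
die "could not parse date\n" unless $date;

# Apply a relative offset to the parsed date.
# $err is set by DateCalc if the calculation fails.
my $err;
my $later = DateCalc($date, "+ 3 days", \$err);
die "date calculation failed\n" if $err;

# Format both values only at the point of output.
print UnixDate($date,  "%Y-%m-%d"), " (next Tuesday)\n";
print UnixDate($later, "%Y-%m-%d"), " (three days after that)\n";
```

Keeping the parsed value around and formatting it only at the end makes it easy to chain further calculations.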

cjm
Robert Gamble
  • Ah, good old Date::Manip... How can you not love a module that tries so hard to talk you out of using it? – Dave Sherohman Nov 18 '08 at 01:53
  • Exactly what I was looking for, but (as usual) I didn't know how to phrase the question. Thanks. – andymurd Nov 18 '08 at 10:07
  • I think DateTime is what the community now pushes the most as a comprehensive Date & Time solution. Check out datetime.perl.org and this article http://www.perl.com/pub/a/2003/03/13/datetime.html – draegtun Nov 18 '08 at 10:58
9

You may also find it interesting to look at the DateTime::Format family, specifically DateTime::Format::Natural. Once you've parsed your date/time into a DateTime object, you can manipulate and evaluate it in a whole bunch of different ways.

Here's a sample program:

use strict;
use warnings;

use DateTime::Format::Natural;

my( $parser ) = DateTime::Format::Natural->new;

while ( <> ) {

    chomp;
    my( $dt ) = $parser->parse_datetime( $_ );

    if ( $parser->success ) {

        print join( ' ', $dt->ymd, $dt->hms ) . "\n";
    }
    else {

        print $parser->error . "\n";
    }
}

output:

tomorrow  
2008-11-18 21:48:49  
next Tuesday  
2008-11-25 21:48:53  
1 week from now  
2008-11-24 21:48:57  
1 hour ago  
2008-11-17 20:48:59  
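To illustrate the point about manipulating the result once it is a DateTime object, here is a minimal sketch. The input phrase and the one-week offset are arbitrary choices for the example, and the printed values depend on when you run it.

```perl
use strict;
use warnings;

use DateTime;
use DateTime::Format::Natural;

my $parser = DateTime::Format::Natural->new;

# Parse a relative phrase; bail out with the parser's own message on failure.
my $dt = $parser->parse_datetime('tomorrow');
die $parser->error . "\n" unless $parser->success;

# clone first, because add() modifies the object it is called on.
my $week_later = $dt->clone->add( weeks => 1 );

print $dt->day_name, ' ', $dt->ymd, "\n";
print 'one week later: ', $week_later->ymd, "\n";
```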

TMTOWTDI :)

-steve

hakamadare
2

Personally, I've always used Time::ParseDate for this. It understands pretty much every format I've tried.

Absolute date formats:

    Dow, dd Mon yy
    Dow, dd Mon yyyy
    Dow, dd Mon
    dd Mon yy
    dd Mon yyyy
    Month day{st,nd,rd,th}, year
    Month day{st,nd,rd,th}
    Mon dd yyyy
    yyyy/mm/dd
    yyyy-mm-dd      (usually the best date specification syntax)
    yyyy/mm
    mm/dd/yy
    mm/dd/yyyy
    mm/yy
    yy/mm      (only if year > 12, or > 31 if UK)
    yy/mm/dd   (only if year > 12 and day < 32, or year > 31 if UK)
    dd/mm/yy   (only if UK, or an invalid mm/dd/yy or yy/mm/dd)
    dd/mm/yyyy (only if UK, or an invalid mm/dd/yyyy)
    dd/mm      (only if UK, or an invalid mm/dd)

Relative date formats:

    count "days"
    count "weeks"
    count "months"
    count "years"
    Dow "after next"
    Dow "before last"
    Dow                     (requires PREFER_PAST or PREFER_FUTURE)
    "next" Dow
    "tomorrow"
    "today"
    "yesterday"
    "last" dow
    "last week"
    "now"
    "now" "+" count units
    "now" "-" count units
    "+" count units         
    "-" count units
    count units "ago"

Absolute time formats:

    hh:mm:ss[.ddd] 
    hh:mm 
    hh:mm[AP]M
    hh[AP]M
    hhmmss[[AP]M] 
    "noon"
    "midnight"

Relative time formats:

    count "minutes"         (count can be fractional "1.5" or "1 1/2")
    count "seconds"
    count "hours"
    "+" count units
    "+" count
    "-" count units
    "-" count
    count units "ago"

Timezone formats:

    [+-]dddd
    GMT[+-]d+
    [+-]dddd (TZN)
    TZN

Special formats:

    [ d]d/Mon/yyyy:hh:mm:ss [[+-]dddd]
    yy/mm/dd.hh:mm
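For completeness, a minimal usage sketch with the module's parsedate function. The fixed NOW value is an arbitrary reference time chosen for this example so that relative phrases are repeatable, and the phrases themselves are only sample inputs. NOW and PREFER_FUTURE are options described in the module's documentation.

```perl
use strict;
use warnings;

use POSIX qw(strftime);
use Time::ParseDate;

# Arbitrary reference time for the example: 2008-11-17 12:00:00 UTC.
my $now = 1226923200;

for my $phrase ( 'tomorrow', 'next Tuesday', '1 hour ago', 'gibberish' ) {

    # In list context parsedate returns the epoch seconds plus any
    # unparsed remainder, or undef plus an error message on failure.
    my ( $secs, $rest ) = parsedate( $phrase, NOW => $now, PREFER_FUTURE => 1 );

    if ( defined $secs ) {
        print strftime( '%Y-%m-%d %H:%M:%S', localtime $secs ), " ($phrase)\n";
    }
    else {
        print "could not parse '$phrase'",
            ( defined $rest ? ": $rest" : '' ), "\n";
    }
}
```

The result is a plain epoch value, so it plugs straight into localtime, strftime or simple arithmetic.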
cjm
-2

I assume you have context. How could NLP help here? As a wild guess, you could just find the nearest date that is an exact date (not relative to today) and use today/tomorrow/yesterday to relate to that.

xxxxxxx