I'm using SUTime / Stanford NLP, and it's doing a great job, but I can't figure out how to make it read regular numeric date formats.

For instance:

'we went at 27/10/1988 to the event'

It returns null.

For an expression like 'we went at october 27th 1988 to the event', it works just fine.

Any ideas?

cheers

pelumi
gCoh
  • Are you sure it's not expecting US formats by default? Did you try with `10/27/1988`? – Blacksad May 27 '14 at 13:10
  • Yes, you are right. Now I'm looking for a way to change the input format. Do you have any idea how it can be done? – gCoh May 28 '14 at 10:03
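
To illustrate the point raised in the comments: a month-first pattern cannot accept 27/10/1988, because there is no month 27. The sketch below uses the standard `java.time` API as a stand-in, not SUTime itself, so it only demonstrates the ambiguity between the two orderings; the class and method names (`DateOrderDemo`, `tryParse`) are made up for this example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class DateOrderDemo {
    // Month-first (US) and day-first orderings of the same numeric layout.
    static final DateTimeFormatter US = DateTimeFormatter.ofPattern("MM/dd/yyyy");
    static final DateTimeFormatter DAY_FIRST = DateTimeFormatter.ofPattern("dd/MM/yyyy");

    // Returns the parsed date, or null when the text does not fit the pattern.
    static LocalDate tryParse(String text, DateTimeFormatter fmt) {
        try {
            return LocalDate.parse(text, fmt);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("27/10/1988", US));        // null (month 27 is invalid)
        System.out.println(tryParse("10/27/1988", US));        // 1988-10-27
        System.out.println(tryParse("27/10/1988", DAY_FIRST)); // 1988-10-27
    }
}
```

The same string is rejected under one ordering and accepted under the other, which is consistent with SUTime returning null when only month-first rules are present.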

3 Answers

I am not experienced with the Stanford temporal package, but it is probably not tuned for that date format.

I suggest you take a look at this: http://cogcomp.cs.illinois.edu/page/software_view/IllinoisTemporalExtractor

It is essentially based on HeidelTime: https://code.google.com/p/heideltime/

Daniel

OK everyone, I think I got it.

In sutime/english.sutime.txt, around line 319, there are a few patterns for US-style date tagging:

{ ruleType: "time", pattern: /yyyy-?MM-?dd-?'T'HH(:?mm(:?ss([.,]S{1,3})?)?)?(Z)?/ }
{ ruleType: "time", pattern: /yyyy-MM-dd/ }
{ ruleType: "time", pattern: /'T'HH(:?mm(:?ss(.,)?)?)?(Z)?/ }
# Tokenizer "sometimes adds extra slash
{ ruleType: "time", pattern: /yyyy\\?\/MM\\?\/dd/ }
{ ruleType: "time", pattern: /MM?\\?\/dd?\\?\/(yyyy|yy)/ }
{ ruleType: "time", pattern: /MM?-dd?-(yyyy|yy)/ }
{ ruleType: "time", pattern: /HH?:mm(:ss)?/ }
{ ruleType: "time", pattern: /yyyy-MM/ }

You just need to add a few more rules with the day and month in the order you need.

gCoh

I'll put this here in case someone finds it useful.

The problem is that some date formats are not supported out of the box.

If you take a look at the sutime/english.sutime.txt file, you'll see lines like the ones below. The TODO there suggests that other formats can still be added. I added two more to mine, as shown:

  # TODO: Support other timezone formats
  { ruleType: "time", pattern: /yyyy-?MM-?dd-?'T'HH(:?mm(:?ss([.,]S{1,3})?)?)?(Z)?/ }
  { ruleType: "time", pattern: /yyyy-MM-dd/ }
  { ruleType: "time", pattern: /'T'HH(:?mm(:?ss([.,](S{1,3}))?)?)?(Z)?/ }
  #The entries below are newly added to support other time formats.
  { ruleType: "time", pattern: /dd\/MM\/yyyy/ }
  { ruleType: "time", pattern: /dd-MM-yyyy/ }

The newly added entries enable SUTime to correctly identify dates of the form:

20-12-2014 or 28/12/2014

which matches the format the OP needs.
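
If editing the bundled rules file is not an option, one possible workaround (not part of the answers here, just a sketch) is to rewrite day-first dates into ISO form before handing the text to SUTime, since the rule listing above already includes a `yyyy-MM-dd` pattern. The class and method names (`DayFirstNormalizer`, `toIso`) are hypothetical, and the code assumes every `d/M/yyyy` or `d-M-yyyy` date in the input is day-first.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DayFirstNormalizer {
    // d/M/yyyy or d-M-yyyy; the back-reference \2 requires the same separator twice.
    private static final Pattern DAY_FIRST =
        Pattern.compile("\\b(\\d{1,2})([/-])(\\d{1,2})\\2(\\d{4})\\b");

    // Rewrites day-first dates to yyyy-MM-dd; leaves implausible matches untouched.
    static String toIso(String text) {
        Matcher m = DAY_FIRST.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int day = Integer.parseInt(m.group(1));
            int month = Integer.parseInt(m.group(3));
            String replacement;
            if (month < 1 || month > 12 || day < 1 || day > 31) {
                // Not a plausible day-first date (e.g. 10/27/1988): keep as is.
                replacement = Matcher.quoteReplacement(m.group());
            } else {
                replacement = String.format("%s-%02d-%02d", m.group(4), month, day);
            }
            m.appendReplacement(out, replacement);
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toIso("we went at 27/10/1988 to the event"));
        // we went at 1988-10-27 to the event
    }
}
```

Note that this only checks rough ranges (it would still rewrite 31/02/2014), and a date such as 05/06/2014 is always read as 5 June, so it is only suitable when the input is known to be day-first.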

pelumi