5

I am researching Natural Language Processing algorithms that can read a piece of text and, if the text appears to contain a meeting request, set up that meeting for you automatically.

For example, if an email text reads:

"Let's meet tomorrow someplace in Downtown at 7pm."

The algorithm should be able to detect the time, date, and place of the event.

Does anyone know of existing NLP algorithms that I could use for this purpose? I have looked at some NLP resources (like NLTK and some tools in R), but without much success.

Thanks

arturomp
Darth.Vader
  • 1
    Why did I get a "-2" for my question? When one downvotes a question, can they also tell us the right way to do things so that it facilitates learning? – Darth.Vader Sep 30 '13 at 18:58
  • Possible duplicate of http://stackoverflow.com/questions/9294926/how-does-apple-find-dates-times-and-addresses-in-emails – mbatchkarov Oct 01 '13 at 16:31
  • @Darth.Vader I didn't downvote, but nobody needs to inform. But let me just say, your post is just a resource request, which can get closed. – 10 Rep Aug 15 '20 at 23:06

4 Answers

5

This is an application of information extraction, and can be solved more specifically with sequence segmentation algorithms like hidden Markov models (HMMs) or conditional random fields (CRFs).

For a software implementation, you might want to start with the MALLET toolkit from UMass-Amherst. It's a popular library that implements CRFs for information extraction.

You would treat each token in a sentence as something to be labeled with one of the fields you are interested in (or 'x' for none of the above), with the label predicted as a function of word features such as part of speech, capitalization, and dictionary membership. Something like this:

token       label       features
-----------------------------------
Let         x           POS=NNP, capitalized
's          x           POS=POS
meet        x           POS=VBP
tomorrow    DATE        POS=NN, inDateDictionary
someplace   x           POS=NN
in          x           POS=IN
Downtown    LOCATION    POS=NN, capitalized
at          x           POS=IN
7pm         TIME        POS=CD, matchesTimeRegex
.           x           POS=.

You will need to provide some hand-labeled training data first, though.
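
To make the table above concrete, here is a minimal Python sketch of how the per-token features could be computed. The names `DATE_WORDS`, `TIME_REGEX`, and `token_features` are hypothetical, the tiny dictionary and regex are stand-ins for real resources, and the previous/next-word features are an extra assumption not shown in the table. The POS features are left out because they would need a part-of-speech tagger.

```python
import re

# Hypothetical resources for illustration only; a real system would use
# a much larger date dictionary and a more complete time pattern.
DATE_WORDS = {"today", "tomorrow", "tonight", "monday", "tuesday",
              "wednesday", "thursday", "friday", "saturday", "sunday"}
TIME_REGEX = re.compile(r"^\d{1,2}(:\d{2})?\s?(am|pm)$", re.IGNORECASE)


def token_features(tokens, i):
    """Return a feature dict for tokens[i], plus its neighbouring words."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "capitalized": tok[:1].isupper(),
        "inDateDictionary": tok.lower() in DATE_WORDS,
        "matchesTimeRegex": bool(TIME_REGEX.match(tok)),
        # Context features (an assumption, not part of the table above).
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


# The example sentence, pre-tokenized and hand-labeled as in the table.
tokens = ["Let", "'s", "meet", "tomorrow", "someplace",
          "in", "Downtown", "at", "7pm", "."]
labels = ["x", "x", "x", "DATE", "x", "x", "LOCATION", "x", "TIME", "x"]

X = [token_features(tokens, i) for i in range(len(tokens))]

# One training sequence: a list of (features, label) pairs, one per token.
training_sequence = list(zip(X, labels))
```

Many such hand-labeled sequences would then be passed to a CRF or HMM trainer; the exact input format depends on the toolkit you choose.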

burr
2

You should have a look at the Apache OpenNLP Java toolkit: http://opennlp.apache.org

tom
0

I think you should be able to do this with spaCy. I tried this in a Jupyter notebook:

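import spacy
from spacy import displacy

# Load the small English pipeline (it must be downloaded beforehand)
nlp = spacy.load('en_core_web_sm')

doc = nlp('Over the last quarter in 2018-12-02 Apple sold nearly 20 thousand iPods for a profit of $6 million.')
# Highlight the recognized entities inline in the notebook
displacy.render(doc, style='ent', jupyter=True)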

Output

Over the last quarter DATE in 2018-12-02 DATE Apple ORG sold nearly 20 thousand CARDINAL iPods PRODUCT for a profit of $6 million MONEY .
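
For the meeting use case, you would probably read the entities programmatically instead of rendering them. Below is a rough sketch, assuming the `en_core_web_sm` model is installed; `pick_meeting_fields` and `extract_with_spacy` are hypothetical helper names, and the set of labels treated as meeting-relevant (`DATE`, `TIME`, `GPE`, `LOC`, `FAC`) is my own guess at what is useful. What the model actually tags in a given sentence can vary, so no output is shown.

```python
def pick_meeting_fields(entities):
    """Group (text, label) pairs by the labels relevant to a meeting."""
    # Assumed set of useful labels: dates, times, and place-like entities.
    wanted = {"DATE", "TIME", "GPE", "LOC", "FAC"}
    fields = {}
    for text, label in entities:
        if label in wanted:
            fields.setdefault(label, []).append(text)
    return fields


def extract_with_spacy(text, model="en_core_web_sm"):
    """Run spaCy's named entity recognizer and keep meeting-related entities."""
    import spacy  # imported lazily so pick_meeting_fields works without spaCy

    nlp = spacy.load(model)
    doc = nlp(text)
    return pick_meeting_fields((ent.text, ent.label_) for ent in doc.ents)
```

You could then call `extract_with_spacy("Let's meet tomorrow someplace in Downtown at 7pm.")` and inspect which fields come back. A vague place like "Downtown" may well be missed by a general-purpose model.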
0

This problem is still in the headlines today. If you are (still) looking for an algorithm, there are solutions built with ANTLR, a parser generator that supports a wide choice of programming languages (C/C++/C#/JS/Java/...). Some open-source references:

peter.cyc