Extracting Prepositional Phrases from Sentence

Question

I'm trying to extract prepositional phrases from sentences using NLTK. Is there a way for me to do this automatically (e.g. feed a function a sentence and get back its prepositional phrases)?

The examples here seem to require that you start with a grammar before you can get a parse tree. Can I automatically get the grammar and use that to get the parse tree?

Obviously I could tag a sentence, pick out prepositions and the subsequent noun, but this is complicated when the prepositional complement is compound.

Maybe this post will help http://stackoverflow.com/questions/6115677/english-grammar-for-parsing-in-nltk — NLPer, Jul 25 '13 at 18:13

score 2 · Accepted Answer · edited Jan 07 '21 at 22:40


What you really want to do is fully parse your sentence with a robust statistical parser (e.g. the Stanford parser) and then look for constituents labeled PP:

(ROOT
  (S
    (NP (NNP John))
    (VP (VBZ lives)
      (PP (IN in)
        (NP (DT a) (NN house)))
      (PP (IN by)
        (NP (DT the) (NN sea))))))

I am not sure about NLTK's parsing abilities, or how accurate its parsing is if that feature exists, but it is not much of a problem to call an external parser from Python and then process the output. Using a parser will save you a lot of time and effort (since the parser takes care of everything), and it is the only reliable way to do this job.
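As a rough illustration of the "process the output" step, here is a minimal sketch that reads a bracketed parse like the one above and collects every constituent labeled PP. It uses only the standard library. The helper names (`read_tree`, `leaves`, `prepositional_phrases`) are made up for this example, and it assumes the parser's output has already been captured as a string. If NLTK is available, `nltk.Tree.fromstring` together with `Tree.subtrees` should do the same job with less code.

```python
# Hypothetical helpers for illustration; not part of NLTK or the Stanford parser.
# Assumes the parser output is a well-formed Penn-style bracketed string.

PARSE = """
(ROOT
  (S
    (NP (NNP John))
    (VP (VBZ lives)
      (PP (IN in)
        (NP (DT a) (NN house)))
      (PP (IN by)
        (NP (DT the) (NN sea))))))
"""


def read_tree(text):
    """Parse a bracketed string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        pos += 1                      # skip the opening "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())        # nested constituent
            else:
                children.append(tokens[pos])    # a word (leaf)
                pos += 1
        pos += 1                      # skip the closing ")"
        return (label, children)

    return parse()


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1] for word in leaves(child)]


def prepositional_phrases(node):
    """Yield the text of every PP constituent, outer phrases before nested ones."""
    if isinstance(node, str):
        return
    label, children = node
    if label == "PP":
        yield " ".join(leaves(node))
    for child in children:
        yield from prepositional_phrases(child)


print(list(prepositional_phrases(read_tree(PARSE))))
# prints ['in a house', 'by the sea']
```

Note that a PP nested inside another PP is reported twice, once as part of the outer phrase and once on its own. Whether you want that depends on what you do with the phrases afterwards.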

edited Jan 07 '21 at 22:40

kitaird


answered Jul 25 '13 at 21:19

dkar


Obviously a full parse is overkill, but it would get to the end goal. I'll give it a shot. Looks like there is [at least one](http://projects.csail.mit.edu/spatial/Stanford_Parser) Python interface to the Stanford parser. – Tim Hopper Jul 26 '13 at 12:34

I wouldn't call it overkill, but rather a necessary complication. If you try to build a rule-based PP recognizer, you will end up spending a lot of time and effort for mediocre results. – dkar Jul 26 '13 at 13:28

score 1 · Answer 2 · answered Mar 22 '16 at 23:52


I know the answer was already accepted, but a shallow parser will return the NLP chunks with minimal syntactic structure. This fairly linear result may be easier to work with. Here's an online demo of the CLiPS parser: http://www.clips.ua.ac.be/cgi-bin/webdemo/MBSP-instant-webdemo.cgi

Here's an example:

John gave the book to Mary

The [PNP] is easy to extract.

answered Mar 22 '16 at 23:52

Victor Stoddard



I tested this against multiple types of datasets, and it seems to perform better at extracting NPs and PNPs, especially for biomedical text. – kaulmonish Dec 11 '17 at 07:07
