NLP: Validate a sentence against a given grammar

Question

I have a corpus of English sentences

sentences = [
    "Mary had a little lamb.",
    "John has a cute black pup.",
    "I ate five apples."
]

and a grammar (for the sake of simplicity)

grammar = ('''
    NP: {<NNP><VBZ|VBD><DT><JJ>*<NN><.>} # NP
    ''')

I wish to filter out the sentences which don't conform to the grammar. Is there a built-in NLTK function which can achieve this? In the above example, the first two sentences follow the pattern of my grammar, but the last one does not.

If you're just extracting nouns, see https://stackoverflow.com/q/49564176/610569 — alvas, May 10 '19 at 08:55

score 1 · Answer 1 · answered May 10 '19 at 09:00

TL;DR

Write a grammar, parse each POS-tagged sentence with it, then iterate through the subtrees of the result and look for the non-terminal you care about, e.g. NP.

Code:

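import nltk
# pos_tag and word_tokenize need the NLTK tagger and tokenizer data (see nltk.download()).
from nltk import pos_tag, word_tokenize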

grammar = ('''
    NP: {<NNP><VBZ|VBD><DT><JJ>*<NN><.>} # NP
    ''')

sentences = [
    "Mary had a little lamb.",
    "John has a cute black pup.",
    "I ate five apples."
]

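def has_noun_phrase(sentence):
    # Tokenize, POS-tag, then chunk the sentence with the grammar.
    parsed = chunkParser.parse(pos_tag(word_tokenize(sentence)))
    for subtree in parsed:
        # Chunks are nltk.Tree nodes; unchunked tokens are plain (word, tag) tuples.
        if isinstance(subtree, nltk.Tree) and subtree.label() == 'NP':
            return True
    return False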

chunkParser = nltk.RegexpParser(grammar)
for sentence in sentences:
    print(has_noun_phrase(sentence))
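
Note that has_noun_phrase returns True as soon as any part of the sentence matches the pattern. If the goal is to check that the whole sentence conforms, one possible variation is to require that the parse consists of exactly one NP chunk and nothing else. The sketch below is an assumption about what "conforms" should mean, not part of the original answer; the name conforms is hypothetical, and the tagged tokens are written by hand (standing in for the output of pos_tag) so that it runs without the tagger data.

```python
import nltk

grammar = r'''
    NP: {<NNP><VBZ|VBD><DT><JJ>*<NN><.>}
'''
chunk_parser = nltk.RegexpParser(grammar)


def conforms(tagged_tokens):
    """Return True only if the single NP chunk covers every token."""
    tree = chunk_parser.parse(tagged_tokens)
    children = list(tree)
    return (
        len(children) == 1
        and isinstance(children[0], nltk.Tree)
        and children[0].label() == "NP"
    )


# Hand-written (word, tag) pairs; in practice these would come from
# pos_tag(word_tokenize(sentence)).
mary = [("Mary", "NNP"), ("had", "VBD"), ("a", "DT"),
        ("little", "JJ"), ("lamb", "NN"), (".", ".")]
apples = [("I", "PRP"), ("ate", "VBD"), ("five", "CD"),
          ("apples", "NNS"), (".", ".")]
# Same as `mary` with one extra leading token that falls outside the chunk.
and_mary = [("And", "CC")] + mary

for tagged in (mary, apples, and_mary):
    print(conforms(tagged))
```

Here and_mary still contains an NP chunk, so has_noun_phrase would accept it, while conforms rejects it because the leading token is left outside the chunk.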

score 0 · Answer 2 · answered May 10 '19 at 01:36

NLTK supports POS tagging. You can first apply POS tagging to your sentences and then compare the resulting tag sequence with the pre-defined grammar.
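
A minimal sketch of that idea, using only the standard library: join the tags of a tagged sentence into a string and match it against a regular expression equivalent to the grammar in the question. The names TAG_PATTERN and matches_pattern are hypothetical, and the tagged sentences are written by hand as a stand-in for what nltk.pos_tag(nltk.word_tokenize(sentence)) might return; the real tagger output may differ.

```python
import re

# Regex over a space-separated tag string, mirroring
# <NNP><VBZ|VBD><DT><JJ>*<NN><.> from the question.
TAG_PATTERN = re.compile(r"NNP (?:VBZ|VBD) DT (?:JJ )*NN \.")


def matches_pattern(tagged_tokens):
    """Return True if the full tag sequence matches TAG_PATTERN."""
    tag_string = " ".join(tag for _word, tag in tagged_tokens)
    return TAG_PATTERN.fullmatch(tag_string) is not None


# Hand-written (word, tag) pairs, assumed rather than produced by a tagger.
tagged_sentences = [
    [("Mary", "NNP"), ("had", "VBD"), ("a", "DT"),
     ("little", "JJ"), ("lamb", "NN"), (".", ".")],
    [("John", "NNP"), ("has", "VBZ"), ("a", "DT"), ("cute", "JJ"),
     ("black", "JJ"), ("pup", "NN"), (".", ".")],
    [("I", "PRP"), ("ate", "VBD"), ("five", "CD"),
     ("apples", "NNS"), (".", ".")],
]

results = [matches_pattern(tagged) for tagged in tagged_sentences]
print(results)  # [True, True, False]
```

Because fullmatch is used, the whole tag sequence has to fit the pattern, which matches the asker's goal of filtering out sentences that do not conform.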

Giang Nguyen

But that doesn't solve my problem. My grammar has a pre-defined structure and I wish to validate whether the grammar returned by nltk.pos_tag() is _similar_ or not. I could write my own parser to validate my regex grammar against the one returned but I'm looking for an inbuilt validator. – rocx May 10 '19 at 05:56
I don't know, maybe you need to do it on your own. Sorry. – Giang Nguyen May 10 '19 at 06:32
