
I want to use NLTK to find the most frequent nouns in a set of texts, together with the adjectives that describe those nouns, and then find synonyms of those nouns. After that, I want to cluster the documents.

import nltk

from nltk.tag import pos_tag

sentence = """Michale Scofield is an engineer. He is expert in java. He knows coding very well.
He think world is like a big system of computers.A big system of computer involves the servers, servers are nothing but super-fast machines"""

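# Tokenize with word_tokenize so punctuation is split off before tagging
# (needs the NLTK tokenizer and tagger data packages to be downloaded)
tagged_sent = pos_tag(nltk.word_tokenize(sentence))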
print(tagged_sent)

nouns = [word for word, pos in tagged_sent if pos in ('NN', 'NNP')]

print(nouns)

freq_nouns = nltk.FreqDist(nouns)

print(freq_nouns.most_common(3))

I tried the code above, but I could not get the most frequent nouns together with the adjectives describing them.

For the above sentence, I want the output to be:

expert java, big system, super-fast machines

Can someone please help me with this?

erip
Atiya
  • This is tricky because what happens if you say something like `big, red truck` or `My java is expert`? – erip Jan 10 '16 at 15:11
  • This problem has been described [here](http://stackoverflow.com/questions/32329039/get-corresponding-verbs-and-nouns-for-adverbs-and-adjectives). – erip Jan 10 '16 at 15:16
  • I agree with @erip. Depending on the ultimate task, a dependency parse to detect nouns and their adjuncts/modifiers might be more appropriate. What is the ultimate aim of extracting the adj+noun? What does your data look like? Which domain is it, and how specific is that domain? How simple is the English in the domain? What is the granularity of the input (sentence, paragraph, document, phrase, clause)? – alvas Jan 10 '16 at 15:42
  • I agree with @erip too. Thank you for pointing this out. – Atiya Jan 10 '16 at 20:27
  • @alvas I want to extract adj+noun from resumes and then cluster the resumes ("expert in java" and "excellent in java" resumes should fall into one cluster). – Atiya Jan 10 '16 at 20:29
  • The input will be resumes. – Atiya Jan 10 '16 at 20:33
  • Did you find a proper way to do it with Python? I am also working on a similar problem. – Nirali Khoda May 21 '18 at 10:43
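
For illustration only, here is a minimal sketch of the simplest version of the idea in the question: count (adjective, noun) pairs where an adjective tag (`JJ*`) directly precedes a noun tag (`NN*`). The helper name `adjective_noun_pairs` and the hand-tagged token list are assumptions made for this example; in practice the list would come from something like `nltk.pos_tag(nltk.word_tokenize(text))`, and the tagger may assign different tags than the ones written here. It does not handle the cases raised in the comments, such as `big, red truck` (only `red truck` is found) or `expert in java` (the adjective and noun are not adjacent), which is where a dependency parse would likely be needed.

```python
from collections import Counter


def adjective_noun_pairs(tagged):
    """Return (adjective, noun) pairs where an adjective directly precedes a noun.

    `tagged` is a list of (word, tag) tuples using Penn Treebank tags,
    the same shape that nltk.pos_tag returns.
    """
    pairs = []
    for (word, tag), (next_word, next_tag) in zip(tagged, tagged[1:]):
        if tag.startswith("JJ") and next_tag.startswith("NN"):
            pairs.append((word.lower(), next_word.lower()))
    return pairs


# Hand-tagged tokens (an assumption standing in for real tagger output)
tagged = [
    ("A", "DT"), ("big", "JJ"), ("system", "NN"), ("of", "IN"),
    ("computers", "NNS"), (".", "."),
    ("A", "DT"), ("big", "JJ"), ("system", "NN"), ("involves", "VBZ"),
    ("super-fast", "JJ"), ("machines", "NNS"), (".", "."),
]

# Count how often each adjective+noun pair occurs
pair_counts = Counter(adjective_noun_pairs(tagged))
print(pair_counts.most_common(2))
```

Matching on tag prefixes rather than exact tags means plural nouns (`NNS`) and comparative adjectives (`JJR`, `JJS`) are included as well, which an exact test for `NN` or `NNP` would miss.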

0 Answers