
I am trying to count, for a given noun, all of its hyponyms that have no hyponyms themselves (that is, the nouns that are terminal in the hierarchy below that noun). For example, for ‘entity’ (the highest noun in the hierarchy), the result should be the count of all nouns that have no hyponyms (all nouns that are terminal in the hierarchy). For a noun that is itself terminal, the count should be 1. I have a list of nouns, and the output should give this count for each noun in the list.

After a lot of searching here and much trial and error, this is the code I came up with (only the relevant part):

import nltk
from nltk.corpus import wordnet as wn

# Function source: https://stackoverflow.com/questions/15330725/how-to-get-all-the-hyponyms-of-a-word-synset-in-python-nltk-and-wordnet?rq=1
def get_hyponyms(synset):
    hyponyms = set()
    for hyponym in synset.hyponyms():
        hyponyms |= set(get_hyponyms(hyponym))
    return hyponyms | set(synset.hyponyms())

with open("list-nouns.txt", "r") as wordList1:  # "rU" is deprecated and was removed in Python 3.11
    myList1 = [line.rstrip('\n') for line in wordList1]
    for word1 in myList1:
        list1 = wn.synsets(word1, pos='n')
        countTerminalWord1 = 0  #counter for synsets without hyponyms
        countHyponymsWord1 = 0  #counter for synsets with hyponyms
        for syn_set1 in list1:
            syn_set11a = get_hyponyms(syn_set1)
            n = len(syn_set11a)  # number of hyponyms (reuse the set computed above)
            if n > 0:
                countHyponymsWord1 += n
            else:
                countTerminalWord1 += 1
            for syn_set11 in syn_set11a:
                syn_set111a = get_hyponyms(syn_set11)
                n = len(syn_set111a)
                if n > 0:
                    countHyponymsWord1 += n
                else:
                    countTerminalWord1 += 1
                #...further iterates in the same way for the following levels
        print(countHyponymsWord1)
        print(countTerminalWord1)

(The code also tries to count all nouns that do have hyponyms, but this is not essential.)

The main problem is that I cannot repeat this code for the full depth of the noun hierarchy, which is 19 levels. It soon fails with ‘SystemError: too many statically nested blocks’.

Any help or advice on how to solve this would be greatly appreciated.

Georgi
  • To clarify: if we look at this [image of hypernym/hyponym](https://en.wikipedia.org/wiki/Hyponymy_and_hypernymy#/media/File:Hyponymsandhypernyms.jpg) and use color as an example, would you want a count of 3 (excluding purple)? – Nathan McCoy Jan 10 '17 at 15:42
  • In this image of hypernym/hyponym, the output for 'color' should be 6 (3 terminal hyponyms directly under 'color' plus the 3 terminal hyponyms under 'purple'). For 'purple' it should be 3. For 'lavender' it should be 1. Thanks! – Georgi Jan 11 '17 at 06:49
  • There is also another problem with the code above: it counts hyponyms of synsets instead of hyponyms of words. @Nathan McCoy – Georgi Jan 13 '17 at 11:03
  • Are you sure about the "too many statically nested blocks" exception? Because I don't see deeply nested blocks in your code. I'd rather expect something related to heavy recursion or running out of memory. – lenz Jan 13 '17 at 22:13
  • @lenz Thanks for bringing this up! Actually, "too many statically nested blocks" appears when the code is 19 levels deep and disappears when it is reduced to 18 levels. As you point out, the code is very heavy. It takes about 40 minutes for the single word 'entity' at 18 levels, and the result is incorrect. – Georgi Jan 18 '17 at 09:54
  • Well, you shouldn't repeat the code for 19 levels! That's why the `get_hyponyms()` function is recursive. – lenz Jan 18 '17 at 10:04
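
Following the suggestion in the comments to rely on recursion rather than hand-written nesting, here is a minimal sketch of one way the terminal count could be computed. It is not taken from the thread. The names `terminal_hyponyms`, `count_terminal_hyponyms_for_word`, and the toy `toy_hierarchy` are hypothetical, and the toy data is only modeled on the color example described in the comments. The sketch assumes that only `Synset.hyponyms()` should be followed (as in the question's code, so instance hyponyms are ignored) and that a terminal synset reachable through several paths should be counted once, which is why sets and a cache are used.

```python
def terminal_hyponyms(node, children, cache=None):
    """Return the set of terminal descendants of `node`.

    `children` is any callable mapping a node to its direct hyponyms.
    A node with no hyponyms counts as its own single terminal, so the
    result for a terminal noun has size 1, as the question requires.
    """
    if cache is None:
        cache = {}
    if node not in cache:
        kids = children(node)
        if not kids:
            cache[node] = frozenset([node])
        else:
            # Union of the terminals below every direct hyponym. Using sets
            # avoids double counting when a node is reachable via two paths.
            cache[node] = frozenset().union(
                *(terminal_hyponyms(kid, children, cache) for kid in kids)
            )
    return cache[node]


def count_terminal_hyponyms_for_word(word):
    """Count distinct terminal noun synsets below all noun senses of `word`.

    Assumption: NLTK and its WordNet corpus are installed. The import is
    local so the generic function above can be used without NLTK.
    """
    from nltk.corpus import wordnet as wn

    cache = {}
    terminals = set()
    for synset in wn.synsets(word, pos=wn.NOUN):
        terminals |= terminal_hyponyms(synset, lambda s: s.hyponyms(), cache)
    return len(terminals)  # 0 if the word has no noun synsets


# Toy hierarchy modeled on the color example from the comments (hypothetical data).
toy_hierarchy = {
    "color": ["purple", "red", "blue", "green"],
    "purple": ["crimson", "violet", "lavender"],
}


def toy_children(name):
    return toy_hierarchy.get(name, [])


print(len(terminal_hyponyms("color", toy_children)))     # 6
print(len(terminal_hyponyms("purple", toy_children)))    # 3
print(len(terminal_hyponyms("lavender", toy_children)))  # 1
```

With WordNet available, `count_terminal_hyponyms_for_word(word)` could be called once per line of the noun list. Because the recursion only goes as deep as the hierarchy itself, no hand-written nesting is needed. If the count should be over words rather than synsets, one option would be to collect the `lemma_names()` of each terminal synset into a set and take its length instead.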

0 Answers