I have two words, say computer and tool. Computer is a concrete noun, whereas tool is relatively abstract. I want a measure of each word's level of abstractness that reflects this. I thought the best way to get it would be to count the number of hypernyms/hyponyms of each word.

  1. Is it possible?
  2. Is there a better way to do it?
Sunderam Dubey
Cranjis

1 Answer

The first problem is: which meaning of computer are you referring to?

In WordNet, a word can map to several "concepts", a.k.a. synsets:

>>> from nltk.corpus import wordnet as wn

>>> wn.synsets('computer')
[Synset('computer.n.01'), Synset('calculator.n.01')]

>>> wn.synsets('computer')[0].definition()
'a machine for performing calculations automatically'
>>> wn.synsets('computer')[1].definition()
'an expert at calculation (or at operating calculating machines)'

And hyper/hyponyms are not connected to the word computer

Hyper-/hyponyms are themselves concepts, i.e. synsets, so they are connected not to the word form but to the synsets that the word computer can represent, i.e.

>>> type(wn.synsets('computer')[0])
<class 'nltk.corpus.reader.wordnet.Synset'>

>>> wn.synsets('computer')[0].hypernyms()
[Synset('machine.n.01')]

>>> wn.synsets('computer')[0].hyponyms()
[Synset('analog_computer.n.01'), Synset('digital_computer.n.01'), Synset('home_computer.n.01'), Synset('node.n.08'), Synset('number_cruncher.n.02'), Synset('pari-mutuel_machine.n.01'), Synset('predictor.n.03'), Synset('server.n.03'), Synset('turing_machine.n.01'), Synset('web_site.n.01')]

Yes that's a lot of information but how do I get hyper/hyponyms for words?

According to the definition, should words have hyper-/hyponyms? Or should concepts have hyper-/hyponyms?

Fine, you're bringing me in circles... Just tell me how to use hyper-/hyponyms to see if a word is more abstract than another word!!!

Okay, then we have to make some assumptions.

  1. Let's consider all synsets of a word accessed through WordNet as a "holistic" concept of that word form

  2. We consider the sum of all DIRECT hyper-/hyponyms of all synsets of a given word

  3. Based on the number of hyper-/hyponyms of all synsets that can be represented by a certain word form, we deduce that word X is more/less abstract than word Y

But how do I do (1), (2) and (3) in code?

>>> hypernym_count = lambda word: sum(len(ss.hypernyms()) for ss in wn.synsets(word)) 
>>> hyponym_count = lambda word: sum(len(ss.hyponyms()) for ss in wn.synsets(word)) 

>>> hyponym_count('computer')
14
>>> hypernym_count('computer')
2


>>> hypernym_count('tool')
8
>>> hyponym_count('tool')
32

Since (3) is the hypothesis you want to test, you should be the one to decide which heuristic to use to deduce whether a word is more or less abstract, based on the hyponym_count and hypernym_count results.

Wait a minute, what are DIRECT hyper-/hyponyms?

We're only accessing the hyper-/hyponyms one level above/below the synset. That's what "direct" means here.

For how to get all the hyponyms below a synset, see https://stackoverflow.com/a/42012001/610569
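As a rough sketch of the difference between direct and all hyponyms, the snippet below walks the hyponym relation transitively. `ToyConcept` is a hypothetical stand-in for an NLTK synset (it only mimics the `.hyponyms()` method) so that the example runs without the WordNet data, and the vehicle/car hierarchy is made up for illustration. With real synsets, the same result should be available through `ss.closure(lambda s: s.hyponyms())`.

```python
class ToyConcept:
    """Hypothetical stand-in for an NLTK Synset; only mimics .hyponyms()."""

    def __init__(self, name):
        self.name = name
        self._hyponyms = []

    def add_hyponym(self, child):
        # Register `child` one level below this concept and return it.
        self._hyponyms.append(child)
        return child

    def hyponyms(self):
        # Direct hyponyms only, like Synset.hyponyms() in NLTK.
        return list(self._hyponyms)


def all_hyponyms(concept):
    """Collect every concept reachable below `concept` (excluding itself)."""
    seen = []
    stack = list(concept.hyponyms())
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.append(node)
            stack.extend(node.hyponyms())
    return seen


# Made-up three-level hierarchy: vehicle -> car -> four_wheel_drive
vehicle = ToyConcept("vehicle")
car = vehicle.add_hyponym(ToyConcept("car"))
four_wheel_drive = car.add_hyponym(ToyConcept("four_wheel_drive"))

direct = [c.name for c in vehicle.hyponyms()]
everything = sorted(c.name for c in all_hyponyms(vehicle))
print(direct)      # direct hyponyms only
print(everything)  # direct and indirect hyponyms
```

Whether the direct count or the transitive count is the better signal of abstractness is still the open question from assumption (3).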

So should I use direct hyponyms, all hyponyms below, or all hypernyms above?

That's for you to find out and tell us =) Have fun!

alvas
  • Thanks for the great answer! Can you please elaborate on what "direct" hyper/hyponyms means, and what indirect ones are? – Cranjis May 12 '20 at 09:19
  • Think of these 3 levels of concepts: `Vehicle -> Car -> 4-wheel drive`. The direct hyponym of vehicle is just car, but all the hyponyms of vehicle include both car and 4-wheel drive, see https://stackoverflow.com/a/42012001/610569 – alvas May 12 '20 at 12:22
  • So I would have thought that "level of abstractness" means "how high a certain object is in the tree", i.e. its distance from the leaves (or from the root), assuming that the closer an object is to the root, the more abstract it is. What do you think? – Cranjis May 12 '20 at 12:31
  • Hypothesize, compute, validate/disprove the hypothesis. Repeat until one of the hypotheses works =) Have fun with it! You might find something interesting. – alvas May 12 '20 at 12:33
  • OK will do :) Do you know how I can find the distance to the root, and the distance to the deepest leaf and the closest leaf? – Cranjis May 12 '20 at 12:52
  • If you use the beta standalone WN interface, there's actually a `_hyperpaths` attribute for every synset https://github.com/nltk/wordnet/blob/master/wn/synset.py#L149 – alvas May 13 '20 at 02:04
  • It seems that hypernym_paths() gives the distance from the root, but not from the leaves. Is it possible to get the distance from the leaves? – Cranjis May 13 '20 at 07:54
  • Take a look at https://stackoverflow.com/questions/15330725/how-to-get-all-the-hyponyms-of-a-word-synset-in-python-nltk-and-wordnet You can do it, I believe in you =) – alvas May 13 '20 at 08:47
  • Sorry, I didn't understand the closure function (I also looked here, but it really doesn't help: https://www.nltk.org/howto/wordnet.html) – Cranjis May 14 '20 at 12:43
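
One way to read the depth idea discussed in these comments, as a sketch only: measure how far a concept sits from the root and from the leaves below it. `ToyConcept` is a hypothetical stand-in for a synset that exposes only `.hypernyms()` and `.hyponyms()`, and the small hierarchy is invented, not taken from WordNet. On real NLTK synsets, `min_depth()` and `max_depth()` should give the distance to the root; the leaf distances would still need a walk over `.hyponyms()` like the one below.

```python
class ToyConcept:
    """Hypothetical stand-in for an NLTK Synset (hypernyms/hyponyms only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self._hypernyms = [parent] if parent else []
        self._hyponyms = []
        if parent:
            parent._hyponyms.append(self)

    def hypernyms(self):
        return list(self._hypernyms)

    def hyponyms(self):
        return list(self._hyponyms)


def depth_from_root(concept):
    """Length of the shortest hypernym chain from `concept` up to a root."""
    parents = concept.hypernyms()
    if not parents:
        return 0
    return 1 + min(depth_from_root(p) for p in parents)


def leaf_distances(concept):
    """Return (distance to the closest leaf, distance to the deepest leaf)."""
    children = concept.hyponyms()
    if not children:
        return (0, 0)
    below = [leaf_distances(c) for c in children]
    closest = 1 + min(lo for lo, _ in below)
    deepest = 1 + max(hi for _, hi in below)
    return (closest, deepest)


# Invented hierarchy: vehicle -> {bicycle, car}, car -> four_wheel_drive
vehicle = ToyConcept("vehicle")
bicycle = ToyConcept("bicycle", parent=vehicle)
car = ToyConcept("car", parent=vehicle)
four_wheel_drive = ToyConcept("four_wheel_drive", parent=car)

for concept in (vehicle, car, four_wheel_drive):
    print(concept.name, depth_from_root(concept), leaf_distances(concept))
```

Under the hypothesis in the comments, a small depth from the root combined with a large distance to the deepest leaf would suggest a more abstract concept; that remains a hypothesis to validate, not an established measure.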