How to find abstractness of a word using hyper-/hyponyms in wordnet?

Question

I have two words, say computer and tool. Computer is a concrete noun, whereas tool is relatively abstract. I want a measure of abstractness for each word that reflects this. My idea is to count the number of hypernyms and hyponyms of each word.

Is it possible?
Is there a better way to do it?

score 2 · Accepted Answer · answered May 11 '20 at 16:25

The first problem is: which meaning of `computer` are you referring to?

In WordNet, a word has different "concepts", aka synsets:

>>> from nltk.corpus import wordnet as wn

>>> wn.synsets('computer')
[Synset('computer.n.01'), Synset('calculator.n.01')]

>>> wn.synsets('computer')[0].definition()
'a machine for performing calculations automatically'
>>> wn.synsets('computer')[1].definition()
'an expert at calculation (or at operating calculating machines)'

And hyper/hyponyms are not connected to the word `computer`

Hypernyms and hyponyms are themselves concepts, i.e. synsets, so they are attached not to the word form but to each synset that the word computer can represent, e.g.

>>> type(wn.synsets('computer')[0])
<class 'nltk.corpus.reader.wordnet.Synset'>

>>> wn.synsets('computer')[0].hypernyms()
[Synset('machine.n.01')]

>>> wn.synsets('computer')[0].hyponyms()
[Synset('analog_computer.n.01'), Synset('digital_computer.n.01'), Synset('home_computer.n.01'), Synset('node.n.08'), Synset('number_cruncher.n.02'), Synset('pari-mutuel_machine.n.01'), Synset('predictor.n.03'), Synset('server.n.03'), Synset('turing_machine.n.01'), Synset('web_site.n.01')]

Yes, that's a lot of information, but how do I get hyper/hyponyms for words?

According to the definition, should words have hyper/hyponyms? Or should concepts have hypo/hypernyms?

Fine, you're bringing me in circles... Just tell me how to use hyper-/hyponyms to see if a word is more abstract than another word!!!

Okay, then we have to make some assumptions.

1. Let's consider all synsets of a word in WordNet together as one "holistic" concept for that word form.
2. We take the sum of all DIRECT hyper-/hyponyms over all synsets of the given word.
3. Based on the number of hyper-/hyponyms of all synsets that a word form can represent, we deduce that word X is more/less abstract than word Y.

But how do I do (1), (2) and (3) in code?

>>> hypernym_count = lambda word: sum(len(ss.hypernyms()) for ss in wn.synsets(word)) 
>>> hyponym_count = lambda word: sum(len(ss.hyponyms()) for ss in wn.synsets(word)) 

>>> hyponym_count('computer')
14
>>> hypernym_count('computer')
2


>>> hypernym_count('tool')
8
>>> hyponym_count('tool')
32

Since (3) is the hypothesis you want to test, you are the one who should decide which heuristic turns the `hyponym_count` and `hypernym_count` results into a judgement that one word is more or less abstract than another.

Wait a minute, what are `DIRECT` hyper-/hyponyms?

We're only accessing the hyper-/hyponyms one level above/below the synset. That's what "direct" means here.

To get all the hyponyms below a synset, not just the direct ones, see https://stackoverflow.com/a/42012001/610569
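The difference between direct and all hyponyms can be sketched without WordNet data. This is a minimal illustration, not the linked answer's code: `Concept` is a hypothetical stand-in for a synset (it only mimics `hyponyms()`), and `all_hyponyms` is an illustrative helper name. The traversal should work on any object that exposes a `hyponyms()` method, which NLTK synsets do.

```python
class Concept:
    """Hypothetical stand-in for a WordNet synset, for illustration only."""

    def __init__(self, name):
        self.name = name
        self._hyponyms = []

    def add(self, child):
        # Register `child` as a direct hyponym and return it for chaining.
        self._hyponyms.append(child)
        return child

    def hyponyms(self):
        # Direct hyponyms only: one level below this concept.
        return list(self._hyponyms)


def all_hyponyms(synset):
    """Collect every concept reachable by repeatedly following hyponyms()."""
    seen = set()
    stack = list(synset.hyponyms())
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(node.hyponyms())
    return seen


# Toy three-level hierarchy: vehicle -> car -> four_wheel_drive
vehicle = Concept("vehicle")
car = vehicle.add(Concept("car"))
four_wd = car.add(Concept("four_wheel_drive"))

direct = [c.name for c in vehicle.hyponyms()]
everything = sorted(c.name for c in all_hyponyms(vehicle))
print(direct)      # ['car']
print(everything)  # ['car', 'four_wheel_drive']
```

With NLTK itself, `ss.closure(lambda s: s.hyponyms())` walks the same relation transitively and yields the synsets lazily, so `len(set(...))` over that generator gives the "all hyponyms" count for a synset `ss`.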

So should I use direct hyponyms, all hyponyms below, or all hypernyms above?

That's for you to find out and tell us =) Have fun!

Thanks for the great answer! Can you please elaborate some more on what "direct" hyper/hyponyms means, and what indirect ones are? — Cranjis, May 12 '20 at 09:19
Think of these 3 levels of concepts: `Vehicle -> Car -> 4-wheel drive`. The direct hyponym of vehicle is just car, but all the hyponyms of vehicle include both car and 4-wheel drive, see https://stackoverflow.com/a/42012001/610569 — alvas, May 12 '20 at 12:22
So I would think that "level of abstractness" means "how high a certain object is in the tree", i.e. its distance from the leaves (or from the root), assuming that the closer an object is to the root, the more abstract it is. What do you think? — Cranjis, May 12 '20 at 12:31
Hypothesize, compute, validate/disprove the hypothesis. Repeat until one of the hypotheses works =) Have fun with it! You might find something interesting. — alvas, May 12 '20 at 12:33
OK, will do :) Do you know how I can find the distance to the root, the distance to the deepest leaf, and the distance to the closest leaf? — Cranjis, May 12 '20 at 12:52
If you use the beta standalone WN interface, there's actually a `_hyperpaths` attribute for every synset https://github.com/nltk/wordnet/blob/master/wn/synset.py#L149 — alvas, May 13 '20 at 02:04
It seems that `hypernym_paths()` gives the distance from the root, but not from the leaves. Is it possible to get the distance from the leaves? — Cranjis, May 13 '20 at 07:54
Take a look at https://stackoverflow.com/questions/15330725/how-to-get-all-the-hyponyms-of-a-word-synset-in-python-nltk-and-wordnet You can do it, I believe in you =) — alvas, May 13 '20 at 08:47
Sorry, I didn't understand the `closure` function (I also looked here, but it really doesn't help: https://www.nltk.org/howto/wordnet.html) — Cranjis, May 14 '20 at 12:43
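For the depth idea raised in this thread (distance to the root, to the deepest leaf, and to the closest leaf), here is a minimal sketch under stated assumptions. `Concept` is a hypothetical stand-in for a synset, and `depth_from_root` and `height_above_leaves` are illustrative names, not NLTK functions. Both helpers only rely on `hypernyms()` and `hyponyms()`, so they should also accept NLTK synsets, although they ignore instance hypernyms/hyponyms.

```python
class Concept:
    """Hypothetical stand-in for a WordNet synset, for illustration only."""

    def __init__(self, name, parent=None):
        self.name = name
        self._parents = [parent] if parent is not None else []
        self._children = []
        if parent is not None:
            parent._children.append(self)

    def hypernyms(self):
        return list(self._parents)

    def hyponyms(self):
        return list(self._children)


def depth_from_root(synset):
    """Length of the shortest hypernym chain from `synset` up to a root."""
    parents = synset.hypernyms()
    if not parents:
        return 0
    return 1 + min(depth_from_root(p) for p in parents)


def height_above_leaves(synset, pick=max):
    """Distance down to a leaf: pick=max for the deepest, pick=min for the closest."""
    children = synset.hyponyms()
    if not children:
        return 0
    return 1 + pick(height_above_leaves(c, pick) for c in children)


# Toy hierarchy: entity -> vehicle -> {car -> four_wheel_drive, bicycle}
entity = Concept("entity")
vehicle = Concept("vehicle", entity)
car = Concept("car", vehicle)
four_wd = Concept("four_wheel_drive", car)
bicycle = Concept("bicycle", vehicle)

print(depth_from_root(vehicle))           # 1
print(height_above_leaves(vehicle, max))  # 2 (via car -> four_wheel_drive)
print(height_above_leaves(vehicle, min))  # 1 (bicycle is a leaf)
```

On real synsets, NLTK already provides `min_depth()` and `max_depth()` for the distance to the root, and `hypernym_paths()` for the full chains. The leaf-side recursion above revisits shared subtrees, so on synsets near the top of WordNet it may be slow without caching.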
