2

I have a use case where I have a word and I need to know the following things:

  1. Synonyms for the word (just the synonyms are sufficient)
  2. All senses of the word, where each sense contains - the synonyms matching that word in that sense, example sentences in that sense (if there), the part of speech for that sense.

Example - this query link. Screenshot for the word carry:

enter image description here

For each 'sense', we have the part of speech (like V), synonyms matching that sense, (like transport in the first sense, pack, take in the second sense, etc), example sentences containing that word in that sense (This train is carrying nuclear waste, carry the suitcase to the car, etc in first sense, I always carry money etc in the second sense, etc.).

How do I do this from a Wordnet MySQL database? I ran this query, it returns the list of meanings for the word:

SELECT a.lemma, c.definition FROM words a INNER JOIN senses b ON a.wordid = b.wordid INNER JOIN synsets c ON b.synsetid = c.synsetid WHERE a.lemma = 'carry';

How do I get the synonyms, example sentences, part of speech and synonyms specific to that sense for each sense? I queried the vframesentences and vframesentencemaps tables, saw example sentences with placeholders like %s, and based on the wordid column I tried to match them with the words table, but got awfully wrong results.

Edit:

For the word carry, if I run these queries, I get synonyms and sense meanings correctly:

1. select * from words where lemma='carry' //yield wordid as 21354
2. select * from senses where wordid=21354 //yield 41 sysnsetids, like 201062889
3. select * from synsets where synsetid=201062889 //yields the explanation "serve as a means for expressing something"
4. select * from senses where synsetid=20106288` /yields all matching synonyms for that sense as wordids, including "carry" - like 21354, 29630, 45011
5. select * from words where wordid=29630 //yields 'convey'

So all I now need is a way of finding the example sentence for the word carry in each of the 41 senses. How do I do it?

Steve Chambers
  • 37,270
  • 24
  • 156
  • 208
SexyBeast
  • 7,913
  • 28
  • 108
  • 196
  • What does "c.definition" give you? I'm not familiar with WordNet database. I can help you more if you give me online documentation about how wordnet database tables look like. "c.definition" gives you a list of bullet points like the following? `S: (v) transport, carry (move while supporting, either in a vehicle or in one's hands or on one's body) "You must carry your camping gear"; "carry the suitcases ..."` `S: (v) carry, pack, take (have with oneself; have on one's person) "She always takes an umbrella"; "I always carry money"; "She packs ..." ...` – Tin Feb 26 '16 at 20:53
  • I think `definition` provides just the definition.. – SexyBeast Feb 27 '16 at 03:24

1 Answers1

4

You can get the sentences from the samples table. E.g:

SELECT sample FROM samples WHERE synsetid = 201062889;

yields:

The painting of Mary carries motherly love

His voice carried a lot of anger

So you could extend your query as follows:

SELECT 
    a.lemma AS `word`,
    c.definition,
    c.pos AS `part of speech`,
    d.sample AS `example sentence`,
    (SELECT 
            GROUP_CONCAT(a1.lemma)
        FROM
            words a1
                INNER JOIN
            senses b1 ON a1.wordid = b1.wordid
        WHERE
            b1.synsetid = b.synsetid
                AND a1.lemma <> a.lemma
        GROUP BY b.synsetid) AS `synonyms`
FROM
    words a
        INNER JOIN
    senses b ON a.wordid = b.wordid
        INNER JOIN
    synsets c ON b.synsetid = c.synsetid
        INNER JOIN
    samples d ON b.synsetid = d.synsetid
WHERE
    a.lemma = 'carry'
ORDER BY a.lemma , c.definition , d.sample;

Note: The subselect with a GROUP_CONCAT returns the synonyms of each sense as a comma-separated list in a single row in order to cut down on the number of rows. You could consider returning these in a separate query (or as part of this query but with everything else duplicated) if preferred.

UPDATE If you really need synonyms as rows in the results, the following will do it but I wouldn't recommend it: Synonyms and example sentences both pertain to a particular definition so the set of synonyms will be duplicated for each example sentence. E.g. if there are 4 example sentences and 5 synonyms for a particular definition, the results would have 4 x 5 = 20 rows just for that definition.

SELECT 
    a.lemma AS `word`,
    c.definition,
    c.pos AS `part of speech`,
    d.sample AS `example sentence`,
    subq.lemma AS `synonym`
FROM
    words a
        INNER JOIN
    senses b ON a.wordid = b.wordid
        INNER JOIN
    synsets c ON b.synsetid = c.synsetid
        INNER JOIN
    samples d ON b.synsetid = d.synsetid
        LEFT JOIN
    (SELECT 
        a1.lemma, b1.synsetid
    FROM
        senses b1
    INNER JOIN words a1 ON a1.wordid = b1.wordid) subq ON subq.synsetid = b.synsetid
        AND subq.lemma <> a.lemma
WHERE
    a.lemma = 'carry'
ORDER BY a.lemma , c.definition , d.sample;
Community
  • 1
  • 1
Steve Chambers
  • 37,270
  • 24
  • 156
  • 208
  • Exactly what I needed! Thanks. You are right, I need the synonyms separately, let the results be duplicated. Of course I can get them in a separate query, or make it ugly and extract/comma parse the result to get individual synonyms, but better if they come from the query itself. Or is it guaranteed that no synonym will not contain commas? – SexyBeast Feb 27 '16 at 03:36
  • By the way, the synonyms that return are all lower-cased. But in actual website if you search for say `Jesus`, it returns results like `Christ`, `Deliverer` etc all capitalized. How do I get properly cased synonyms? – SexyBeast Feb 27 '16 at 07:48
  • Synonyms are just words and none contain commas: `SELECT lemma FROM words WHERE lemma LIKE '%,%';` returns no results. Have updated the answer to include a way of getting synonyms for each sense in individual rows. Re: Capitalization - unfortunately this seems to be how the words are stored in the database (see mysql-wn-data.sql) so there's not much that can be done other than applying capitalization to all synonyms. Only thing to suggest is perhaps try a different [database flavour](https://sourceforge.net/projects/wnsql/files/wnsql3/) (sqlite / standard) and see if it has the same issue. – Steve Chambers Feb 27 '16 at 11:10
  • By the way, it is not related to this question, but still want to know, how do I find different forms of the same word? For example, the word `show` is absent in the table `words`. However its other form `show` is there. How do I query something like if one form is not there, return the form present? – SexyBeast Mar 02 '16 at 05:19
  • And even common words like `and`, `is` are missing. Can Wordnet return their meanings in any way? – SexyBeast Mar 02 '16 at 05:46
  • how to get only few rows from the result ? – VINNUSAURUS Jan 09 '19 at 08:25
  • 1
    Do you mean you want to limit the number of rows returned? If so, use a [`LIMIT`](https://dev.mysql.com/doc/refman/5.5/en/select.html#idm139630354050064) clause. – Steve Chambers Jan 09 '19 at 08:37