1

I read several questions about this issue on stackoverflow : It seems that COUNT should be used with the right Joining to display sums of every items, including the ones summing zero.

I'm not really able to make it with my case, after several hours of headache...

Well, I have 2 tables. The first one is called "words2", with a list of words. The second one is called "links2". It's linking together two words : idWord1 and idWord2. (There are no links linking together two identical words)

For each word, I would like to know how many links are used, even if there is no link.

This is my query :

SELECT *, COUNT(*) AS qty  
FROM ( 
    SELECT *
    FROM words2
    LEFT OUTER JOIN links2 AS linksA ON words2.idWord = linksA.idWord1 

    UNION

    SELECT *
    FROM words2
    LEFT OUTER JOIN links2 AS linksB ON words2.idWord = linksB.idWord2
) AS tmp
WHERE idUser = 3 AND linkType = 'individual'
GROUP BY word ORDER BY word

It works fine unless I don't have any results for the unused words, which are not displayed.

Thank you very much for your help!

ben
  • 1,946
  • 2
  • 18
  • 26
Yako
  • 3,405
  • 9
  • 41
  • 71

2 Answers2

3

To do this with your original query change the count call to COUNT(idWord1). This will cause it to count the number of times idWord1 is NOT NULL. Right now it is counting the number of rows period so you get a 1 where you should get a zero.

Here's my sample dataset:

words2
-------
idWord
-------
foo
bar
baz
biz
buzz

links2
-------
idWord1 | idWord2
-------
foo     | bar
foo     | baz
bar     | baz
buzz    | foo
buzz    | bar

(This dataset disregards the idUser and linkType fields because your original question doesn't describe how they are used and they don't appear to be relevant to the answer.)

When I run your query on my dataset I get this:

idWord | idWord1 | idWord2 | linkCount
--------------------------------------
bar    | bar     | baz     | 3
baz    | NULL    | NULL    | 2
biz    | NULL    | NULL    | 1
buzz   | buzz    | foo     | 2
foo    | foo     | bar     | 3

Also note that COUNT(*) will be more expensive depending upon the storage engine you're using. See this other question for details.

When I change the count to COUNT(idWord1) i get this:

idWord | idWord1 | idWord2 | linkCount
--------------------------------------
bar    | bar     | baz     | 3
baz    | NULL    | NULL    | 2
biz    | NULL    | NULL    | 0
buzz   | buzz    | foo     | 2
foo    | foo     | bar     | 3

Here's an even simpler query that uses no subquery and joins words2 to links2 using an OR statement:

SELECT
  words2.idWord
  -- this will count the number of links to each word
  -- if there are no links the COUNT() call will return 0
  , COUNT(idWord1) AS linkCount
FROM words2
  LEFT JOIN links2
    ON words2.idWord = links2.idWord1
      OR words2.idWord = links2.idWord2
GROUP BY words2.idWord
ORDER by words2.idWord

When run on the sample dataset I get the following results:

idWord | linkCount
-------------------
bar    | 3
baz    | 2
biz    | 0
buzz   | 2
foo    | 3
Community
  • 1
  • 1
benrifkah
  • 1,526
  • 14
  • 31
  • Thank you for your answer, which is quite pedagogical. However, I don't understand why I don't have the same results as you... – Yako Jul 02 '11 at 09:20
  • OK. I got it. Your answer is correct. This is because I added a WHERE condition, and I did not explained this in the question, as I thought it was not relevant. Each link also has an idUser and a linkType. If I use a WHERE condition, then I lose some of the words... Do you have an idea of how this could be fixed ? – Yako Jul 02 '11 at 09:28
  • Maybe something like this for the WHERE, but linkCount doesn't seem to be understood in the query `code`WHERE (idUser = 3 AND linkType = 'individual') OR linkCount = 0 – Yako Jul 03 '11 at 09:29
0
SELECT
    w.idWord
  , ( SELECT COUNT(*) 
      FROM links2 AS l
      WHERE l.idWord1 = w.idWord
    ) +
    ( SELECT COUNT(*) 
      FROM links2 AS l
      WHERE l.idWord2 = w.idWord
    ) AS linkCount
FROM words2 AS w

or

SELECT
    w.idWord
  , ( SELECT COUNT(*) 
      FROM links2 AS l
      WHERE l.idWord1 = w.idWord
         OR l.idWord2 = w.idWord
    ) AS linkCount
FROM words2 AS w
ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235