0

I have a table orders:

|--------------------|-----------------|------------|
|      user_id       |    article_id   |  Name      |
|--------------------|-----------------|------------|
|         1          |         39      |  John      |
|         1          |         39      |  John      |
|         2          |         39      |  Mike      |
|         2          |         19      |  Mike      |
|         3          |         39      |  Dave      |
|--------------------|-----------------|------------|

How can I select only those users who have article_id = 39, regardless of how many times, but if it occurs only once. I would need to select user John and user Dave, but not Mike. Mike has article_id 39, but he also has article_id = 19. The query should return users John and Dave.

I have tried this query:

SELECT * FROM orders AS o where o.article_id = 39
GROUP BY o.user_id HAVING COUNT(o.article_id) > 0

But it return all three users. I don't need user Mike.

Strawberry
  • 33,750
  • 13
  • 40
  • 57
Amiga500
  • 5,874
  • 10
  • 64
  • 117

3 Answers3

6

Use conditional aggregation in the HAVING clause:

SELECT user_id, Name
FROM orders
GROUP BY user_id, Name
HAVING SUM(CASE WHEN article_id <> 39 THEN 1 ELSE 0 END) = 0;

Note: Your table is not normalized, and the user information belongs in a separate user table. I was able to get around this by including the name in the GROUP BY clause, but rightfully we should just be aggregating by the user_id.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
3

Tim's answer above is the most efficient way to get what you're looking for. Another solution to this is to use a nested query.

First, you select all users who don't have article_id 39:

SELECT user_id FROM orders WHERE article_id <> 39

Then select users who have an article_id of 39 that are not in this list. Overall, the query would be:

SELECT user_id FROM orders WHERE article_id = 39 
AND user_id NOT IN (SELECT user_id from orders where article_id <> 39)

Again, the answers already posted by Tim and John are more efficient and better practice. I just find queries written like this easier to read/understand when learning SQL.

rochy_01
  • 161
  • 1
  • 8
  • Well written +1 ... and there may be situations where this approach is preferable over an aggregation query. – Tim Biegeleisen Feb 06 '18 at 08:24
  • I'd like to see some evidence of the relative efficiency of these solutions. Do you have any? – Strawberry Feb 06 '18 at 08:28
  • Perhaps, it was too general a statement to make, that this solution wouldn't be as efficient. Regarding evidence, I don't have any. It was just my understanding that nested queries should be avoided if possible. Maybe I am wrong in this. There are [other posts](https://stackoverflow.com/questions/17477340/performance-of-nested-select) that discuss the performance of nested queries in more detail. – rochy_01 Feb 06 '18 at 08:39
1

Another way, is to use MIN() and MAX()

SELECT  user_id
FROM    orders
GROUP   BY Name
HAVING  MIN(article_id) = MAX(article_id)
        AND MIN(article_id) = 39
John Woo
  • 258,903
  • 69
  • 498
  • 492