0

I'm trying to return all data when one column has more than 4 unique values.

I've altered the names of the columns and table here, but basically I want all records but only for each "ad_id" with more than 4 unique values in "prof_name"

SELECT ad_id, prof_name  
FROM PROF  
GROUP BY ad_id, prof_name  
HAVING COUNT(prof_name) > 4  

I'm expecting this to return all data when there are more than 4 unique values for prof_name column? What am I doing wrong?

Igor
  • 60,821
  • 10
  • 100
  • 175
  • Does this answer your question? [SQL to find the number of distinct values in a column](https://stackoverflow.com/questions/141562/sql-to-find-the-number-of-distinct-values-in-a-column) – Igor Aug 02 '23 at 13:59
  • ^ You can then also apply the HAVING clause to that. – Igor Aug 02 '23 at 14:00
  • Your query is counting a single value of `ad_id` has 5, or more, rows with the *same* value of `prof_name`, not 5 *different* values. You don't want to `GROUP BY` the column `prof_name` and you should be `COUNT`ing the `DISTINCT` values. – Thom A Aug 02 '23 at 14:14

1 Answers1

1

It's really helpful when you provide demo DDL/DML. It makes it much easier for people to answer your question.

Here's an example:

DECLARE @Prof TABLE (ad_id INT, prof_name VARCHAR(20));
INSERT INTO @Prof (ad_id, prof_name) VALUES
(1, 'Joe'),(2, 'Joe'),(3, 'Joe'),(4, 'Joe'),(5, 'Joe'),
(1, 'Bill'),(2, 'Bill'),(3, 'Bill'),(4, 'Bill'),
(1, 'John'),(2, 'John'),(3, 'John'),(4, 'John'),
(1, 'Mick'),(2, 'Mick'),(3, 'Mick'),
(1, 'Ringo'),(2, 'Mick'),(3, 'Joe');

Using this, we can demonstrate the difference between COUNT() and COUNT(DISTINCT ):

SELECT ad_id, COUNT(prof_name) AS Cnt, COUNT(DISTINCT prof_name) AS dCnt
  FROM @Prof
 GROUP BY ad_id
ad_id Cnt dCnt
1 5 5
2 5 4
3 5 4
4 3 3
5 1 1

Now, if we want just the rows from the table which have a distinct count of 4 (inferred from your description), we can use a CTE to get those, and join them to the table:

;WITH IDs AS (
SELECT ad_id, COUNT(prof_name) AS Cnt, COUNT(DISTINCT prof_name) AS dCnt
  FROM @Prof
 GROUP BY ad_id
 HAVING COUNT(DISTINCT prof_name) = 4
)

SELECT p.ad_id, p.prof_name
  FROM @Prof p
    INNER JOIN IDs i
      ON p.ad_id = i.ad_id
ad_id prof_name
2 Mick
2 Mick
2 John
2 Bill
2 Joe
3 Joe
3 Bill
3 John
3 Mick
3 Joe
Patrick Hurst
  • 2,086
  • 1
  • 3
  • 13