This question is old, but was referenced in a new question on dba.SE. I feel the best solutions haven't been provided. Plus, there are new, faster options.
Question in the title
Can I do a max(count(*))
in SQL?
Yes, you can achieve that by nesting an aggregate function in a window function:
SELECT m.yr
, count(*) AS movie_count
, max(count(*)) OVER () AS max_ct
FROM casting c
JOIN movie m ON c.movieid = m.id
WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP BY m.yr
ORDER BY count(*) DESC;
db<>fiddle here
That's standard SQL. Postgres introduced it with version 8.4 (released 2009-07-01, before this question was asked. Other RDBMS should be capable of the same.
Consider the sequence of events in a SELECT
query:
Possible downside: window functions do not aggregate rows. You get all rows left after the aggregate step. Useful in some queries, but not ideal for this one.
To get one row with the highest count, you can use ORDER BY ct DESC FETCH FIRST 1 ROW ONLY
:
SELECT c.yr, count(*) AS ct
FROM actor a
JOIN casting c ON c.actorid = a.id
WHERE a.name = 'John Travolta'
GROUP BY c.yr
ORDER BY ct DESC
FETCH FIRST 1 ROW ONLY;
Using only basic SQL features, available in any halfway decent RDBMS. Most popular RDBMS (also) support alternative syntax for FETCH FIRST
with LIMIT
, TOP
or ROWNUM
. See:
Or you can get one row per group with the highest count with DISTINCT ON
(only Postgres):
Actual Question
I need to get the rows for which count(*)
is max.
There may be more than one row with the highest count.
SQL Server has had the feature WITH TIES
for some time - with non-standard syntax:
SELECT TOP 1 WITH TIES
m.yr, count(*) AS movie_count
FROM casting c
JOIN movie m ON c.movieid = m.id
WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP BY m.yr
ORDER BY count(*) DESC; -- can't sort by year for this
db<>fiddle here
PostgreSQL 13 added WITH TIES
with standard SQL syntax:
SELECT m.yr, count(*) AS movie_count
FROM casting c
JOIN movie m ON c.movieid = m.id
WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP BY m.yr
ORDER BY count(*) DESC -- can't sort by year for this
FETCH FIRST 1 ROWS WITH TIES;
db<>fiddle here
This should be the fastest possible query. Further reading:
To sort results by additional criteria (or for older versions of Postgres or other RDBMS without WITH TIES
), use the window function rank()
in a subquery:
SELECT yr, movie_count
FROM (
SELECT m.yr, count(*) AS movie_count
, rank() OVER (ORDER BY count(*) DESC) AS rnk
FROM casting c
JOIN movie m ON c.movieid = m.id
WHERE c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP BY m.yr
) sub
WHERE rnk = 1
ORDER BY yr; -- optionally sort by year
All major RDBMS support window functions nowadays.