1

I'm trying to get max count of a field. This is what I get and what I'm tried to do.

| col1 | col2 |
|  A   |  B   |
|  A   |  B   |
|  A   |  D   |
|  A   |  D   |
|  A   |  D   |
|  C   |  F   |
|  C   |  G   |
|  C   |  F   |

I'm trying to get the max count occurrences of col2, grouped by col1.

With this query I get the occurrences grouped by col1 and col2.

SELECT col1, col2, count(*) as conta 
FROM tab 
WHERE 
GROUP by col1, col2 
ORDER BY col1, col2

And I get:

| col1 | col2 | conta |
|  A   |  B   |   2   |
|  A   |  D   |   3   |
|  C   |  F   |   2   |
|  C   |  G   |   1   |

Then I used this query to get max of count:

SELECT max(conta) as conta2, col1 
FROM (
    SELECT col1, col2, count(*) as conta 
    FROM tab 
    WHERE 
    GROUP BY col1, col2 
    ORDER BY col1, col2
) AS derivedTable 
GROUP BY col1

And I get:

| col1 | conta |
|  A   |   3   |
|  C   |   2   |

What I'm missing is the value of col2. I would like something like this:

| col1 | col2 | conta |
|  A   |  D   |   3   |
|  C   |  F   |   2   |

The problem is that if I try to select the col2 field, I get an error message, that I have to use this field in group by or aggregation function, but using it in the group by it's not the right way.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Marco
  • 31
  • 6
  • Does PostGreSQL has specificities for why your second query does not work ? I tried the exact same one (with the missing col2, and without the `WHERE` in the middle and I don't get any error message on MySQL. – Frankich Dec 16 '21 at 09:49
  • 1
    With MySQL is it possible to get field not in a group by, even if there is an aggreation of another field in the select. In PostgreSQL this is not possible, and if you have an aggregation and other fields in the select, this fields must be in an aggregation themselves or in a group by. – Marco Dec 16 '21 at 09:54

5 Answers5

2

You can combine GROUP BY with a window function - which gets evaluated after the group by:

with cte as (
  SELECT col1, col2, 
         count(*) as conta,
         dense_rank() over (partition by col1 order by count(*) desc) as rnk
  FROM tab 
  WHERE ...
  GROUP by col1, col2 
) 
select col1, col2, conta
from cte
where rnk = 1
order by col1, col2;

This will return the combination of col1,col2 with the same highest max count twice. If you don't want that, use row_number() instead of dense_rank()

Online example

2

Simpler & faster (and correct):

SELECT DISTINCT ON (col1)
       col1, col2, count(*) AS conta
FROM   tab 
GROUP  BY col1, col2 
ORDER  BY col1, conta DESC;

db<>fiddle here (based on a_horse's fiddle)

DISTINCT ON is applied after aggregation, so we don't need a subquery or CTE. Consider the sequence of events in a SELECT query:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

Possibly not the most elegant solution, but using a common table expression may help.

with cte as (
select col1, col2, count(*) as total
from dtable 
group by col1, col2
)
select  col1, col2, total 
from cte c
where total = (select max(total) 
           from cte cc
           where cc.col1 = c.col1)
order by col1 asc 

Returns

col1|col2|total|
----+----+-----+
 A  | D  |    3|
 C  | F  |    2|

from the docs

VynlJunkie
  • 1,953
  • 22
  • 26
  • Thank you! It works! Just 1 thing more that I'm trying to fix. Now I have a situation where there is equal total for col1 and col2, and so are selected both row. How can I keep only the first or the last one? – Marco Dec 16 '21 at 09:48
0

I misunderstood the question. Here is your solution:

;with tablex as
    (Select col1, col2, Count(col2) as Count From Your_Table Group by col1, col2),
aaaa as
    (Select ROW_NUMBER() over (partition by col1 order by Count desc) as row, * From tablex)

Select * From aaaa Where row = 1
  • That is the first query I tried but I get only the occurrences. I need the max of the occurrences and also the field col1 and col2. – Marco Dec 16 '21 at 09:23
0

Using a window function:

select distinct on (col1) col1, col2, cnt
from 
(
 select col1, col2, count(*) over (partition by col1, col2) cnt 
 from the_table
) t
order by col1, cnt desc;
col1 col2 cnt
A D 3
C F 2

This solution does not solve cases with ties.

Stefanov.sm
  • 11,215
  • 2
  • 21
  • 21
  • But that doesn't return `C,F,2` –  Dec 16 '21 at 10:09
  • @a_horse_with_no_name No, just "to get max count of a field". I was not sure what the OP needs as he also mentions "how can I keep only the first or the last one" which is `C, G, 1`. – Stefanov.sm Dec 16 '21 at 10:13
  • The question contains "*would like something like this*" which includes `C,F,2` - so for each combination of col1,col2 the one with the highest count –  Dec 16 '21 at 10:24
  • @a_horse_with_no_name Thanks, yes, I overlooked this. Answer corrected. – Stefanov.sm Dec 16 '21 at 10:28