3

I have the following table -

enter image description here

My goal is to return the Company/ID row with the highest "count" respective to a partition done by the ID.

So the expected output should look like this :

My code

My current code returns a count partitioned on all ids. I just want it to return the one with the highest count.

Current code -

select distinct Company, Id, count(*) over (partition by ID)
from table1
where company in ("Facebook","Apple")

My output:

enter image description here

Void S
  • 752
  • 4
  • 14
  • You can use what you wrote as a subquery, then partition that by Company to get the max. – Barmar Feb 09 '22 at 04:39
  • Hi, can you please post what this would look like? @Barmar – Void S Feb 09 '22 at 04:40
  • See https://stackoverflow.com/questions/1313120/retrieving-the-last-record-in-each-group-mysql?rq=1 – Barmar Feb 09 '22 at 04:42
  • So I would need to use 2 partitions? The link provided does not have 2 partitions? Not sure how that would look like.. @Barmar – Void S Feb 09 '22 at 04:51
  • If we add one row with Facebook 555, then the ID 555 is the most occurring ID and you only want to output its rows (and not have Apple 123 in your results any longer). Is this correct? – Thorsten Kettner Feb 09 '22 at 05:02
  • Hi, no it should be with each company/id in mind @ThorstenKettner. In other words, it would apply to per company/id combination, and then the id that occurs greater times is what gets returned – Void S Feb 09 '22 at 05:05
  • On a side note: Double quotes are meant for names / aliases in SQL. Use single quotes for string iterals: `where company in ('Facebook','Apple')`. – Thorsten Kettner Feb 09 '22 at 05:30
  • And next time [please do not upload images of code/errors when asking a question.](//meta.stackoverflow.com/q/285551) – Thorsten Kettner Feb 09 '22 at 05:42

3 Answers3

2

Your base query is incorrect. You partition by ID regardless of the company, but in your request comments you clarify that you want to count per ID and company. This would have to be

select distinct company, id, count(*) over (partition by company, id)
from table1
where company in ('Facebook','Apple');

But that query boils down to be a mere aggregation and doesn't need window functions at all. It is evaluating the count for each single row, only to dismiss duplicates later with DISTINCT. DISTINCT is a costly operation, so why not simply aggregate your rows in the first place?

select company, id, count(*)
from table1
where company in ('Facebook','Apple')
group by company, id;

Now you only want to keep the rows with the highest count per company, and this is where window functions come into play:

select  company, id, total
from
(
  select
    company,
    id,
    count(*) as total,
    max(count(*)) over (partition by company) as max_total
  from table1
  where company in ('Facebook','Apple')
  group by company, id
) aggregated
where total = max_total;
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
1

We can use ROW_NUMBER here along with an aggregation query:

WITH cte AS (
    SELECT Company, ID, COUNT(*) AS Count,
           ROW_NUMBER() OVER (PARTITION BY Company ORDER BY COUNT(*) DESC) rn
    FROM table1
    GROUP BY Company, ID
)

SELECT Company, ID, Count
FROM cte
WHERE rn = 1;

Here is a running demo for MySQL.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • Yes, impressive. It's mind reading even, because you asked for the IDs with the highest overall counts and even added a query to support this statement, when you actually wanted the IDs with the highest count per company, which you clarified in the request comments *after* Tim wrote this query :-) – Thorsten Kettner Feb 09 '22 at 05:20
-1
SELECT company, id, COUNT(*)
  FROM table1
 GROUP BY EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO
HAVING COUNT(*) > 1;
Simas Joneliunas
  • 2,890
  • 20
  • 28
  • 35
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Feb 09 '22 at 04:59
  • This query doesn't report the companies/ids with the highest counts, but includes all with a count greater than one. This is not desired. – Thorsten Kettner Feb 09 '22 at 05:16