-1

Looking to know how to query the top 10 occurrences of a value in a column .

Schema as follows:

CREATE TABLE Donations (
  id varchar(255) PRIMARY KEY,
  charityId varchar(255) NOT NULL,
  amount integer, 
  createdAt timestamp,
  updatedAt timestamp
);

Imagine this table has 1m records. Looking to know how I can get an output that would return rows that contain the the top 10 charityId's across all records; along with the total number of records that contain that charityId.

Example data would be something like the following:

| id   | charity id  | amount | created at                   | updated at                   |
|------|-------------|--------|------------------------------|------------------------------|
| "1"  | "charity-1" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "2"  | "charity-1" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "3"  | "charity-1" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "4"  | "charity-1" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "5"  | "charity-1" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "6"  | "charity-1" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "7"  | "charity-1" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "8"  | "charity-1" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "9"  | "charity-1" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "10" | "charity-2" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "11" | "charity-2" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "12" | "charity-2" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "13" | "charity-2" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "14" | "charity-3" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "15" | "charity-3" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "16" | "charity-3" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "17" | "charity-4" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "18" | "charity-4" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "19" | "charity-5" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "20" | "charity-5" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |
| "21" | "charity-6" | 100    | "2022-03-21 04:08:53.848451" | "2022-03-21 04:08:53.848451" |

An ideal output would be something like:

| charity id  | occurrences                  |
|-------------|------------------------------|
| "charity-1" | 9                            |
| "charity-2" | 4                            |
| "charity-3" | 3                            |
| "charity-4" | 2                            |
| "charity-5" | 2                            |
Ken White
  • 123,280
  • 14
  • 225
  • 444
user2737876
  • 1,038
  • 2
  • 12
  • 20
  • Can you give some sample rows for the data? Explain with an example if possible. – Darshit Parmar Mar 21 '22 at 02:44
  • 1
    @DarshitParmar I have added an example data set of 20 rows and an expected output - hope this helps. – user2737876 Mar 21 '22 at 04:16
  • `select charity_id, count(*) as occurrencs from 'Donations' group by charity_id order by occurences desc limit 10;` – Sam Dolan Mar 21 '22 at 04:20
  • 1
    Questions asking for homework help must include a summary of the work you've done so far to solve the problem, and a description of the difficulty you are having solving it. Please read [How to ask homework questions](//meta.stackoverflow.com/q/334822) and [edit] your post. – Ken White Mar 21 '22 at 04:29
  • 1
    "Top 10" is not clearly defined without a tiebreaker. What if no. 10 and 11 share the same number of occurrences? – Erwin Brandstetter Mar 21 '22 at 05:07

3 Answers3

2

Since no tiebreaker has been defined, return results "with ties":

SELECT charityid, count(*) AS occurrencs
FROM   donations
GROUP  BY charityid
ORDER  BY occurrencs DESC  -- no tiebeaker?
FETCH  FIRST 10 ROWS WITH TIES;

See:

And use count(*) here. Faster, and equivalent while no null values are involved.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0

By use of group by , count :

SELECT charityId , count(charityId )  As Occurrencs 
    from Donations 
    group by charityId 
    Order by Occurrencs Desc , charityId 
    LIMIT 10
Damini Suthar
  • 1,470
  • 2
  • 14
  • 43
0

basic query :

  • Aggregation by "charityId"
  • Count rows
  • Order by count desc
  • Keep only first 10 reccords
SELECT "charityId" as "charity id", count(*)  As "occurrences" 
FROM "Donations" 
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10
JB D
  • 101
  • 2