1

I have a table (pretty big one) with lots of columns, two of them being "post" and "user". For a given "post", I want to know which "user" posted the most.

I was first thinking about getting all the entries WHERE (post='wanted_post') and then throw a PHP hack to find which "user" value I get the most, but given the large size of my table, and my poor knowledge of MySQL subtle calls, I am looking for a pure-MySQL way to get this value (the "user" id that posted the most on a given "post", basically).

Is it possible ? Or should I fall back on the hybrid SQL-PHP solution ?

Thanks, Cystack

Jonathan Hall
  • 75,165
  • 16
  • 143
  • 189
Cystack
  • 3,301
  • 5
  • 35
  • 33

1 Answers1

5

It sounds like this is what you want... am I missing something?

SELECT user
FROM myTable
WHERE post='wanted_post'
GROUP BY user
ORDER BY COUNT(*) DESC
LIMIT 1;

EDIT: Explanation of what this query does:

Hopefully the first three lines make sense to anyone familiar with SQL. It's the last three lines that do the fun stuff.

  1. GROUP BY user -- This collapses rows with identical values in the user column. If this was the last line in the query, we might expect output something like this:

    +-------+
    | user  |
    +-------+
    | bob   |
    | alice |
    | joe   |
    
  2. ORDER BY COUNT(*) DESC -- COUNT(*) is an aggregate function, that works along with the previous GROUP BY clause. It tallies all of the rows that are "collapsed" by the GROUP BY for each user. It might be easier to understand what it's doing with a slightly modified statement, and it's potential output:

    SELECT user,COUNT(*)
    FROM myTable
    WHERE post='wanted_post'
    GROUP BY user;
    
    +-------+-------+
    | user  | count |
    +-------+-------+
    | bob   |     3 |
    | alice |     1 |
    | joe   |     8 |
    

    This is showing the number of posts per user.

    However, it's not strictly necessary to actually output the value of an aggregate function in this case--we can just use it for the ordering, and never actually output the data. (Of course if you want to know how many posts your top-poster posted, maybe you do want to include it in your output, as well.)

    The DESC keyword tells the database to sort in descending order, rather than the default of ascending order.

    Naturally, the sorted output would look something like this (assuming we leave the COUNT(*) in the SELECT list):

    +-------+-------+
    | user  | count |
    +-------+-------+
    | joe   |     8 |
    | bob   |     3 |
    | alice |     1 |
    
  3. LIMIT 1 -- This is probably the easiest to understand, as it just limits how many rows are returned. Since we're sorting the list from most-posts to fewest-posts, and we only want the top poster, we just need the first result. If you wanted the top 3 posters, you might instead use LIMIT 3.

Jonathan Hall
  • 75,165
  • 16
  • 143
  • 189
  • Looks like it is. Apparently MySQL is way more powerful than I thought. Thanks again – Cystack Jul 10 '11 at 08:06
  • Wow, I don't hear people calling MySQL "powerful" very often :) – Jonathan Hall Jul 10 '11 at 08:08
  • But see, the syntax is unclear. Without the "ORDER BY" line, I'd get all the "user" that are linked to a "post", right ? How come when you add the ORDER BY, the COUNT(*) applies to each group separatey and not to the whoel request (thus returning the number of distinct "user") ? – Cystack Jul 10 '11 at 08:08
  • @Cystack: The `ORDER BY` and `LIMIT` are applied after the `GROUP BY` which is applied after the `WHERE`. See my answer, here, for the order of evaluation of clauses in queries: http://stackoverflow.com/questions/6545664/using-case-expression-column-in-where-clause/6545685#6545685 – ypercubeᵀᴹ Jul 10 '11 at 08:13
  • 1
    holy schmoly, that makes MySQL as powerful as cascading stylesheets ! Awesome ! – Cystack Jul 10 '11 at 08:14
  • @Cystack: I've added what is probably way _too_ much explanation, but I was feeling bored and/or generous... let me know if you need more clarification :) – Jonathan Hall Jul 10 '11 at 08:20
  • 2
    LOL! "holy schmoly, that makes MySQL as powerful as cascading stylesheets" -- Can I keep this quote? :) – Jonathan Hall Jul 10 '11 at 08:20
  • 1
    @Cystack: Relational databases are much more powerful than stylesheets. – ypercubeᵀᴹ Jul 10 '11 at 08:20
  • Keep all the quotes ! you gave me more than expected ! Thanks a lot for the lesson, thank you all, thank you stackoverflow ! – Cystack Jul 10 '11 at 08:24