-1

So I am banging my head on the wall over this problem, and thought I could use some help.

I have a table: Submissions, which as follows:

+-----------------+---------------+-----------+-------+
| submission_date | submission_id | hacker_id | score |
+-----------------+---------------+-----------+-------+
| 2016-03-01      |          8494 |     20703 |     0 |
| 2016-03-01      |         22403 |     53473 |    15 |
| 2016-03-01      |         23965 |     79722 |    60 |
| 2016-03-01      |         30173 |     36396 |    70 |
| 2016-03-02      |         34928 |     20703 |     0 |
| 2016-03-02      |         38740 |     15758 |    60 |
| 2016-03-02      |         42769 |     79722 |    25 |
| 2016-03-02      |         44364 |     79722 |    60 |
| 2016-03-03      |         45440 |     20703 |     0 |
| 2016-03-03      |         49050 |     36396 |    70 |
| 2016-03-03      |         50273 |     79722 |     5 |
| 2016-03-04      |         50344 |     20703 |     0 |
| 2016-03-04      |         51360 |     44065 |    90 |
| 2016-03-04      |         54404 |     53473 |    65 |
| 2016-03-04      |         61533 |     79722 |    45 |
| 2016-03-05      |         72852 |     20703 |     0 |
| 2016-03-05      |         74546 |     38289 |     0 |
| 2016-03-05      |         76487 |     62529 |     0 |
| 2016-03-05      |         82439 |     36396 |    10 |
| 2016-03-05      |         90006 |     36396 |    40 |
| 2016-03-06      |         90404 |     20703 |     0 |
+-----------------+---------------+-----------+-------+

I am trying to get the hacker_id and the number of contributions of the top contributor each day. This is where I'm at right now:

SELECT submission_date
     , hacker_id
     , COUNT(hacker_id) 
  FROM Submissions 
 GROUP 
    BY hacker_id
     , submission_date 
 ORDER 
    BY COUNT(submission_id) DESC
     , HACKER_ID;

Which gives me the number of contributors each contributor makes each day, but not an organized table with a single line per day that I am looking for. Any idea how to improve my code?

Strawberry
  • 33,750
  • 13
  • 40
  • 57
  • 1
    What is your expect result? – forpas Dec 25 '19 at 11:24
  • 1
    put your data into here and link it http://sqlfiddle.com/ – JGFMK Dec 25 '19 at 11:28
  • I guess you have to use group_concat. if you want to have single line for each day. – Long Luong Dec 25 '19 at 11:30
  • This is a faq. Before considering posting please always google any error message or many clear, concise & precise phrasings of your question/problem/goal, with & without your particular strings/names & site:stackoverflow.com & tags, & read many answers. If you post a question, use one phrasing as title. See [ask] & the voting arrow mouseover texts. This even has a tag: [tag:greatest-n-per-group]. – philipxy Dec 27 '19 at 06:31
  • Please in code questions give a [mre]--cut & paste & runnable code; example input (as initialization code) with desired & actual output (including verbatim error messages); tags & versions; clear specification & explanation. For errors that includes the least code you can give that is code that you show is OK extended by code that you show is not OK. (Debugging fundamental.) For SQL that includes DBMS & DDL, which includes constraints & indexes & tabular initialization. – philipxy Dec 27 '19 at 06:31

3 Answers3

1

I would suggest row_number():

select dh.*
from (select submission_date, hacker_id, count(*) as cnt,
             row_number() over (partition by submission_date order by count(*) desc) as seqnum
      from submissions s
      group by submission_date, hacker_id
     ) dh
where dh.seqnum = 1;

This returns one row per date. If there are ties, an arbitrary row is returned. For all values in the event of ties, use rank() instead of row_number().

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • requires mysql 8/mariadb 10.2. if using row_number, not rank, I would recommend `order by count(*) desc, hacker_id` just so the results are determinable, not arbitrary – ysth Dec 26 '19 at 19:26
0
SELECT submission_date, MAX(submissions) submissions, SUBSTR(MAX(CONCAT(LPAD(submissions,11,'0'),hacker_id)),12) hacker_id
FROM (
    SELECT submission_date, hacker_id, COUNT(hacker_id) submissions
    FROM Submissions
    GROUP BY hacker_id, submission_date
) days
GROUP BY submission_date

This attaches hacker_id to the number of submissions for each row in a group, in order to be able to select the hacker with the most submissions with max() and then extract just the hacker_id. It's a variant on https://stackoverflow.com/a/15422121 which does the same thing only looking for latest time rather than a maximum count.

If you have support for window functions (mysql 8 or higher, mariadb 10.2 or higher), see https://stackoverflow.com/a/59478665 instead. It's also possible to emulate that approach using variables in earlier versions, but usually the substring-max hack in this answer is easier and fast enough (as long as you don't want all tied rows to appear).

ysth
  • 96,171
  • 6
  • 121
  • 214
0

This works on V8. Older editions may not support Common Table Expressions (aka CTE). You didn't specify a version...

https://www.db-fiddle.com/f/cSuaJJ5WwAZt4UK2kLpMCV/0

with cte_max_score as (
  select  submission_date, max(score) as highest_score
  from Submissions
  group by submission_date
  order by submission_date
 )
 select ms.submission_date, ms.highest_score, s.hacker_id, s.submission_id 
 from cte_max_score as ms
 left join Submissions as s
 on 
 ms.submission_date = s.submission_date and
 ms.highest_score = s.score

enter image description here

If you are on an older edition, a temporary table can be used as a substitute for the CTE.

JGFMK
  • 8,425
  • 4
  • 58
  • 92
  • if you have support for CTE, you have support for window functions, and that's a much more efficient way to do it. if you don't, a subquery is more likely what you want than a temporary table. but with either, this approach can get very slow with lots of records – ysth Dec 26 '19 at 19:23