2

I have a database of baseball plays with a PlayerID and a TypeID (the kind of play: double, strike out, etc). The data looks something like this:

+----------+--------+
| playerid | typeid |
+----------+--------+
|        2 |      4 |
|        2 |      4 |
|        2 |      7 |
|        3 |      7 |
|        3 |      7 |
|        3 |      7 |
|        3 |     26 |
|        3 |      7 |

I'm trying to find which players had the most of each kind of play. E.g. Jim (PlayerID 3) had the most strike outs (TypeID 7) and Bob (PlayerID 2) had the most home runs (TypeID 4) which should result in the following table:

+----------+--------+----------------+
| playerid | typeid | max(playcount) |
+----------+--------+----------------+
|        2 |      4 |             12 |
|        3 |      7 |              9 |
|        3 |     26 |              1 |

My best attempt so far is to run:

SELECT playerid,typeid,MAX(playcount) FROM 
(
SELECT playerid,typeid,COUNT(*) playcount FROM plays GROUP BY playerid,typeid
) AS t GROUP BY typeid;

Which returns the proper maximums of each type, but the associated PlayerIDs are all wrong and I can't figure out why. I'm sure I'm missing something simple (or making this overly complicated) but can't figure it out. Any ideas?

vityav
  • 268
  • 3
  • 14

3 Answers3

2

In MySQL this group=wise maximum it is sadly not a simply as you want it to be.

Here's a way to do it using a method similar to what is suggested in ROW_NUMBER() in MySQL

SELECT a.*
  FROM (
        SELECT playerid
              ,typeid
              ,COUNT(*) playcount 
          FROM plays 
         GROUP BY playerid,typeid
        ) a
  LEFT JOIN
        (
        SELECT playerid
              ,typeid
              ,COUNT(*) playcount 
          FROM plays 
         GROUP BY playerid,typeid
        ) b
    ON a.typeid = b.typeid
   AND a.playcount < b.playcount
WHERE b.playerid IS NULL
Community
  • 1
  • 1
Karl Kieninger
  • 8,841
  • 2
  • 33
  • 49
  • Other than "@lays" instead of "plays", it's perfect. I kept seeing this sort of thing and not realizing it actually did what I was trying to do. Thanks! – vityav Jun 16 '15 at 03:49
1

Would this work?

SELECT
    playertypecounts.*
FROM
    (SELECT
       playerid,
       typeid,
       COUNT(*) as playcount
     FROM plays
     GROUP BY playerid, typeid) playertypecounts
INNER JOIN
    (SELECT
        typeid,
        MAX(playcount) as maxplaycount
     FROM 
        (SELECT
           playerid,
           typeid,
           COUNT(*) as playcount
         FROM plays
         GROUP BY playerid, typeid) playcounts
     GROUP BY typeid) maxplaycounts
ON playertypecounts.typeid = maxplaycounts.typeid
AND playertypecounts.playcount = maxplaycounts.maxplaycount

This part of the query block returns the maximum playcount for each typeid:

(SELECT
    typeid,
    MAX(playcount) as maxplaycount
 FROM 
    (SELECT
       playerid,
       typeid,
       COUNT(*) as playcount
     FROM plays
     GROUP BY playerid, typeid) playcounts
 GROUP BY typeid) maxplaycounts

Then it's inner-joined to all the typeid/playcounts in order to filter those counts where the player(s) have the maximum counts for any given typeid.

See SQLFiddle example.

Having said all that, I actually prefer @KarlKieninger's answer since it's more elegant.

Alex Woolford
  • 4,433
  • 11
  • 47
  • 80
  • I'm not quite sure what this returned. It was in the right format, but none of the data aligned (wrong players had the wrong types and the maximums for each type were wrong). It looks almost just like the other answer that worked other than being an inner join and selecting the max. Not sure why that would mess with it? – vityav Jun 16 '15 at 03:54
1

you have to put playerid column also in group by clause. rest all is ok.

SELECT playerid,typeid,MAX(playcount) FROM 
(
SELECT playerid,typeid,COUNT(*) playcount FROM plays GROUP BY playerid,typeid
) AS t GROUP BY playerid,typeid;
Mike Clark
  • 1,860
  • 14
  • 21