3

Problem: I want to list n number of games from each genre (order not important)

The following MySQL query resides inside a ColdFusion function. It is meant to list all games under a platform (for example, list all PS3 games; list all Xbox 360 games; etc...). The variable for PlatformID is passed through the URL. I have 9 genres, and I would like to list 10 games from each genre.

        SELECT
            games.GameID AS GameID,
            games.GameReleaseDate AS rDate,                
            titles.TitleName AS tName,
            titles.TitleShortDescription AS sDesc,
            genres.GenreName AS gName,
            platforms.PlatformID,
            platforms.PlatformName AS pName,
            platforms.PlatformAbbreviation AS pAbbr
        FROM
            (((games join titles on((games.TitleID = titles.TitleID))) join genres on((genres.GenreID = games.GenreID))) join platforms on((platforms.PlatformID = games.PlatformID)))
        WHERE
            (games.PlatformID = '#ARGUMENTS.PlatformID#')
        ORDER BY
            GenreName ASC,
            GameReleaseDate DESC

Once the query results come back I group them in ColdFusion as follows:

<cfoutput query="ListGames" group="gName"> (first loop which lists genres)
#ListGames.gName#
      <cfoutput> (nested loop which lists games)
      #ListGames.tName#
      </cfoutput>
</cfoutput>

The problem is that I only want 10 games from each genre to be listed. If I place a "limit" of 50 in the SQL, I will get ~ 50 games of the same genre (depending on how much games of that genre there are). The second issue is I don't want the overload of querying the database for all games when each person will only look at a few.

What is the correct way to do this?

Many thanks!

Mohamad
  • 34,731
  • 32
  • 140
  • 219
  • Sorry, I edited the question. It's MySQL. – Mohamad Mar 06 '10 at 22:19
  • Which ten games? Random 10? Any 10? First 10 when sorted by something? – Mark Byers Mar 06 '10 at 22:19
  • Maybe a combination of GROUP BY and LIMIT is your solution. – duffymo Mar 06 '10 at 22:31
  • Sorry, I just became aware that I posted the wrong query, which is why it lacked so many details. My apologies, and I've updated it. – Mohamad Mar 06 '10 at 23:07
  • @Mel: The results were not what I expect - the rank should be increasing. It doesn't make sense to get that output, using genrename or genreid. – OMG Ponies Mar 07 '10 at 01:30
  • @OMG Ponies, do you think there is something wrong with my table structure? – Mohamad Mar 07 '10 at 01:39
  • @Mel: No, your table structure strikes me as 3NF - as good as I can hope for. I'd read that the variable approach can be unreliable, and I admit I've only used it on very small datasets. MySQL's lack of functionality leads me to reach for Postgres (or free versions of Oracle & SQL Server) when I encounter situations like these :( – OMG Ponies Mar 07 '10 at 01:48
  • Thanks for looking at this. I suppose I will have to think of another way to handle this! FYI, I have not enforced any foreign keys in my DB yet. I don't know if this affects what I'm trying to do, but I thought you should know. I find foreign keys annoying when in development stage, so I will only enforce them when my table structure if finalized. – Mohamad Mar 07 '10 at 03:01

5 Answers5

3

I'm a little rusty so I'm sure there's some bugs and syntax errors in here. This stored procedure should create a temporary table with the data you're looking for

CREATE PROCEDURE topbygenre(num INT, platformID INT)
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE a INT;
  DECLARE cur CURSOR FOR SELECT GenreID FROM genres;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
  DECLARE first INT DEFAULT 1;

  OPEN cur;

  REPEAT
    FETCH cur INTO a;
    IF NOT done THEN
      IF first THEN
        CREATE TEMPORARY TABLE TopGames
        SELECT
          games.GameID AS GameID,
          games.GameReleaseDate AS rDate,                
          titles.TitleName AS tName,
          titles.TitleShortDescription AS sDesc,
          genres.GenreName AS gName,
          platforms.PlatformID,
          platforms.PlatformName AS pName,
          platforms.PlatformAbbreviation AS pAbbr
        FROM
          (((games join titles on((games.TitleID = titles.TitleID))) join genres on((genres.GenreID = games.GenreID))) join platforms on((platforms.PlatformID = games.PlatformID)))
        WHERE
          (games.PlatformID = platformID)
          AND games.GenreID = a
        ORDER BY
          GenreName ASC,
          GameReleaseDate DESC
        LIMIT num;
        SET first = 0;
      ELSE
        INSERT INTO TopGames
        SELECT
          games.GameID AS GameID,
          games.GameReleaseDate AS rDate,                
          titles.TitleName AS tName,
          titles.TitleShortDescription AS sDesc,
          genres.GenreName AS gName,
          platforms.PlatformID,
          platforms.PlatformName AS pName,
          platforms.PlatformAbbreviation AS pAbbr
        FROM
          (((games join titles on((games.TitleID = titles.TitleID))) join genres on((genres.GenreID = games.GenreID))) join platforms on((platforms.PlatformID = games.PlatformID)))
        WHERE
          (games.PlatformID = platformID)
          AND games.GenreID = a
        ORDER BY
          GenreName ASC,
          GameReleaseDate DESC
        LIMIT num;
      END IF;
    END IF;
  UNTIL done END REPEAT;

  CLOSE cur;
END

Then execute tihs every time you need the data:

topbygenre(10, '#ARGUMENTS.PlatformID#');
select * from TopGames;
Michael Lenzen
  • 822
  • 8
  • 7
  • @LenzM, wow, I had no idea this could be so complicated. Imagined there would be a simpler solution. Many thanks for taking the time to write this code. I will test it and let you know how it works. – Mohamad Mar 07 '10 at 21:00
1

You could do a Coldfusion query of queries, but I believe that the solution resides in MySQL. That way you have the best control over the results you get back and frankly MySQL will sort and group this type of data much faster than Coldfusion.

Either you could get fancy with some grouping, limit and possibly a subselect. Alternately you can split your select statements up into one query per genre. I don't know how many people you'll have viewing this database, but 9 queries that each return 10 results is kids play for the database server. I'm sure it would be preferable to have the 90 results return in a single query for simplicity of output. That can be done, but the SELECT statement is going to be much more complicated. If you go that route, I'd recommend a stored procedure.

Dan Sorensen
  • 11,403
  • 19
  • 67
  • 100
  • Dan, thank you for your input. Could you please explain why and how a stored procedure would provide a better solution? Isn't a stored procedure just SQL stored in the database rather than on a page? Sorry if my question sounds stupid, but I consider myself a novice. – Mohamad Mar 07 '10 at 19:10
  • Not stupid question at all, either one would work. Using a stored procedure would allow you to do some more complicated logic and return a single query in the order you'd like to see straight from the database. You could do the same thing with a query of queries on the Coldfusion side if you'd like. For a small dataset like this, it's mainly up to your preference where you want to put the logic. a) internal to the database and return a straight forward dataset or b) return raw data to Coldfusion and process it there where you may be more comfortable handling the logic, debugging and changes. – Dan Sorensen Mar 08 '10 at 03:46
1

In Oracle I would probably play with Groups HAVING(ROWNUM) <= 10 but I do not think MYSQL supports that.

If you have too many rows to just pull all games down and iterate in group logic then I agree with the above poster that breaking it out into separate queries while not optimum may be fine if you are not running a lot of concurrent users and use a bit of caching.

My first try would be something like this - while not tested may give you some ideas. so let me know if this is close - at least it is simple (note the maxrows="10" and check that it respects the Order By)...

 <cfquery name="Genres" datasource="#Application.DB#" cachedWithin="#createTimeSpan(0,0,30,0)#">
 SELECT 
   Distinct (GenreName) as UniqueGenreName
 FROM
   games join genres on(genres.GenreID = games.GenreID)
 WHERE
   games.PlatformID = <CFQUERYPARAM VALUE="#ARGUMENTS.PlatformID#" CFSQLTYPE="CF_SQL_VARCHAR">
 ORDER BY
   GenreName
 </cfquery>

 <!--- Table header here --->

 <cfloop query ="Genres">

   <cfquery name="SubGenres" datasource="#Application.DB#" maxrows="10" cachedWithin="#createTimeSpan(0,0,30,0)#">
    SELECT
             games.GameID AS GameID,
             games.GameReleaseDate AS rDate,                
             titles.TitleName AS tName,
             titles.TitleShortDescription AS sDesc,
             genres.GenreName AS gName,
             platforms.PlatformID,
             platforms.PlatformName AS pName,
             platforms.PlatformAbbreviation AS pAbbr
         FROM
             (((games join titles on((games.TitleID = titles.TitleID))) join genres on((genres.GenreID = games.GenreID))) join platforms on((platforms.PlatformID = games.PlatformID)))
         WHERE
             games.PlatformID = <CFQUERYPARAM VALUE="#ARGUMENTS.PlatformID#" CFSQLTYPE="CF_SQL_VARCHAR"> AND genres.GenreName = <CFQUERYPARAM VALUE="#UniqueGenreName#" CFSQLTYPE="CF_SQL_VARCHAR">
         ORDER BY
             GameReleaseDate DESC
    </cfquery>

    <cfoutput query ="SubGenres">
    <!--- Table rows here --->
    </cfoutput>

 </cfloop>

 <!--- Table footer here --->
kevink
  • 1,958
  • 3
  • 14
  • 14
1

You say you want 10 games from each genre, but don't specify which 10 games you want. I assume from the ordering (GenreName, GameReleaseDate) that you want to display the 10 most recently released games.

It sounds like you want to use a SELECT TOP n - type clause. Someone else on SO asked about a database-agnostic method for selecting the top n records.

Using the answer from that question, you could try this:

    SELECT
        games.GameID AS GameID,
        games.GameReleaseDate AS rDate,                
        titles.TitleName AS tName,
        titles.TitleShortDescription AS sDesc,
        genres.GenreName AS gName,
        platforms.PlatformID,
        platforms.PlatformName AS pName,
        platforms.PlatformAbbreviation AS pAbbr
    FROM
        (((games join titles on((games.TitleID = titles.TitleID))) join genres on((genres.GenreID = games.GenreID))) join platforms on((platforms.PlatformID = games.PlatformID)))
    WHERE
        (games.PlatformID = '#ARGUMENTS.PlatformID#')
        AND (SELECT COUNT(*) FROM games AS NewerGames where NewerGames.PlatformID = games.PlatformID AND games.GenreID=NewerGames.GenreID AND (NewerGames.GameReleaseDate < Games.GameReleaseDate OR (NewerGames.GameReleaseDate = Games.GameReleaseDate AND NewerGames.GameID > games.GameID))) <= 10
    ORDER BY
        GenreName ASC,
        GameReleaseDate DESC

I haven't tested it but I'm pretty sure it should work. Give it a try, if it works it's a simple answer to the problem.

Note that one of the criteria -- NewerGames.GameID > games.GameID is there solely to avoid the problem of there being more than one game with the same release date. In these cases, the game that has a higher id (and therefore, in theory, slightly "younger") will get filtered out last.

Community
  • 1
  • 1
Jordan Reiter
  • 20,467
  • 11
  • 95
  • 161
  • Thanks Jordan. It works but the older release dates are coming back. I revered the rdate order, but it didn't work. Interestingly too, it adds an extra game to the number you assign it: So specifying 10 returns 11 games. – Mohamad Mar 08 '10 at 14:33
  • Sorry, looks like you need to change <= 10 to < 10. Not sure why you're getting older entries. – Jordan Reiter Mar 09 '10 at 01:23
1

HTH http://blog.aharbick.com/2006/09/mysql-groupwise-limiting.html

genericHCU
  • 4,394
  • 2
  • 22
  • 34
  • Travis, this is assuming a none normalized data structure. Would this work with multiple joins? – Mohamad Mar 08 '10 at 14:37