7

I have this table

CREATE TABLE `codes` (
 `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `language_id` int(11) unsigned NOT NULL,
 `title` varchar(60) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
 `time_posted` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8

language_id refers to what language the record is in. What I would like to do is retrieve a list of the five most recent (ORDER BY time_posted DESC LIMIT 5) records in each language_id. I could do this in a loop within PHP with a number of different SQL queries but I feel there is a simpler way.

I've got to get a book on SQL, haha.

Thanks.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
James B
  • 219
  • 1
  • 3
  • 6
  • 1
    What SQL engine? That's unfortunately crucial -- the SQL Standard, PostgreSQL, MS SQL Server, Oracle, IBM DB2, etc, etc, have one superb way to do exactly what you want -- but if you're stuck with MySQL, that perfect solution working on every good relational DB and according to the standard itself is not available, so it's fudge-and-kludge time (par for the course for MySQL -- sigh). So what is it gonna be -- every decent SQL implementation on the planet, on one side, or MySQL, on the other...? – Alex Martelli Aug 30 '09 at 04:49
  • @Alex: Quit beating around the bush -- let us know how you feel about MySQL! Don't keep it bottled up. ;-) – Bill Karwin Aug 30 '09 at 04:51

2 Answers2

9

Here's how I solve this "top N per group" type of query in MySQL:

SELECT c1.*
FROM codes c1
LEFT OUTER JOIN codes c2
  ON (c1.language_id = c2.language_id AND c1.time_posted < c2.time_posted)
GROUP BY c1.id
HAVING COUNT(*) < 5;

See also "How do I select multiple items from each group in a mysql query?"

Community
  • 1
  • 1
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Sorry for being daft, but where did the C1 and C2 come from? – James B Aug 30 '09 at 05:08
  • Note that if `time_posted` can have ties you might get some interesting results. Let me know if this is a concern because it is solvable too. – Bill Karwin Aug 30 '09 at 05:20
  • @BillKarwin, this works for me (I have a similar use case), but I don't understand why I'd have to write "HAVING COUNT(\*) < 5" instead of "HAVING COUNT(\*) <= 5" to get a maximum of 5 entries for each group – Alberto Schiabel Jan 22 '18 at 15:35
  • @AlbertoSchiabel: If you want the top 5 per language, then by definition there for a row `c1` to pass the condition, there must be *fewer* than 5 other rows with a greater timestamp. Try it yourself: http://sqlfiddle.com/#!9/5a34d6/1/0 – Bill Karwin Jan 22 '18 at 16:14
1

Here is a great solution I just found.

Select the TOP n Rows For Each Group Arnie Rowland, March 13, 2008

There are Multiple Rows for Each Category, and there is a desire to SELECT ONLY the TOP two (2) Rows per Category by Price. For example, from the following data:

RowID    Category    ID  Description     Price
1        Pot         A1  Small Saucepan  21.50
2        Pot         A2  1 Qt Saucepan   29.95
3        Pot         A3  1.5 Qt Saucepan 33.95
4        Pot         A4  Double Boiler   39.50
5        Pot         A5  Stewpot         49.50
6        Pot         A6  Pressure Cooker 79.95
7        Pan         B1  8" Pie          6.95
8        Pan         B2  8" Sq Cake      7.50
9        Pan         B3  Bundt Cake      12.50
10       Pan         B4  9x12 Brownie    7.95
11       Bowl        C1  Lg Mixing       27.50
12       Bowl        C2  Sm Mixing       17.50
13       Tools       T1  14" Spatula     9.95

The desired output is:

RowID    Category    ID  Description     Price
11       Bowl        C1  Lg Mixing       27.50
12       Bowl        C2  Sm Mixing       17.50
9        Pan         B3  Bundt Cake      12.50
10       Pan         B4  9x12 Brownie    7.95
6        Pot         A6  Pressure Cooker 79.95
5        Pot         A5  Stewpot         49.50
13       Tools       T1  14" Spatula     9.95

There are several methods to accomplish the desired output. This demonstration provides a Solution for SQL Server 2005 / SQL Server 2008 , and then a Solution for SQL Server 2000.

Create Sample Data for Both Solutions

-- Suppress data loading messages
SET NOCOUNT ON

-- Create Sample Data using a Table Variable
DECLARE @MyTable table
   (  RowID         int   IDENTITY, 
      Category      varchar(5),
      [ID]          varchar(5),
      [Description] varchar(25),
      Price         decimal(10,2)
   )

-- Load Sample Data

INSERT INTO @MyTable VALUES ( 'Pot', 'A1', 'Small Saucepan', 21.50 )
INSERT INTO @MyTable VALUES ( 'Pot', 'A2', '1 Qt Saucepan', 29.95 )
INSERT INTO @MyTable VALUES ( 'Pot', 'A3', '1.5 Qt Saucepan', 33.95 )
INSERT INTO @MyTable VALUES ( 'Pot', 'A4', 'Double Boiler', 39.50 )
INSERT INTO @MyTable VALUES ( 'Pot', 'A5', 'Stewpot', 49.50 )
INSERT INTO @MyTable VALUES ( 'Pot', 'A6', 'Pressure Cooker', 79.95 )
INSERT INTO @MyTable VALUES ( 'Pan', 'B1', '8"" Pie', 6.95 )
INSERT INTO @MyTable VALUES ( 'Pan', 'B2', '8"" Sq Cake', 7.50 )
INSERT INTO @MyTable VALUES ( 'Pan', 'B3', 'Bundt Cake', 12.50 )
INSERT INTO @MyTable VALUES ( 'Pan', 'B4', '9x12 Brownie', 7.95 )
INSERT INTO @MyTable VALUES ( 'Bowl', 'C1', 'Lg Mixing', 27.50 )
INSERT INTO @MyTable VALUES ( 'Bowl', 'C2', 'Sm Mixing', 17.50 )
INSERT INTO @MyTable VALUES ( 'Tools', 'T1', '14"" Spatula', 9.95 )
Return to Top

SQL Server 2005 / SQL Server 2008 Solution

--Query to Retrieve Desired Data
SELECT
   RowID,
   Category,
   [ID],
   [Description],
   Price
FROM (SELECT
         ROW_NUMBER() OVER ( PARTITION BY Category ORDER BY Price DESC ) AS 'RowNumber',
         RowID,
         Category,
         [ID],
         [Description],
         Price
      FROM @MyTable
      ) dt
WHERE RowNumber <= 2

-- Results
RowID Category  ID Description     Price
11    Bowl      C1 Lg Mixing       27.50
12    Bowl      C2 Sm Mixing       17.50
9     Pan       B3 Bundt Cake      12.50
10    Pan       B4 9x12 Brownie    7.95
6     Pot       A6 Pressure Cooker 79.95
5     Pot       A5 Stewpot         49.50
13    Tools     T1 14" Spatula     9.95
Return to Top

SQL Server 2005 / SQL Server 2008 Solution using a CTE (Added by: Jacob Sebastian)

-- Define a CTE with the name "dt" 
;WITH dt AS (
     SELECT
         ROW_NUMBER() OVER ( PARTITION BY Category ORDER BY Price DESC ) AS 'RowNumber',
         RowID,
         Category,
         [ID],
         [Description],
         Price
      FROM @MyTable
)
-- and select the data from the CTE
SELECT
   RowID,
   Category,
   [ID],
   [Description],
   Price
FROM dt
WHERE RowNumber <= 2

-- Results
RowID Category  ID Description     Price
11    Bowl      C1 Lg Mixing       27.50
12    Bowl      C2 Sm Mixing       17.50
9     Pan       B3 Bundt Cake      12.50
10    Pan       B4 9x12 Brownie    7.95
6     Pot       A6 Pressure Cooker 79.95
5     Pot       A5 Stewpot         49.50
13    Tools     T1 14" Spatula     9.95
Return to Top

SQL 2000 Solution

--Query to Retrieve Desired Data
SELECT DISTINCT
   RowID,
   Category,
   [ID],
   [Description],
   Price
FROM @MyTable t1
WHERE RowID IN (SELECT TOP 2
                   RowID
                FROM @MyTable t2
                WHERE t2.Category = t1.Category
                ORDER BY Price DESC
               )
ORDER BY 
   Category,
   Price DESC

-- Results
RowID Category  ID Description     Price
11    Bowl      C1 Lg Mixing       27.50
12    Bowl      C2 Sm Mixing       17.50
9     Pan       B3 Bundt Cake      12.50
10    Pan       B4 9x12 Brownie    7.95
6     Pot       A6 Pressure Cooker 79.95
5     Pot       A5 Stewpot         49.50
13    Tools     T1 14" Spatula     9.95

From: Select the TOP n Rows For Each Group

C B
  • 1,677
  • 6
  • 18
  • 20
Mike
  • 1,302
  • 6
  • 23
  • 43
  • Note that link-only answers are discouraged, SO answers should be the end-point of a search for a solution (vs. yet another stopover of references, which tend to get stale over time). Please consider adding a stand-alone synopsis here, keeping the link as a reference. – kleopatra Jul 27 '13 at 09:03