5

I want to limit the size of records inside a group, and here is my trial, how to do it right?

mysql> select * from accounts limit 5 group by type;

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'group by type' at line 1

Jocelyn
  • 11,209
  • 10
  • 43
  • 60
omg
  • 136,412
  • 142
  • 288
  • 348

8 Answers8

5

The point of an aggregate function (and the GROUP BY it requires) is to turn many rows into one row. So if you really just want the top 5 savings accounts and the top 5 chequing accounts and the top 5 USD accounts etc., what you need is more like this:

criteria: top 5 of particular account type by account_balance

SELECT account_type, account_balance FROM accounts WHERE account_type='savings' 
   ORDER BY account_balance DESC LIMIT 5
UNION
SELECT account_type, account_balance FROM accounts WHERE account_type='chequing' 
   ORDER BY account_balance DESC LIMIT 5
UNION
SELECT account_type, account_balance FROM accounts WHERE account_type='USD' 
   ORDER BY account_balance DESC LIMIT 5;

It's not pretty, but if you construct the SQL with a script then subbing in the account_types and concatenating together a query is straightforward.

dnagirl
  • 20,196
  • 13
  • 80
  • 123
5

I've had some luck with using numbered rows:

set @type = '';
set @num = 0;

select
  items.*,
  @num := if(@type = item_type, @num + 1, 1) as dummy_1,
  @type := item_type as dummy_2,
  @num as row_number
from items
group by
  item_type,
  row_number
having row_number < 3;

This will give you 2 results per item_type. (One gotcha: make sure you re-run the first two set statements otherwise your row numbers will steadily get higher and higher and the row_number < 3 restriction won't work.

I pieced this together from a couple of posts which have been linked in other answers on SO.

Community
  • 1
  • 1
Adam Prescott
  • 1,145
  • 2
  • 11
  • 17
  • 1
    This is the best answer. One thing you can do is to use something like this: `from items, (SELECT @type:='', @num:=0) v` and then you can set the variables inside the join, instead of in a separate statement. That can be helpful in environments that don't allow multiple statements. – Chris Strickland Mar 25 '15 at 05:36
1

Try placing the LIMIT clause after the GROUP BY clause.

EDIT: Try this:

SELECT * 
FROM accounts a1
WHERE 5 > 
(
   SELECT COUNT(*)
   FROM accounts a2
   WHERE a2.type = a1.type
   AND a2.balance > a1.balance
)

This returns at most 5 accounts of each type with the biggest balances.

Ken Keenan
  • 9,818
  • 5
  • 32
  • 49
1

It appears you want to limit the number of rows returned within each group of your overall result set... this is difficult to do in a way that scales well. One technique is to perform N joins on the same table with the conditions such that the only rows that match are the top/bottom N that you want.

this page may offer some additional insight into your solution... although returning the top 5 in each group is going to get ugly fast.

Ty W
  • 6,694
  • 4
  • 28
  • 36
0

Group by is used for aggregate functions (sums, averages...)

Is allows you to split the aggregate result into groups. You have not used one of these functions.

John Barrett
  • 545
  • 4
  • 9
0

I am not sure you can use a limit in the group by. You can probably use it if your group by is a sub select that returns one row/value. For example:

select * from foo order by (select foo2.id from foo2 limit 1)

I am just guessing this would work.

northpole
  • 10,244
  • 7
  • 35
  • 58
0

This will probably do the trick, although if type isn't indexed, it'll be sloooowwww. And even with one, it's not especially fast:

SELECT a.*
FROM accounts a
     LEFT JOIN accounts a2 ON (a2.type = a.type AND a2.id < a.id)
WHERE count(a2.id) < 5
GROUP BY a.id;

A better bet would be to just order the list by type and then use a loop at the business layer to remove the rows you don't want.

Plutor
  • 2,867
  • 2
  • 25
  • 29
0

@dnagirl's answer almost has it, but for some reason, my version of MySQL only returns the first LIMIT'd set. To get around that, I put each statement into a subquery

SELECT * FROM (
    SELECT account_type, account_balance FROM accounts WHERE account_type='savings' 
       ORDER BY account_balance DESC LIMIT 5
) as a
UNION
SELECT* FROM (
    SELECT account_type, account_balance FROM accounts WHERE account_type='chequing' 
       ORDER BY account_balance DESC LIMIT 5
) as b
UNION
SELECT * FROM (
    SELECT account_type, account_balance FROM accounts WHERE account_type='USD' 
       ORDER BY account_balance DESC LIMIT 5
) as c

This gave me back each set's results in the final result set. Otherwise, I would have only gotten the first 5 from the first query and nothing else - not sure if it's just some MySQL funk with my version

phatskat
  • 1,797
  • 1
  • 15
  • 32