5

For instance I have the following table:

id group data
1 1 aaa
2 1 aaa
3 2 aaa
4 2 aaa
5 2 aaa
6 3 aaa
7 3 aaa
8 3 aaa

What is the best way to select the first two records of each group by a "SELECT" command? If there is no good way to do so, what routine do you suggest?(in PHP)

(model outcome)

1 1 aaa
2 1 aaa
3 2 aaa
4 2 aaa
6 3 aaa
7 3 aaa

I knew that cross-joining by a.id >= b.id in a sub-query can be working but I am looking for a more scalable solution that can be applied on a table with millions of records. Thanks

Abby Chau Yu Hoi
  • 1,378
  • 3
  • 15
  • 37

3 Answers3

8
select a.*
from Tablename a
where 
(
   select count(*) 
   from Tablename as b
   where a.group = b.group and a.id >= b.id
) <= 2
John Woo
  • 258,903
  • 69
  • 498
  • 492
  • That is one solution suggested from my friend by using `a.id <= b.id` for cross joining, but I foresaw a big performance issue that the indeterminate section of `count(*)` is growing in the index power when the table is growing large. Am I right? I know my expression may be a bit confusing though. – Abby Chau Yu Hoi Apr 09 '13 at 06:39
  • it is working good, but it not giving the first two records, it is giving last two records, so better check it.. – Deepanshu Goyal Apr 09 '13 at 06:41
  • 1
    @Deepanshu it can simply be change into `a.id >= b.id` here http://www.sqlfiddle.com/#!2/1b596/3 – John Woo Apr 09 '13 at 06:42
  • thats great.. actually I just wanted to point that out to Abby – Deepanshu Goyal Apr 09 '13 at 06:46
  • @JW : Your query don't give what OP want. Check here http://sqlfiddle.com/#!2/1b596/4 OP want id as 3,4 and not 4,5 – Fahim Parkar Apr 09 '13 at 06:51
  • @FahimParkar read what i said on Deepanshu. it can simply be change into `a.id >= b.id` here http://www.sqlfiddle.com/#!2/1b596/3 – John Woo Apr 09 '13 at 06:52
3

I like this trick, that makes use of GROUP_CONCAT aggregate function, and FIND_IN_SET:

SELECT
  Tablename.*
FROM
  Tablename INNER JOIN (
    SELECT `group`, GROUP_CONCAT(id ORDER BY id) ids
    FROM Tablename
    GROUP BY `group`) grp ON
  Tablename.`group` = grp.`group` AND
  FIND_IN_SET(Tablename.id, ids)<=2
ORDER BY
  Tablename.`group`, Tablename.id

Performances can't be too good, as it can't make use of an index.

Or you can also use this:

SELECT t1.id, t1.`group`, t1.data
from
  Tablename t1 INNER JOIN Tablename t2
  ON t1.`group` = t2.`group` AND t1.id>=t2.id
GROUP BY
  t1.id, t1.`group`, t1.data
HAVING
  COUNT(*)<=2
ORDER BY
  t1.`group`, t1.id, t1.data
fthiella
  • 48,073
  • 15
  • 90
  • 106
-1

You select, filter and order your query like normal and then

for MSSQL

SELECT TOP 2 * FROM foo; 

From what I can remember Sybase, Oracle and possible a few other RDBMS's uses this syntax to.

for MySQL you do

SELECT * FROM foo LIMIT 2; 

Update:

Yes, I misread your question, sorry. Seems like a few of us did :)

Then it depends on whether you RDBMS supports HAVING or not etc. You could construct a query using HAVING or using IN and a subquery in the IN clause.

For MSSQL I think you could do something like (code not tested)

SELECT id, data
    FROM (
        SELECT id, data, Rank() over (Partition BY group ORDER BY id DESC ) AS Rank
        FROM table
        ) rs WHERE Rank <= 2)

But since this depends on your RDBMS I ask you to look at similar questions and see which one works best for your case since MSSQL supports some things MySQL doesn't and the other way around.

Here are some examples

Select top 10 records for each category

How to select the last two records for each topic_id in MySQL

Community
  • 1
  • 1
inquam
  • 12,664
  • 15
  • 61
  • 101