31

If I have a USER table like

class | age
--------------
1       20    
3       56
2       11
1       12
2       20

Then I can easily get the youngest user in each class via

select class, min(age)
from   user
group by class;

Similarly, by replacing min with max, I can get the oldest. But how can I get the 10th youngest (or oldest) in each class? By the way, I'm using MySql v.5.0.

Cheers,

Dónal
  • 185,044
  • 174
  • 569
  • 824

7 Answers7

25
SELECT a.class,
(
    SELECT b.age 
    FROM users b 
    WHERE b.class = a.class
    ORDER BY age 
    LIMIT 1,1
) as age
FROM users a
GROUP BY a.class

Would get the 2nd youngest in each class. If you wanted the 10th youngest, you'd do LIMIT 9,1 and if you wanted the 10th oldest, you'd do ORDER BY age DESC.

Paolo Bergantino
  • 480,997
  • 81
  • 517
  • 436
  • 1
    That will give me the 10th oldest overall, not the 10th oldest in each class. – Dónal Jan 20 '09 at 21:13
  • That doesn't seem to work for MySQL v5.5: ERROR 1235 (42000): This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery' – Marcus Dec 11 '13 at 00:17
13

Here N presents Nth record oldest

SELECT *
FROM users k
WHERE N = (SELECT
             COUNT( DISTINCT age)
           FROM users u
           WHERE k.age >= u.age
               AND k.class = u.class
           GROUP BY u.class)

and it gives Nth record youngest

SELECT *
FROM users k
WHERE N = (SELECT
             COUNT(DISTINCT age)
           FROM users u
           WHERE k.age <= u.age
               AND k.class = u.class
           GROUP BY u.class)
Deval Shah
  • 1,094
  • 8
  • 22
5

The only sql independent way (even if you don't have subqueries mysql <5)

 select  u1.class, u1.age, count(*)  from      user u1 join user u2 
 on u1.class = u2.class and u1.age >= u2.age
 group by u1.class, u1.age
 having count(*) = [number]

gets you the [number] oldest per class

 select  u1.class, u1.age, count(*)  from      user u1 join user u2 
 on u1.class = u2.class and u1.age <= u2.age
 group by u1.class, u1.age
 having count(*) = [number]

gets you the [number] youngest per class

If two people have the same age it may not work as both are returned. If you want to only return one of them you will need a unique key and the query is more complicated.

David Raznick
  • 17,907
  • 2
  • 35
  • 27
1

Any answer that joins a table on it self will create a square law...

- a JOIN b ON a.class = b.class AND a.age >= b.age  
- on average the >= condition will be true for half the class  

- 6 people in a class  
->6*6/2 = 18

- 10 people in a class
->10*10/2 = 50

-> very rapid growth

As the table sizes grow the performance will degrade rapidly. If you're keeping things small and they won't grow much, is it an issue? Your call there...

An alternative involves more code, but grows linearly...

  • First, insert all the records intoa new table, with an IDENTITY field, ordered by Class then Age
  • Now, for each class, find the MIN(id)
  • Now, for each class, rinf the record where is = MIN(id) + 8 (for the 9th eldest)

There are a lot of ways of doing the last 2 steps. I personally would use...

SELECT
    [USER_WITH_IDS].id,
    [USER_WITH_IDS].class,
    [USER_WITH_IDS].age
FROM
    [USER_WITH_IDS]
WHERE
    [USER_WITH_IDS].id = (
                          SELECT
                              MIN([min].ID) + 8
                          FROM
                              [USER_WITH_IDS] AS [min]
                          WHERE
                              [min].class = [USER_WITH_IDS].class
                         )

What this gives is...

  • One pass to create the new IDs
  • One pass to get the MIN(id) for each class
  • One pass to get the records you need

  • And depending on how good the optimiser is, using an index (class then id) would allow it to combine the last 2 passes into 1 pass.

2 or 3 passes, no matter how big the table or class size gets. Linear, not square law...

MatBailie
  • 83,401
  • 18
  • 103
  • 137
1

Another way is to place the SQL query inside a for loop, each time adding the new minimum values to a growing tuple, that are filtered out with a WHERE ... NOT IN clause. I'm not sure about the speed of this, but for me it worked real quick for a very large dataset (>180,000 records). Sorry about mixing SQL and Python, I'm a newbie to both. I know there is probably a way to do the loop in SQL, but I don't know it.

        idx = tuple()
        for i in range(n):

            qry = '''SELECT class, min(age)
                     FROM user
                     WHERE age NOT IN {}
                     GROUP BY class'''.format(idx)

            cur.execute(qry)
            data = cur.fetchall()
            d = pd.DataFrame(data)

            idx = idx + tuple(d['min(age)'])
mdeaves
  • 11
  • 1
0

It's pretty easy in SQL Server:

select 
  *
from(
   select 
      *,
      row_number() over(order by age asc) as eldest
   from class order by age asc) a
where a.eldest = 10

Following that pattern, for MySQL, I think you'd want to look at this: http://www.xaprb.com/blog/2006/12/02/how-to-number-rows-in-mysql/

Tom Ritter
  • 99,986
  • 30
  • 138
  • 174
  • The method in the link DOES work but it is a bit of a hack. It is also very quick. For one offs fine, for production no! – David Raznick Jan 20 '09 at 21:46
0
 SELECT 
     userid,  
     class,  
     age,  
     (    SELECT COUNT(1) FROM user  
          WHERE class = c.class  AND age > u.age  
     ) AS oldercount  
FROM user AS u  
WHERE oldercount = 9
GROUP BY class

or

 SELECT userid,  
         class,  
         age  
  FROM user AS u    
  WHERE (SELECT COUNT(1) FROM class WHERE class = c.class AND age > u.age) = 9  
GROUP BY class
godot
  • 3,422
  • 6
  • 25
  • 42
dkretz
  • 37,399
  • 13
  • 80
  • 138