MySQL: select rows with GROUP BY, ordered by another column

Question

I am trying to select rows from a table per group while ignoring the first row obtained by sorting each group by date. The sorting should be done on a date field, so that the newest entry in each group is ignored and the older ones are returned.

The table looks like

+----+------------+-------------+-----------+
| id | updated_on | group_name  | list_name |
+----+------------+-------------+-----------+
| 1  | 2013-04-03 | g1          | l1        |
| 2  | 2013-03-21 | g2          | l1        |
| 3  | 2013-02-26 | g2          | l1        |
| 4  | 2013-02-21 | g1          | l1        |
| 5  | 2013-02-20 | g1          | l1        |
| 6  | 2013-01-09 | g2          | l2        |
| 7  | 2013-01-10 | g2          | l2        |
| 8  | 2012-12-11 | g1          | l1        |
+----+------------+-------------+-----------+

http://www.sqlfiddle.com/#!2/cec99/1
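In case the fiddle link stops working, the sample data can be recreated with something like the script below. This is only a sketch: the table name `data` and the column names come from the queries in the answers, while the column types and lengths are assumptions.

```sql
-- Assumed schema: types and lengths are guesses, only the names come from the thread.
CREATE TABLE data (
  id         INT PRIMARY KEY,
  updated_on DATE NOT NULL,
  group_name VARCHAR(10) NOT NULL,
  list_name  VARCHAR(10) NOT NULL
);

-- The eight rows shown in the table above.
INSERT INTO data (id, updated_on, group_name, list_name) VALUES
  (1, '2013-04-03', 'g1', 'l1'),
  (2, '2013-03-21', 'g2', 'l1'),
  (3, '2013-02-26', 'g2', 'l1'),
  (4, '2013-02-21', 'g1', 'l1'),
  (5, '2013-02-20', 'g1', 'l1'),
  (6, '2013-01-09', 'g2', 'l2'),
  (7, '2013-01-10', 'g2', 'l2'),
  (8, '2012-12-11', 'g1', 'l1');
```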

So, basically, I want to return only ids (3,4,5,6,8), because those are the older entries for each combination of group_name and list_name. That is, the latest entry in each (group_name, list_name) group should be ignored and the older ones returned.

I have not been able to write the SQL for this. I know ORDER BY will not work with GROUP BY here. Please help me figure out a solution.

Thanks

And also, is there a way to do this without using subqueries?

Please, take the time to put together a coherent request for help. I have absolutely no idea what you want to accomplish. — crush, May 30 '13 at 12:52
Can you please explain it further? What do you mean by "based on group_name and list_name"? Do you want to fetch a single (old) row from each group? – Vivek Sadh, May 30 '13 at 12:53
@Vivek, yes, but I need all the old entries made for group_name, ignoring id 1 for group_name g1, as that's the newest entry made – user2436575, May 30 '13 at 12:57

score 2 · Accepted Answer · edited May 23 '17 at 11:57

Something like the following returns only the rows whose date is earlier than the newest date in their group:

select a.ID, a.updated_on, a.group_name, list_name
from data a 
where
a.updated_on < 
(
select max(updated_on)
from data 
group by group_name having group_name = a.group_name
);

SQL Fiddle: http://www.sqlfiddle.com/#!2/00d43/10

Update (based on your reqs)

select a.ID, a.updated_on, a.group_name, list_name
from data a 
where
a.updated_on < 
(
select max(updated_on)
from data 
group by group_name, list_name having group_name = a.group_name
  and list_name = a.list_name
);

See: http://www.sqlfiddle.com/#!2/cec99/3
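For readers on MySQL 8.0 or later (an assumption; the thread dates from 2013 and that version was not available then), the same "everything except the newest row per group" filter can be sketched with a window function. `RANK()` is used rather than `ROW_NUMBER()` so that rows tied for the newest date are all excluded, which matches the `updated_on < max(updated_on)` comparison above. The alias names `ranked` and `rnk` are made up for this example.

```sql
-- Sketch only: requires window functions (MySQL 8.0+).
-- Assumes the `data` table from the question.
SELECT id, updated_on, group_name, list_name
FROM (
  SELECT d.*,
         -- rnk = 1 marks the newest date within each (group_name, list_name) group
         RANK() OVER (PARTITION BY group_name, list_name
                      ORDER BY updated_on DESC) AS rnk
  FROM data d
) ranked
WHERE rnk > 1;
```

On the sample data in the question this should keep ids 3, 4, 5, 6 and 8, since ids 1, 2 and 7 are the newest rows of their groups.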

Update (using a simple subquery instead of a correlated subquery)

I decided the correlated subquery is too slow, based on: Subqueries vs joins

So I changed to joining against an aliased derived table built from a nested query.

select a.ID, a.updated_on, a.group_name, a.list_name
from data a,
(
select group_name, list_name , max(updated_on) as MAX_DATE
from data 
group by group_name, list_name 
) as MAXDATE   
where
a.list_name = MAXDATE.list_name AND
a.group_name = MAXDATE.group_name AND
a.updated_on < MAXDATE.MAX_DATE
;

SQL Fiddle: http://www.sqlfiddle.com/#!2/5df64/8
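Since the table is said to hold about 10 million rows, it may be worth testing a composite index on the grouping columns plus the date. Whether it helps depends on the data and the MySQL version, so the sketch below only shows how one might check. The index name is hypothetical, and the query is the derived-table version above rewritten with explicit JOIN syntax.

```sql
-- Assumption: `data` is the table from the question; the index name is made up.
CREATE INDEX idx_data_group_list_updated
    ON data (group_name, list_name, updated_on);

-- Compare the plan before and after adding the index.
EXPLAIN
SELECT a.id, a.updated_on, a.group_name, a.list_name
FROM data a
JOIN (
    SELECT group_name, list_name, MAX(updated_on) AS max_date
    FROM data
    GROUP BY group_name, list_name
) AS maxdate
  ON  a.group_name = maxdate.group_name
  AND a.list_name  = maxdate.list_name
WHERE a.updated_on < maxdate.max_date;
```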

edited May 23 '17 at 11:57 by Community
answered May 30 '13 at 13:05 by Menelaos

But I need all the entries of the group except the newest one – user2436575 May 30 '13 at 13:06
Oh ok...just a small edit is needed to play with the max instead :) There you go... smaller than the max aka newest entry... – Menelaos May 30 '13 at 13:07
Updated the sqlfiddle with new data. Thanks for your answer, but it's not what I expected. The return set should be (3,4,5,6,8), as id 7 is the newest in the group g2 & l2 – user2436575 May 30 '13 at 13:14
@user2436575 give me your link for the SQL Fiddle... I thought you wanted to ignore the newest entry PER group_name... you want to ignore the newest entry overall? – Menelaos May 30 '13 at 13:18
http://www.sqlfiddle.com/#!2/cec99/1 forgot to post link. I want to ignore the newest entry made for both group_name and list_name. should return only (3,4,5,6,8) in this case – user2436575 May 30 '13 at 13:24
See: http://www.sqlfiddle.com/#!2/cec99/3 . Please update your question to reflect your reqs better. – Menelaos May 30 '13 at 13:30
OK, thanks. Is there a way to do this without using subqueries? This table has 10 million records – user2436575 May 30 '13 at 13:33
@user2436575, could you run both queries on your data and post the results? – Menelaos May 31 '13 at 11:03

score 0 · Answer 2 · David · 2013-05-31T06:58:03.490

You could try using the following query (yes, it has a nested join, but maybe it helps).

SELECT ID FROM 
(select d1.ID FROM data d1 LEFT JOIN 
data d2 ON (d1.group_name = d2.group_name AND d1.list_name=d2.list_name AND 
d1.updated_on > d2.updated_on) WHERE d2.ID IS NULL) data_tmp;

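CORRECTION (the query above returns only the oldest row of each group; the one below returns every row that has a newer row in its group):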

SELECT DISTINCT(ID) FROM 
(select d1.* FROM data d1 LEFT JOIN 
data d2 ON (d1.group_name = d2.group_name AND d1.list_name=d2.list_name AND 
d1.updated_on < d2.updated_on) WHERE d2.ID IS NOT NULL) date_tmp;

edited May 31 '13 at 06:58
answered May 30 '13 at 13:23 by David

Did you check what results it returns? According to: http://www.sqlfiddle.com/#!2/cec99/10 : {3,6,8} – Menelaos May 30 '13 at 15:25
I ran your query over 10,000 elements. There is a performance issue related to the unary join: 10,000 x 10,000 = 100,000,000 rows before the where filter :( – Menelaos May 31 '13 at 11:03

score 0 · Answer 3 · answered Jun 04 '13 at 21:12

SELECT DISTINCT y.id 
  FROM data x 
  JOIN data y 
    ON y.group_name = x.group_name 
   AND y.list_name = x.list_name 
   AND y.updated_on < x.updated_on;
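
The self-join above keeps a row y whenever at least one row x in the same (group_name, list_name) group is newer, and DISTINCT collapses the duplicates that arise when several newer rows exist. As a rough sketch of the same condition, here is an EXISTS form that needs no DISTINCT. Note that it uses a correlated subquery, which the asker wanted to avoid, so it is shown only to make the logic explicit; it assumes the `data` table from the question.

```sql
-- Sketch: "y has at least one newer row in its group" written as EXISTS.
SELECT y.id
FROM data y
WHERE EXISTS (
  SELECT 1
  FROM data x
  WHERE x.group_name = y.group_name
    AND x.list_name  = y.list_name
    AND x.updated_on > y.updated_on   -- x is newer than y
);
```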

Strawberry
