Group by minimum value in one field while selecting distinct rows

Question

Here's what I'm trying to do. Let's say I have this table t:

key_id | id | record_date | other_cols
1      | 18 | 2011-04-03  | x
2      | 18 | 2012-05-19  | y
3      | 18 | 2012-08-09  | z
4      | 19 | 2009-06-01  | a
5      | 19 | 2011-04-03  | b
6      | 19 | 2011-10-25  | c
7      | 19 | 2012-08-09  | d

For each id, I want to select the row containing the minimum record_date. So I'd get:

key_id | id | record_date | other_cols
1      | 18 | 2011-04-03  | x
4      | 19 | 2009-06-01  | a

The only solutions I've seen to this problem assume that all record_date entries are distinct, but that is not this case in my data. Using a subquery and an inner join with two conditions would give me duplicate rows for some ids, which I don't want:

key_id | id | record_date | other_cols
1      | 18 | 2011-04-03  | x
5      | 19 | 2011-04-03  | b
4      | 19 | 2009-06-01  | a

If there is a `min_by` function, consider it for this purpose. It saved me from writing something more complicated. — Asclepius, Aug 04 '21 at 19:32

score 134 · Accepted Answer · edited Mar 03 '20 at 16:51

134

How about something like:

SELECT mt.*     
FROM MyTable mt INNER JOIN
    (
        SELECT id, MIN(record_date) AS MinDate
        FROM MyTable
        GROUP BY id
    ) t ON mt.id = t.id AND mt.record_date = t.MinDate

This gets the minimum date per ID, and then gets the values based on those values. The only time you would have duplicates is if there are duplicate minimum record_dates for the same ID.

edited Mar 03 '20 at 16:51

Alberto Moro

1,014
11
22

answered Sep 10 '13 at 17:25

Adriaan Stander

162,879
31
289
284

1

Ah, initially I was using an expression to output a date which was causing the 'and' condition on the inner join to not work properly. Changed it to an actual column and it works now (and had to modify some other things as a result), thanks! – user2765924 Sep 10 '13 at 17:41
9

this would not work when two records of same id and date are present, will get you multiple rows right? – rajat Feb 24 '16 at 10:32

score 17 · Answer 2 · edited Mar 08 '17 at 17:49

17

I could get to your expected result just by doing this in mysql:

 SELECT id, min(record_date), other_cols 
  FROM mytable
  GROUP BY id

Does this work for you?

edited Mar 08 '17 at 17:49

Gajus

69,002
70
275
438

answered Sep 10 '13 at 17:33

Math

3,334
4
36
51

4

For whatever reason, this appears to work in the contrived example (http://sqlfiddle.com/#!2/f8469/6/0), but in practice I get "Column 'database.table.col_name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." I was able to get it working with astander's answer anyway, thanks. – user2765924 Sep 10 '13 at 17:45
1

Yeah I'm running into the same issue, I'd like a simple answer like this one on SQL Server – Pedro Braz Oct 23 '15 at 13:54
Newer MySQL versions will raise an error here. (Unless in compatibility mode. See https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_only_full_group_by.) – jarlh Feb 07 '22 at 09:52
The reason this works or worked in older versions of MySQL is that MySQL would just give you arbitrarily the first values it encounters (maybe based on insertion order or whatever) for the non-grouped-by columns. This is really arbitrary and this is why other DBMSes don't do this, and apparently MySQL also chose to do the right thing. – wearego Feb 14 '22 at 09:04

DIMI · Answer 3 · 2016-02-24T23:52:37.030

To get the cheapest product in each category, you use the MIN() function in a correlated subquery as follows:

    SELECT categoryid,
       productid,
       productName,
       unitprice 
    FROM products a WHERE unitprice = (
                SELECT MIN(unitprice)
                FROM products b
                WHERE b.categoryid = a.categoryid)

The outer query scans all rows in the products table and returns the products that have unit prices match with the lowest price in each category returned by the correlated subquery.

score 4 · Answer 4 · answered Jun 06 '18 at 14:26

I would like to add to some of the other answers here, if you don't need the first item but say the second number for example you can use rownumber in a subquery and base your result set off of that.

SELECT * FROM
(
    SELECT
        ROW_NUM() OVER (PARTITION BY Id ORDER BY record_date, other_cols) as rownum,
        *
    FROM products P
) INNER
WHERE rownum = 2

This also allows you to order off multiple columns in the subquery which may help if two record_dates have identical values. You can also partition off of multiple columns if needed by delimiting them with a comma

score 3 · Answer 5 · edited Nov 08 '16 at 18:35

3

This does it simply:

select t2.id,t2.record_date,t2.other_cols 
from (select ROW_NUMBER() over(partition by id order by record_date)as rownum,id,record_date,other_cols from MyTable)t2 
where t2.rownum = 1

edited Nov 08 '16 at 18:35

EJoshuaS - Stand with Ukraine

11,977
56
49
78

answered Nov 08 '16 at 17:21

jeff

31
2

epox · Answer 6 · 2020-06-22T11:17:32.923

If `record_date` has no duplicates within a group:

think of it as of filtering. Simpliy get (WHERE) one (MIN(record_date)) row from the current group:

SELECT * FROM t t1 WHERE record_date = (
                                 select MIN(record_date)
                                 from t t2 where t2.group_id = t1.group_id)

If there could be 2+ min `record_date` within a group:

filter out non-min rows (see above)

then (AND) pick only one from the 2+ min record_date rows, within the given group_id. E.g. pick the one with the min unique key:

                AND key_id = (select MIN(key_id)
                              from t t3 where t3.record_date = t1.record_date
                                          and t3.group_id    = t1.group_id)

so

key_id | group_id | record_date | other_cols
1      | 18       | 2011-04-03  | x
4      | 19       | 2009-06-01  | a
8      | 19       | 2009-06-01  | e

will select key_ids: #1 and #4

Roberto Savulescu · Answer 7 · 2021-08-03T07:08:26.063

1

SELECT p.* FROM tbl p
INNER JOIN(
  SELECT t.id, MIN(record_date) AS MinDate
  FROM tbl t
  GROUP BY t.id
) t ON p.id = t.id AND p.record_date = t.MinDate
GROUP BY p.id

This code eliminates duplicate record_date in case there are same ids with same record_date.
If you want duplicates, remove the last line GROUP BY p.id.

edited Aug 03 '21 at 07:08

answered Aug 03 '21 at 07:02

Roberto Savulescu

41
5

score 0 · Answer 8 · answered May 05 '16 at 21:03

This a old question, but this can useful for someone In my case i can't using a sub query because i have a big query and i need using min() on my result, if i use sub query the db need reexecute my big query. i'm using Mysql

select t.* 
    from (select m.*, @g := 0
        from MyTable m --here i have a big query
        order by id, record_date) t
    where (1 = case when @g = 0 or @g <> id then 1 else  0 end )
          and (@g := id) IS NOT NULL

Basically I ordered the result and then put a variable in order to get only the first record in each group.

score 0 · Answer 9 · answered Apr 30 '18 at 23:26

0

The below query takes the first date for each work order (in a table of showing all status changes):

SELECT
    WORKORDERNUM,
    MIN(DATE)
FROM
    WORKORDERS
WHERE
    DATE >= to_date('2015-01-01','YYYY-MM-DD')
GROUP BY
    WORKORDERNUM

answered Apr 30 '18 at 23:26

SRVFan

334
1
6
16

score 0 · Answer 10 · edited Sep 09 '20 at 01:50

0

select 
    department, 
    min_salary, 
    (select s1.last_name from staff s1 where s1.salary=s3.min_salary ) lastname 
from 
    (select department, min (salary) min_salary from staff s2 group by s2.department) s3

edited Sep 09 '20 at 01:50

Dijkgraaf

11,049
17
42
54

answered Sep 09 '20 at 00:33

user2867940

1

3

Welcome to Stack Overflow. Code-only answers are discouraged on Stack Overflow because they don't explain how it solves the problem. Please edit your answer to explain what this code does and how it improves of the existing answers this question already has, so that it is useful to other users with similar issues. – FluffyKitten Sep 09 '20 at 02:38

Group by minimum value in one field while selecting distinct rows

10 Answers10

If `record_date` has no duplicates within a group:

If there could be 2+ min `record_date` within a group:

Linked

Related

Group by minimum value in one field while selecting distinct rows

10 Answers10

If record_date has no duplicates within a group:

If there could be 2+ min record_date within a group:

Linked

Related

If `record_date` has no duplicates within a group:

If there could be 2+ min `record_date` within a group: