How to select first row for each set of rows with common value

Question

As an example I have a a table like this:

week_dt     cust    y_w     w   y       y_fp
2011-01-29  ABC     201122  6   2011    201106
2011-02-05  ABC     201123  6   2011    201106
2011-02-12  ABC     201124  6   2011    201106
2011-02-19  ABC     201125  7   2011    201107
2011-02-26  ABC     201126  7   2011    201107
2011-03-05  ABC     201127  7   2011    201107
2011-03-12  ABC     201128  7   2011    201107
2011-03-19  ABC     201129  8   2011    201108
2011-03-26  ABC     201130  8   2011    201108
2011-04-02  ABC     201131  8   2011    201108
2011-04-09  ABC     201132  8   2011    201108
2011-04-16  ABC     201133  9   2011    201109
2011-04-23  ABC     201134  9   2011    201109

The last column contains year/fiscal period info. I would like to select one row for each y_fp element with week_dt and y_w being the first date for that year/fiscal period. Thus selecting from the above would yield:

week_dt     cust    y_w     w   y       y_fp
2011-01-29  ABC     201122  6   2011    201106
2011-02-19  ABC     201125  7   2011    201107
2011-03-19  ABC     201129  8   2011    201108
2011-04-16  ABC     201133  9   2011    201109

I've tried following this example using TOP(1) instead of LIMIT but I got the entire table returned to me.

This was my query:

SELECT * FROM ( 
    SELECT
    md.week_dt, 
    md.cust, 
    md.y_w, 
    md.y, 
    md.w, 
    md.y_fp, 
    FROM MASTER_DATES md
) t1
Where 
    t1.week_dt IN (
    SELECT TOP(1) t2.week_dt
    FROM MASTER_DATES t2
    WHERE t2.week_dt = t1.week_dt AND t2.cust = t1.cust
    ORDER BY t2.week_dt
)
ORDER BY t1.week_dt

Note that MASTER_DATES contains a composite primary key from columns week_dt and cust.

score 1 · Accepted Answer · answered Mar 05 '19 at 19:57

Try to use row_number and a partition by to divide it by md.y_fp

WITH C AS(
 SELECT
    md.week_dt, 
    md.cust, 
    md.y_w, 
    md.y, 
    md.w, 
    md.y_fp,
    row_number() over(partition by md.y_fp order by md.week_dt asc) as rn
FROM MASTER_DATES md
)
SELECT * 
FROM C 
WHERE RN = 1

score 1 · Answer 2 · answered Mar 05 '19 at 20:02

One method is row_number():

select md.*
from (select md.*,
             row_number() over (partition by md.cust, md.y_fp order by md.week_date desc) as seqnum
      from master_dates md
     ) md
where seqnum = 1;

In some cases, though, a correlated subquery has better performance:

select md.*
from master_dates md
where md.week_date = (select min(md2.week_date)
                      from master_dates md2
                      where md2.cust = md.cust and md2.y_fp = md.y_fp
                     );

For best performance, you want an index on (cust, y_fp, week_date).

These answer assume that you want the information per customer, although that is not explicitly stated in the question.

forpas · Answer 3 · 2019-03-05T20:29:47.220

1

Your sample data and expected results indicate that a simple group by is enough:

select
  min(week_dt) week_dt, 
  cust,
  min(y_w) y_w,
  w,
  y,
  y_fp
from MASTER_DATES 
group by cust, y, w, y_fp

edited Mar 05 '19 at 20:29

answered Mar 05 '19 at 20:17

forpas

160,666
10
38
76

Sorry the duplicate y column was an accident. I'll edit that. And yes this seems to have worked as well and is very simple. – ahh_real_numbers Mar 05 '19 at 20:27
You need grouping and the minimum values of 2 columns. That's it. – forpas Mar 05 '19 at 20:28

How to select first row for each set of rows with common value

3 Answers3