Return 1 row per person

Question

I want to return 1 row per person based on the latest "APP_START_DT".

I want to make sure it looks at MAX(APP_START_DT) and then follows all dates on that row for that person and disregard the other rows and dates on it.

This is what I have tried:
select distinct ap1.ID,
MAX(ap1.APP_START_DT) over (PARTITION by ap1.ID) APP_START_DT,
ap1.APP_SUBMIT_DT
from table1 ap1
where ap1.ID in ('444','555');

Getting 2 rows for both of these ids.



My Table:
ID   NAME    Acct_CR_DT    App_Ap    APP_START_DT    APP_SUBMIT_DT
444  ABC1    9/5/2018      My univ   9/5/2018        9/14/2018
444  ABC1    9/5/2018      {null}    {null}          {null}
555  ABC2    1/9/2019      {null}    {null}          {null}
555  ABC2    1/9/2019      Univ4     1/9/2019        1/9/2019




Desired Results:   
ID     NAME   Acct_CR_DT  App_Ap   APP_START_DT  APP_SUBMIT_DT App_stat
444  ABC1     9/5/2018    My univ  9/5/2018      9/14/2018     Submt
555  ABC2     1/9/2019    Univ4    1/9/2019      1/9/2019      Submt

Possible duplicate of [Select first row in each GROUP BY group?](https://stackoverflow.com/questions/3800551/select-first-row-in-each-group-by-group) — Clockwork-Muse, Feb 05 '19 at 01:00
it's similar but not exact. I already applied that solution before and mine is a little more complicated. — user2378934, Feb 05 '19 at 01:06
....how? The data you show fits that solution exactly. Can you add sample starting data and results that would invalidate that as an answer? — Clockwork-Muse, Feb 05 '19 at 03:53

score 0 · Answer 1 · answered Feb 05 '19 at 01:04

0

Grouping by ap1.ID should allow you to get one row per ID while getting your MAX APP_START_DT

SELECT distinct ap1.ID, MAX(ap1.APP_START_DT)
FROM table1 ap1
WHERE ap1.ID in ('444','555')
GROUP BY ap1.ID ;

answered Feb 05 '19 at 01:04

Javier

39
9

That will only work in a situation if I'm trying to return "APP_START_DT" but if I want to return other dates which is different in those ids it creates 2 rows , 1 for null and other that has all the dates. I want the query to ignore the other row and only return one that has MAX(APP_START_DT) plus other dates if any vs the NULL. – user2378934 Feb 05 '19 at 01:07
Besides, DISTINCT is irrelevant when there's aggregation in a query so - remove it. – Littlefoot Feb 05 '19 at 07:15

score 0 · Answer 2 · answered Feb 05 '19 at 04:10

You can apply null handling function to get desired results.

select distinct ap1.ID,
MAX(ap1.APP_START_DT) over (PARTITION by ap1.ID) APP_START_DT,
ap1.APP_SUBMIT_DT
from (select COALESCE(APP_START_DT, 'lowest_date') as APP_START_DT, ID, APP_SUBMIT_ID from table1 where ID in ('444','555')) ap1;

score 0 · Answer 3 · answered Feb 05 '19 at 11:24

What about FIRST_VALUE / LAST_VALUE (and partitioning windows according to user ID ?):

SELECT DISTINCT
    ID, NAME,
    LAST_VALUE(Acct_CR_DT)    OVER(PARTITION BY ID ORDER BY APP_START_DT NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING),
    LAST_VALUE(App_Ap)        OVER(PARTITION BY ID ORDER BY APP_START_DT NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING),
    LAST_VALUE(APP_START_DT)  OVER(PARTITION BY ID ORDER BY APP_START_DT NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING),
    LAST_VALUE(APP_SUBMIT_DT) OVER(PARTITION BY ID ORDER BY APP_START_DT NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
  FROM table1 ap1

For each row, we find the "Last value" of each column for each user ID/Name pair. We need DISTINCT to remove duplicit entries, as this "last value" is computed for each row (not only the last in each group).

If should work, if You can guarantee, there will be only one row with APP_START_DT = MAX(APP_START_DT)OVER(PARTITION BY ID), or if more such rows are present, then they are at least consistent for selected columns. If You cannot guarantee it, I would consider adding more columns to ORDER BY clauses (which row should really be the first).

score 0 · Answer 4 · answered Feb 11 '19 at 05:01

select * 
FROM table1
where (id, app_start_dt) IN (
    SELECT ap1.ID, MAX(ap1.APP_START_DT)
    FROM table1 ap1
    WHERE ap1.ID in ('444','555')
    GROUP BY ap1.ID 
    )   ;

Javier was close. I believe it works if you use his query as a sub select. Would be nice to know what the data looks like and if duplicates exist for any given date as that would break this approach.

score 0 · Answer 5 · answered Feb 11 '19 at 06:22

SELECT *
FROM
(
    SELECT *
    FROM mytable
    WHERE id IN(444, 555)
) T1
INNER JOIN
(
    SELECT id, 
           MAX(APP_START_DT) APP_START_DT
    FROM mytable
    WHERE id IN(444, 555)
    GROUP BY ID
) T2 ON T1.id = T2.ID
        AND T1.APP_START_DT = T2.APP_START_DT;

Return 1 row per person

5 Answers5