1
eid start            end          status
6158963 11/27/2016  1/7/2017    FT
6158963 1/8/2017    5/9/2017    FT
6158963 5/10/2017   5/20/2017   LA
6158963 5/21/2017   7/31/2017   LA
6158963 8/1/2017    9/9/2017    FT
6158963 9/10/2017   10/21/2017  FT
6158963 10/22/2017  12/2/2017   FT
6158963 12/3/2017   12/16/2017  FT
6158963 12/17/2017  12/30/2017  FT
6158963 12/31/2017  3/3/2018    FT
6158963 3/4/2018    4/8/2018    FT

.

Above is the sample data which I am trying to convert. I am trying to convert the data based on the status.

The data should look like below :

eid start              end        status
6158963 11/27/2016  5/9/2017    FT
6158963 5/10/2017   7/31/2017   LA
6158963 8/1/2017    4/8/2018    FT

I am looking to group with the status field in the given order. but when I use max on the begin and end and group it by status. It just groups all the status into one.

eid MIN             MAX      status
6158963 11/27/2016  4/8/2018    FT
6158963 5/10/2017   7/31/2017   LA
user4321
  • 605
  • 3
  • 14
  • 37
green
  • 13
  • 2

1 Answers1

2

You can use row_number() & do aggregation based on differences of sequence generated via row_number() :

select eid, min(startdate) as startdate, max(endate) as endate, status
from (select t.*,
             row_number() over (partition by eid order by startdate) as seq1,
             row_number() over (partition by eid, status order by startdate) as seq2
      from table t
     ) t
group by eid, status, (seq1 - seq2)
order by startdate;

Assuming the startdate & enddates are in reasonable format.

Yogesh Sharma
  • 49,870
  • 5
  • 26
  • 52
  • Convert dates to the sortable format first like this: `from_unixtime(unix_timestamp(start,'MM/dd/yyyy'),'yyyy-MM-dd') as start` – leftjoin Oct 12 '18 at 14:29