I have following table, which apart from other attributes contains:
- Customer ID - unique identifier
- Value
- CreatedDate - when the record has been created (based on ETL)
- UpdatedDate - until when the record has been valid
Since there are other attributes apart from the [Value], which are being tracked for historical values, there might be cases, where there are multiple rows with the same [Value] for the same customer, but different timestamps in [CreatedDate] / [UpdatedDate]. Thus, the data may look like:
Customer ID | Value | CreatedDate | UpdatedDate |
---|---|---|---|
1 | 111 | 04/08/2021 15:00 | 04/08/2021 17:00 |
1 | 111 | 01/08/2021 09:00 | 04/08/2021 15:00 |
1 | 222 | 20/07/2021 01:30 | 01/08/2021 09:00 |
1 | 222 | 01/06/2021 08:00 | 20/07/2021 01:30 |
1 | 111 | 01/04/2021 07:15 | 01/06/2021 08:00 |
2 | 333 | 03/08/2021 04:30 | 04/08/2021 17:00 |
2 | 444 | 23/07/2021 01:20 | 03/08/2021 04:30 |
2 | 444 | 01/04/2021 13:50 | 23/07/2021 01:20 |
I would like to keep the unique [Values] in correct sequence, hence keep the [Value] for the earliest [CreatedDate], however, if Customer had originally Value1, then changed it to Value2 and finally, changed back to Value1. I would like to keep these 2 changes as well. Hence the ideal output should look like:
Customer ID | Value | CreatedDate | UpdatedDate |
---|---|---|---|
1 | 111 | 01/08/2021 09:00 | 04/08/2021 17:00 |
1 | 222 | 01/06/2021 08:00 | 01/08/2021 09:00 |
1 | 111 | 01/04/2021 07:15 | 01/06/2021 08:00 |
2 | 333 | 03/08/2021 04:30 | 04/08/2021 17:00 |
2 | 444 | 01/04/2021 13:50 | 03/08/2021 04:30 |
Based on CreatedDate / UpdatedDate identify, the chronological sequence of changes and identify the earliest CreatedDate and latest UpdatedDate. However, if particular value appeared multiple times, but has been interspersed by different value, I would like to keep it too.
I've tried the below approach and it works fine however it does not work for the scenario above and the output look like:
SELECT [Customer ID]
,Value
,MIN(CreatedDate) as CreatedDate
,MAX(UpdatedDate) as UpdatedDate
FROM #History
GROUP BY ID, Value
Customer ID | Value | CreatedDate | UpdatedDate |
---|---|---|---|
1 | 111 | 01/04/2021 07:15 | 04/08/2021 17:00 |
1 | 222 | 01/06/2021 08:00 | 01/08/2021 09:00 |
2 | 333 | 03/08/2021 04:30 | 04/08/2021 17:00 |
2 | 444 | 01/04/2021 13:50 | 03/08/2021 04:30 |
Any ideas, please? I've tried using LAG and LEAD as well, but was not able to make it work either.