I'm attempting to create a clear linear status for a project. I've been able to take the historical project date details and clear out any unnecessary overlap in status dates but am at the point where I now need to prioritize and delineate the general status of a project. My data is structured as below thanks to this fiddle:
Project # | hist_status | status_start_date | status_end_date |
---|---|---|---|
A74308 | In Progress | 11/17/2020 | 6/8/2021 |
A74308 | Pause | 4/2/2021 | 6/21/2021 |
A74308 | Completed | 6/8/2021 | 6/8/2021 |
A74308 | In Progress | 6/21/2021 | 9/20/2021 |
A74308 | Pause | 9/20/2021 | 1/30/2022 |
A74308 | In Progress | 1/30/2022 | 2/8/2023 |
A74308 | Completed | 4/5/2022 | 4/5/2022 |
A74308 | Completed | 8/16/2022 | 8/16/2022 |
A74308 | Pause | 8/16/2022 | 2/8/2023 |
A project will have multiple workstreams but as long as one of them is "In Progress" then the project is "In Progress." Similarly, if one workstream is "Paused" and the others are "Completed" then the Project is "Paused." If all workstreams are "Completed" then the project is "Completed." Just to nail it down, if one workstream is "In Progress" and another is "Paused" with the final being "Completed" then the project is "In Progress."
So, the result I'm trying to get is below:
Project # | hist_status | status_start_date | status_end_date |
---|---|---|---|
A74308 | In Progress | 11/17/2020 | 6/8/2021 |
A74308 | Pause | 6/9/2021 | 6/20/2021 |
A74308 | In Progress | 6/21/2021 | 9/20/2021 |
A74308 | Pause | 9/21/2021 | 1/29/2022 |
A74308 | In Progress | 1/30/2022 | 2/8/2023 |
And here is a visual representation:
This post seems to accomplish what I'm trying to do but I haven't been able to replicate the results in Postgres. I've attempted using subqueries and essentially a DATEDIFF
to delineate, for instance, where the first "Pause" should begin, but can't nail down a solution.