1

I'm attempting to create a clear linear status for a project. I've been able to take the historical project date details and clear out any unnecessary overlap in status dates but am at the point where I now need to prioritize and delineate the general status of a project. My data is structured as below thanks to this fiddle:

Project # hist_status status_start_date status_end_date
A74308 In Progress 11/17/2020 6/8/2021
A74308 Pause 4/2/2021 6/21/2021
A74308 Completed 6/8/2021 6/8/2021
A74308 In Progress 6/21/2021 9/20/2021
A74308 Pause 9/20/2021 1/30/2022
A74308 In Progress 1/30/2022 2/8/2023
A74308 Completed 4/5/2022 4/5/2022
A74308 Completed 8/16/2022 8/16/2022
A74308 Pause 8/16/2022 2/8/2023

A project will have multiple workstreams but as long as one of them is "In Progress" then the project is "In Progress." Similarly, if one workstream is "Paused" and the others are "Completed" then the Project is "Paused." If all workstreams are "Completed" then the project is "Completed." Just to nail it down, if one workstream is "In Progress" and another is "Paused" with the final being "Completed" then the project is "In Progress."

So, the result I'm trying to get is below:

Project # hist_status status_start_date status_end_date
A74308 In Progress 11/17/2020 6/8/2021
A74308 Pause 6/9/2021 6/20/2021
A74308 In Progress 6/21/2021 9/20/2021
A74308 Pause 9/21/2021 1/29/2022
A74308 In Progress 1/30/2022 2/8/2023

And here is a visual representation:

Linear Project Status

This post seems to accomplish what I'm trying to do but I haven't been able to replicate the results in Postgres. I've attempted using subqueries and essentially a DATEDIFF to delineate, for instance, where the first "Pause" should begin, but can't nail down a solution.

plankton
  • 369
  • 5
  • 21

1 Answers1

0

I may have overfitted this solution but it works for this example. Edit: I did indeed overfit the solution but just to an individual project #. Had to adjust when I opened up the query to all projects.

F AS (
   SELECT project_#, row_number() over (PARTITION BY project_#) P_ID,
      CASE WHEN Status_Rank = 1 THEN Status_End_Date
           WHEN Status_End_Date < LAG(Status_End_Date - INTERVAL '1 day', 1) OVER (PARTITION BY project_#ORDER BY Status_Start_Date) THEN NULL
           WHEN LAG(Status_End_Date - INTERVAL '1 day', 1) OVER (PARTITION BY project_# ORDER BY Status_Start_Date) < LAG(Status_End_Date - INTERVAL '1 day', 2) OVER (PARTITION BY project_# ORDER BY Status_Start_Date) THEN NULL
           ELSE Status_End_Date END FLAG
FROM final
), PS AS (
SELECT fi.project_#, Hist_Status, Status_Rank, fi.P_ID,
   CASE WHEN LAG(Status_End_Date, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) IS NULL THEN Status_Start_Date
       WHEN Status_Rank = 1 AND LAG(STATUS_RANK, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) = 2 THEN LAG(Status_End_Date, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) :: DATE
       WHEN Status_Rank = 1 AND LAG(STATUS_RANK, 1) OVER (PARTITION BY fi.project_id ORDER BY Status_Start_Date) = 3 THEN Status_Start_Date
       WHEN Status_End_Date < LAG(Status_End_Date, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) THEN NULL
       WHEN LAG(Status_End_Date, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) < LAG(Status_End_Date + INTERVAL '1 day', 2) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) THEN NULL
       WHEN Status_End_Date = LAG(Status_End_Date, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) AND Status_Start_Date > LAG(Status_Start_Date, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) THEN NULL
       ELSE LAG(Status_End_Date, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) :: DATE END AS Status_Start_Date,
   CASE WHEN Status_End_Date < LAG(Status_End_Date - INTERVAL '1 day', 1) OVER (PARTITION BY fi.project_#  ORDER BY Status_Start_Date) THEN NULL
        WHEN Status_Rank = 3 THEN Status_End_Date
        WHEN Status_End_Date = LAG(Status_End_Date, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) AND Status_Start_Date > LAG(Status_Start_Date, 1) OVER (PARTITION BY fi.project_# ORDER BY Status_Start_Date) THEN NULL
    ELSE (Status_End_Date - INTERVAL '1 day') :: DATE END AS Status_End_Date
FROM final fi
LEFT JOIN F
ON fi.project_# = F.project_# 
AND fi.P_ID = F.P_ID
WHERE F.FLAG IS NOT NULL
)

SELECT project_#, Hist_Status, Status_Start_Date, Status_End_Date,
   DATE_TRUNC('WEEK', Status_Start_Date) :: DATE Week_Start_Date, DATE_TRUNC('WEEK', Status_End_Date) :: DATE Week_End_Date
FROM PS
WHERE Status_Start_Date IS NOT NULL
plankton
  • 369
  • 5
  • 21