I have two dataframes: one with estimated daily values and another with the closed value for each month.
I need to show the estimated daily value ONLY when no closed value exists for the same month and ID.
Example:
df1:
DATA | ID | VALUE | DSC |
---|---|---|---|
2022-01-31 | 123 | 10 | CLOSED MONTH |
2022-02-31 | 123 | 20 | CLOSED MONTH |
2022-03-31 | 999 | 30 | CLOSED MONTH |
2022-04-31 | 999 | 40 | CLOSED MONTH |
df2:
DATA | ID | VALUE | DSC |
---|---|---|---|
2022-01-31 | 123 | 50 | ESTIMATED DAY |
2022-02-31 | 123 | 60 | ESTIMATED DAY |
2022-03-31 | 123 | 70 | ESTIMATED DAY |
2022-04-31 | 123 | 80 | ESTIMATED DAY |
2022-03-20 | 123 | 90 | ESTIMATED DAY |
2022-03-25 | 123 | 100 | ESTIMATED DAY |
2022-04-30 | 999 | 120 | ESTIMATED DAY |
2022-05-02 | 999 | 150 | ESTIMATED DAY |
2022-05-03 | 999 | 200 | ESTIMATED DAY |
EXPECTED OUTPUT:
DATA | ID | VALUE | DSC |
---|---|---|---|
2022-01-31 | 123 | 10 | CLOSED MONTH |
2022-02-31 | 123 | 20 | CLOSED MONTH |
2022-03-31 | 999 | 30 | CLOSED MONTH |
2022-04-31 | 999 | 40 | CLOSED MONTH |
2022-03-20 | 123 | 90 | ESTIMATED DAY (kept because closed month 3 has a different ID) |
2022-03-25 | 123 | 100 | ESTIMATED DAY (kept because closed month 3 has a different ID) |
2022-05-02 | 999 | 150 | ESTIMATED DAY (kept because there is no closed month 5) |
2022-05-03 | 999 | 200 | ESTIMATED DAY (kept because there is no closed month 5) |
Does anyone know a solution?
I tried the window functions row_number, rank and dense_rank, but none of them gave this result.
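
For reference, the rule I describe (keep an estimated row only when no closed row exists for the same ID and month) can be read as an anti-join. Below is a minimal pandas sketch of that reading, not a confirmed solution. The names `closed`, `estimated`, `month_key` and `combine` are made up for illustration. It assumes DATA is a "YYYY-MM-DD" string and takes the month from the first 7 characters, because some sample dates (such as 2022-02-31) are not valid calendar dates and would not parse. Under this rule the 2022-03-31 and 2022-04-31 rows for ID 123 are also kept, which the expected output above omits, so the rule may need an extra condition.

```python
import pandas as pd

# Sample data copied from the tables above (dates kept as plain strings).
closed = pd.DataFrame({
    "DATA": ["2022-01-31", "2022-02-31", "2022-03-31", "2022-04-31"],
    "ID": [123, 123, 999, 999],
    "VALUE": [10, 20, 30, 40],
    "DSC": "CLOSED MONTH",
})
estimated = pd.DataFrame({
    "DATA": ["2022-01-31", "2022-02-31", "2022-03-31", "2022-04-31",
             "2022-03-20", "2022-03-25", "2022-04-30", "2022-05-02",
             "2022-05-03"],
    "ID": [123, 123, 123, 123, 123, 123, 999, 999, 999],
    "VALUE": [50, 60, 70, 80, 90, 100, 120, 150, 200],
    "DSC": "ESTIMATED DAY",
})


def month_key(df):
    # Assumption: DATA looks like "YYYY-MM-DD", so "YYYY-MM" identifies the month.
    return df["DATA"].str[:7]


def combine(closed, estimated):
    # Distinct (ID, month) pairs that already have a closed value.
    closed_keys = (
        closed.assign(MONTH=month_key(closed))[["ID", "MONTH"]].drop_duplicates()
    )
    est = estimated.assign(MONTH=month_key(estimated))
    # Left join with an indicator column, then keep only unmatched rows:
    # this emulates an anti-join on (ID, MONTH).
    merged = est.merge(closed_keys, on=["ID", "MONTH"], how="left", indicator=True)
    kept = merged.loc[merged["_merge"] == "left_only", list(estimated.columns)]
    # All closed rows first, followed by the surviving estimated rows.
    return pd.concat([closed, kept], ignore_index=True)


result = combine(closed, estimated)
print(result)
```

If the dataframes are PySpark ones, the same idea could probably be written as a join on the ID and month columns with how="left_anti", followed by a union with the closed dataframe.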