I have a DF with data like:
Identificator | Date | Status |
---|---|---|
ID1 | 2021-05-02 19:55:43 | OK |
ID2 | 2021-05-02 19:48:01 | FAILED |
ID3 | 2021-05-02 19:47:53 | OK |
ID1 | 2021-05-03 19:55:43 | FAILED |
ID2 | 2021-05-03 20:48:01 | OK |
ID3 | 2021-05-03 19:47:53 | OK |
ID1 | 2021-05-04 19:55:43 | FAILED |
ID2 | 2021-05-04 19:48:01 | OK |
ID3 | 2021-05-04 19:47:53 | OK |
The lines are actually thousands and the IDs are hundreds, but this is not relevant. Let's assume that they are 1 event (line) per ID per day. I want to be able to take only X amount of days back - for example last 3 days and to create grid with number of days back as X axis, IDs as Y axis. Cells are days in a row without success:
X | 3 | 2 | 1 |
---|---|---|---|
ID1 | 0 | 1 | 2 |
ID2 | 1 | 0 | 0 |
ID3 | 0 | 0 | 0 |
ID1 on 3 days back is 0, because there is a success on that day. But the same ID has 2 failures after that, so day 1 (the most recent day) is 2 in the cell for this ID.
The idea is to have a Dataframe ready for some kind of heatmap visualization on a later stage, but even in colored xlsx to have visual sense - where the lack of success is more severe, and where is OK
If we imagine ID4 with longer history:
OK,FAIL,OK,OK,FAIL,FAIL,OK,FAIL,FAIL,FAIL,OK
the row should look like
0,1,0,0,1,2,0,1,2,3,0
It's just like a counter of failures in a row which resets to 0 with every OK.
Any help is greatly appreciated