I have the following dataframe (df):
A B T Required col (window = 3)
1 1 0 1
2 3 0 3
3 4 0 4
4 2 1 1 4
5 6 0 0 2
6 4 1 1 0
7 7 1 1 1
8 8 1 1 1
9 1 0 0 1
I would like to add the required column, as followed:
Insert in the current row the previous row value of A
or B
.
If in the last 3
(window
) rows most of time the content of A
column is equal to T
column - choose A
, otherwise - B
. (There can be more columns - so the content of the column with the most times equal to T will be chosen).
What is the most efficient way to do it for big data table.