I hope there are experts here who can help.
I have the following table:
X2 X3 X4 Y Y1
01.02.2019 1 1 1
02.02.2019 2 2 0
02.02.2019 2 3 0
02.02.2019 2 1 1
03.02.2019 1 2 1
04.02.2019 2 3 0
05.02.2019 1 1 1
06.02.2019 2 2 0
07.02.2019 1 3 1
08.02.2019 2 1 1
09.02.2019 1 2 0
10.02.2019 2 3 1
11.02.2019 1 1 0
12.02.2019 2 2 1
13.02.2019 1 3 0
14.02.2019 2 1 1
15.02.2019 1 2 1
16.02.2019 2 3 0
17.02.2019 1 1 1
18.02.2019 2 2 0
In column Y1 I need to calculate the moving average of column Y over the previous 5 days, but only over rows that pass a filter on X3 and X4. The filter values are the X3 and X4 values of the current row.
For example, for the row
04.02.2019 2 3 0
the average will be 0, because only this row matches the condition:
02.02.2019 2 3 0
I do not understand how to do this. I know it will be something like
filtered_X4 = df['X4'].where(condition_1 & condition_2 & condition_3)
but I do not understand how to define condition_1, condition_2 and condition_3 themselves.
I have seen many examples where the filter values are known in advance, for example
condition_1 = df['X2'].isin([2, 3, 5])
but that is not what I need, because my condition values change from row to row.
I know how to calculate the mean:
df['Y1'] = filtered_X4.shift(1).rolling(window=999999, min_periods=1).mean()
but I cannot set up the filtering.
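To make the question concrete, here is a minimal sketch of how condition_1, condition_2 and condition_3 might be built for a single row. It uses only the first six rows of the table and assumes the window is the 5 calendar days before the current row's date, with the current day excluded. The names current, window_start, matches and mean_y are only illustrative.

```python
import pandas as pd

# First six rows of the table (X2 is the date column).
df = pd.DataFrame(
    {
        "X2": ["01.02.2019", "02.02.2019", "02.02.2019",
               "02.02.2019", "03.02.2019", "04.02.2019"],
        "X3": [1, 2, 2, 2, 1, 2],
        "X4": [1, 2, 3, 1, 2, 3],
        "Y": [1, 0, 0, 1, 1, 0],
    }
)
df["X2"] = pd.to_datetime(df["X2"], format="%d.%m.%Y")

# The row being evaluated: 04.02.2019 2 3 0
current = df.iloc[5]

# Assumption: the window is the 5 days before the current date.
window_start = current["X2"] - pd.Timedelta(days=5)

# Each condition compares a whole column with the current row's value.
condition_1 = df["X3"] == current["X3"]
condition_2 = df["X4"] == current["X4"]
condition_3 = (df["X2"] >= window_start) & (df["X2"] < current["X2"])

matches = df[condition_1 & condition_2 & condition_3]
mean_y = matches["Y"].mean()
print(matches)
print(mean_y)
```

Here only the row 02.02.2019 2 3 0 passes all three conditions, so the mean of Y is 0.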
add1: This is the result I'm trying to get:
X2 X3 X4 Y Y1
01.02.2019 1 1 1 NAN
02.02.2019 2 2 0 NAN
02.02.2019 2 3 0 NAN
02.02.2019 2 1 1 NAN
03.02.2019 1 2 1 NAN
04.02.2019 2 3 0 0
05.02.2019 1 1 1 1
06.02.2019 2 2 0 0
07.02.2019 1 3 1 NAN
08.02.2019 2 1 1 NAN
09.02.2019 1 2 0 NAN
10.02.2019 2 3 1 NAN
11.02.2019 1 3 0 1
12.02.2019 2 2 1 NAN
13.02.2019 1 3 0 0
14.02.2019 2 1 1 NAN
15.02.2019 2 2 1 1
16.02.2019 2 3 0 NAN
17.02.2019 1 1 1 NAN
18.02.2019 2 2 0 1
For example, to calculate the average (Y1) for this row:
X2 X3 X4 Y Y1
04.02.2019 2 3 0
I need to take only the rows of the DataFrame with X3 = 2, X4 = 3 and X2 from 30.01.2019 to 03.02.2019.
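The rule described above could be applied to every row roughly as follows. This is a minimal sketch, not a tuned solution: it assumes the window is the 5 calendar days before the current row's date (30.01.2019 to 03.02.2019 for 04.02.2019) and excludes the current day. The helper name filtered_mean is hypothetical, and the data is copied from the expected-result table above. A row-by-row apply compares every row against the whole frame, so it may be slow on large data.

```python
import pandas as pd

# Rows copied from the expected-result table (X2, X3, X4, Y).
rows = [
    ("01.02.2019", 1, 1, 1), ("02.02.2019", 2, 2, 0),
    ("02.02.2019", 2, 3, 0), ("02.02.2019", 2, 1, 1),
    ("03.02.2019", 1, 2, 1), ("04.02.2019", 2, 3, 0),
    ("05.02.2019", 1, 1, 1), ("06.02.2019", 2, 2, 0),
    ("07.02.2019", 1, 3, 1), ("08.02.2019", 2, 1, 1),
    ("09.02.2019", 1, 2, 0), ("10.02.2019", 2, 3, 1),
    ("11.02.2019", 1, 3, 0), ("12.02.2019", 2, 2, 1),
    ("13.02.2019", 1, 3, 0), ("14.02.2019", 2, 1, 1),
    ("15.02.2019", 2, 2, 1), ("16.02.2019", 2, 3, 0),
    ("17.02.2019", 1, 1, 1), ("18.02.2019", 2, 2, 0),
]
df = pd.DataFrame(rows, columns=["X2", "X3", "X4", "Y"])
df["X2"] = pd.to_datetime(df["X2"], format="%d.%m.%Y")


def filtered_mean(row, frame, days=5):
    """Mean of Y over earlier rows with the same X3 and X4 as `row`.

    Hypothetical helper: the window is the `days` calendar days
    before row["X2"], and the current day itself is excluded.
    """
    start = row["X2"] - pd.Timedelta(days=days)
    mask = (
        (frame["X3"] == row["X3"])
        & (frame["X4"] == row["X4"])
        & (frame["X2"] >= start)
        & (frame["X2"] < row["X2"])
    )
    # The mean of an empty selection is NaN, which marks "no match".
    return frame.loc[mask, "Y"].mean()


df["Y1"] = df.apply(lambda row: filtered_mean(row, df), axis=1)
print(df)
```

On this data the sketch reproduces the Y1 column from the expected-result table, including the NaN entries for rows with no matching earlier row inside the window.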