Below is my data frame,
Row_ID A B
1 0 0
2 0 0
3 0 0
4 0 1
5 0 1
6 0 1
7 62.75 0
8 100 0
9 100 0
10 100 -1
11 100 -1
12 100 -1
13 100 -1
14 87.625 -1
15 0 0
16 0 0
17 0 1
18 0 1
19 10 1
20 13.43 1
21 67.31 0
22 86.5 0
23 99 0
24 99 0
25 99 0
I need to calculate average of A
every 5 consecutive rows with the below conditions.
And the values 1 and -1
in B
are the two transition states.
a The average of A
is based on the values in B
.
b The data points from A
to be considered for the average calculation starts from the last data point where the transition in B
ends. (Example: Transition in B
ends at Row_ID = 6
, Hence row number 7,8,9,10 and 11 of A
has to be considered for calculating average).
c Referring to the above example, Though there are change in values at row number 10 and 11 of B
, corresponding rows in A
it should be considered for calculation of average because those values were considered in average calculation in the previous iteration.
Is there any workaround code without using embed()
function because my data has around 900K values and since embed()
creates matrix, memory usage wont be that optimal.
Expected Output
Row_ID A B Avg
1 0 0 0
2 0 0 0
3 0 0 0
4 0 1 0
5 0 1 0
6 0 1 0
7 62.75 0 92.55
8 100 0 92.55
9 100 0 92.55
10 100 -1 92.55
11 100 -1 92.55
12 100 -1 0
13 100 -1 0
14 87.625 -1 0
15 0 0 2
16 0 0 2
17 0 1 2
18 0 1 2
19 10 1 2
20 13.43 1 0
21 67.31 0 90.16
22 86.5 0 90.16
23 99 0 90.16
24 99 0 90.16
25 99 0 90.16