I have two data frames. One records the day each order was placed and the day it arrived; the other records which days the shop was open or closed.
Days are given as day of the year, e.g. 32 = 1st February.
orders = DataFrame({'placed':[100,103,104,105,108,109], 'arrived':[103,104,105,106,111,111]})
Out[25]:
arrived placed
0 103 100
1 104 103
2 105 104
3 106 105
4 111 108
5 111 109
calendar = DataFrame({'day':['100','101','102','103','104','105','106','107','108','109','110','111','112','113','114','115','116','117','118','119','120'], 'closed':[0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0]})
Out[21]:
closed day
0 0 100
1 1 101
2 1 102
3 0 103
4 0 104
5 0 105
6 0 106
7 0 107
8 1 108
9 1 109
10 0 110
11 0 111
12 0 112
13 0 113
14 0 114
15 1 115
16 1 116
17 0 117
18 0 118
19 0 119
20 0 120
What I want to do is compute the difference between placed and arrived:
x = orders['arrived'] - orders['placed']
Out[24]:
0 3
1 1
2 1
3 1
4 3
5 2
dtype: int64
and then subtract one for each day between placed and arrived (inclusive) on which the shop was closed.
For example, in the first row the order is placed on day 100 and arrives on day 103, so the days involved are 100, 101, 102 and 103. The difference between 103 and 100 is 3. However, since the shop is closed on days 101 and 102, I want to subtract 1 for each of them: 3 - 1 - 1 = 1. Finally, I want to append this result as a new column on the orders df.
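To make the intended result concrete, here is a minimal sketch of one possible approach using a cumulative sum of the closed flags. It is only an illustration under some assumptions: the column name working_diff is made up for this example, the day column is cast from strings to integers, and any day before the start of the calendar is assumed to contribute no closed days.

```python
import pandas as pd

orders = pd.DataFrame({'placed': [100, 103, 104, 105, 108, 109],
                       'arrived': [103, 104, 105, 106, 111, 111]})
calendar = pd.DataFrame({'day': [str(d) for d in range(100, 121)],
                         'closed': [0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0,
                                    0, 0, 0, 0, 1, 1, 0, 0, 0, 0]})

# 'day' is stored as strings in calendar, so cast it to int to match orders.
closed = pd.Series(calendar['closed'].to_numpy(),
                   index=calendar['day'].astype(int).to_numpy())

# Running count of closed days up to and including each calendar day.
cum_closed = closed.cumsum()

# Closed days in [placed, arrived] = cum[arrived] - cum[placed - 1].
# Assumption: days before the calendar starts count as zero closed days.
before = cum_closed.reindex(orders['placed'] - 1, fill_value=0).to_numpy()
upto = cum_closed.reindex(orders['arrived']).to_numpy()
closed_between = upto - before

# 'working_diff' is a hypothetical column name chosen for this sketch.
orders['working_diff'] = orders['arrived'] - orders['placed'] - closed_between
print(orders)
```

For the first row this gives 3 - 2 = 1, matching the worked example above. The cumulative sum avoids looping over each order's day range, but it relies on the calendar covering every day from its first to its last entry without gaps.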