I have a seven column data frame that contains many thousands of rows. Below is a sample. The seventh column is populated dynamically based on the values in the other six columns in the current row and the row above it.
The nested np.where statement I came up with works but it's not very flexible or easy to read. I explored using np.select
but I don't have a finite set of values ex-ante that I can assign. Technically, they can be any real number.
So I'm wondering if there's a better way to perform this type of conditional vectorized operation given the constraint?
df['G'] = np.where(df['D'].eq(0) & df['D'].shift(1).eq(0) |
df['D'].ne(0) & df['D'].shift(1).ne(0), df['C'],
np.where(df['D'].eq(0) & df['D'].shift(1).eq(-1) &
(df['B'] <= df['E'].shift(1)) & (df['E'].shift(1) <= df['A']), df['E'].shift(1),
np.where(df['D'].eq(0) & df['D'].shift(1).eq(-1) &
(df['A'] >= df['F'].shift(1)) & (df['F'].shift(1) >= df['B']), df['F'].shift(1),
np.where(df['D'].eq(0) & df['D'].shift(1).eq(1) &
(df['B'] <= df['E'].shift(1)) & (df['E'].shift(1) <= df['A']), df['E'].shift(1),
np.where((df['A'] >= df['F'].shift(1)) & (df['F'].shift(1) >= df['B']),
df['F'].shift(1), df['C'])))))
df
A B C D E F G
Date
2021-05-17 126.93 125.17 125.90 0.0 0.00 0.00 125.90
2021-05-18 126.99 124.78 124.48 0.0 0.00 0.00 124.48
2021-05-19 124.92 122.86 124.32 1.0 122.86 135.15 124.32
2021-05-20 127.72 125.10 126.93 1.0 122.86 135.15 126.93
2021-05-21 128.00 125.21 125.06 -1.0 128.00 115.20 125.06
2021-05-24 127.94 125.94 126.72 -1.0 128.00 115.20 126.72
2021-05-25 128.32 126.32 126.53 0.0 0.00 0.00 128.00
2021-05-26 127.39 126.42 126.48 0.0 0.00 0.00 126.48
2021-05-27 127.64 125.08 124.91 0.0 0.00 0.00 124.91
2021-05-28 125.80 124.55 124.24 -1.0 125.80 113.22 124.24
2021-06-01 125.35 123.94 123.91 -1.0 125.80 113.22 123.91
2021-06-02 125.24 124.05 124.69 1.0 124.05 136.46 124.69
2021-06-03 124.85 123.13 123.18 0.0 0.00 0.00 124.05
2021-06-04 126.16 123.85 125.52 1.0 123.85 136.23 125.52
2021-06-07 126.32 124.83 125.53 1.0 123.85 136.23 125.53
2021-06-08 128.46 126.21 126.37 1.0 123.85 136.23 126.37