
I have a dataframe:

import numpy as np
import pandas as pd
np.random.seed(18)
df = pd.DataFrame(np.random.randint(0,50,size=(10, 2)), columns=list('AB'))
df['Min'] = np.nan
n = 3   # can be changed


I need to fill column 'Min' with the minimum of the next n entries of column 'B' (the current row itself is excluded).

Currently I do it using iteration:

for row in range(df.shape[0] - n):
    low = []
    for i in range(1, n + 1):
        # collect B from the next n rows, excluding the current row
        low.append(df.loc[df.index[row + i], 'B'])
    df.loc[df.index[row], 'Min'] = min(low)

But this is quite slow. Is there a more efficient way, please? Thank you.

Karifan

2 Answers


Use rolling with min, then shift the result up by n rows so that each row sees the n rows after it:

df['Min'] = df['B'].rolling(n).min().shift(-n)
print(df)
    A   B   Min
0  42  19   2.0
1   5  49   2.0
2  46   2  17.0
3   8  24  17.0
4  34  17  11.0
5   5  21   4.0
6  47  42   1.0
7  10  11   NaN
8  36   4   NaN
9  43   1   NaN
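
To see why the shift by -n works, here is a minimal sketch that separates the two steps and cross-checks them against a direct computation. It hard-codes column B from the output above; the names `trailing`, `forward` and `expected` are only illustrative.

```python
import numpy as np
import pandas as pd

# Column B as printed in the output above
b = pd.Series([19, 49, 2, 24, 17, 21, 42, 11, 4, 1])
n = 3

trailing = b.rolling(n).min()   # row i holds min(b[i-n+1 .. i])
forward = trailing.shift(-n)    # row i now holds min(b[i+1 .. i+n])

# Direct (slow) computation of the same quantity, for comparison only
expected = pd.Series(
    [b.iloc[i + 1:i + 1 + n].min() if i + n < len(b) else np.nan
     for i in range(len(b))]
)
assert forward.equals(expected)
```

The trailing window ending at row i+n covers exactly rows i+1 to i+n, which is why shifting by -n (and not by -(n-1)) excludes the current row.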

If performance is important, use this NumPy solution based on a strided view of the array:

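def rolling_window(a, window):
    # zero-copy view of shape (len(a) - window + 1, window): one row per window
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

# arr[i] is min(B[i : i + n]); dropping arr[0] makes row i cover B[i+1 : i+n+1]
arr = rolling_window(df['B'].values, n).min(axis=1)
# the last n rows have no full forward window, so pad them with NaN
df['Min'] = np.concatenate([arr[1:], [np.nan] * n])
print(df)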
    A   B   Min
0  42  19   2.0
1   5  49   2.0
2  46   2  17.0
3   8  24  17.0
4  34  17  11.0
5   5  21   4.0
6  47  42   1.0
7  10  11   NaN
8  36   4   NaN
9  43   1   NaN
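
As a possible variation (not part of the original answer), NumPy 1.20 and later ship `sliding_window_view`, which builds the same kind of windowed view without hand-written strides. The sketch below assumes that NumPy version and hard-codes column B from the output above; `window_min` and `result` are illustrative names.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Column B as printed in the output above
b = np.array([19, 49, 2, 24, 17, 21, 42, 11, 4, 1])
n = 3

# One row per window: window_min[i] is min(b[i : i + n])
window_min = sliding_window_view(b, n).min(axis=1)

# Drop the first window so row i covers b[i+1 : i+n+1], then pad the tail with NaN
result = np.concatenate([window_min[1:], np.full(n, np.nan)])
```

The returned view is read-only by default, which avoids the accidental out-of-bounds writes that `as_strided` permits.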
jezrael

Jez's got it. As another option, you can do a forward-looking rolling window by reversing the Series, rolling, and reversing back (as suggested by Andy here):

df['B'][::-1].rolling(n).min()[::-1].shift(-1)

0     2.0
1     2.0
2    17.0
3    17.0
4    11.0
5     4.0
6     1.0
7     NaN
8     NaN
9     NaN
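
The double reversal can probably also be replaced with a forward-looking window indexer. This is a sketch assuming pandas 1.1 or later, where `pandas.api.indexers.FixedForwardWindowIndexer` is available; it hard-codes column B from the output above, and `forward_incl` and `next_n_min` are illustrative names.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Column B as printed in the output above
b = pd.Series([19, 49, 2, 24, 17, 21, 42, 11, 4, 1])
n = 3

# The window for row i covers rows i .. i+n-1 (looking forward)
indexer = FixedForwardWindowIndexer(window_size=n)

# min_periods=n keeps the incomplete windows at the tail as NaN
forward_incl = b.rolling(window=indexer, min_periods=n).min()

# shift(-1) drops the current row, so row i covers rows i+1 .. i+n
next_n_min = forward_incl.shift(-1)
```

The `shift(-1)` is still needed here because the forward window includes the current row, just as in the reversed version.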
rafaelc