I want to create a new column that is currently built with the loop below. The table has only the columns 'open' and 'start'. The new column, 'startopen', should equal 'open' wherever 'start' equals 1. Otherwise, 'startopen' should take the value of 'startopen' from the row above. Currently I'm able to achieve this using the following:

for i in range(df.shape[0]):
    if df['start'].iloc[i] == 1:
        df.loc[df.index[i],'startopen'] = df.loc[df.index[i],'open']
    else:
        df.loc[df.index[i],'startopen'] = df.loc[df.index[i-1],'startopen']

This works, but it is very slow for large datasets. Are there any built-in functions that can do this faster?

batataman

1 Answer

I want to create a new column 'startopen', where if 'start' equals 1, then 'startopen' is equal to 'open'

Otherwise, 'startopen' is equal to whatever 'startopen' was in the row above of this newly created column.

IIUC, the "otherwise" part amounts to a forward fill: every row where 'start' is not 1 takes the 'startopen' value from the most recent row where 'start' equals 1.

df['startopen'] = pd.Series(np.where(df['start'].eq(1), df['open'], np.nan), index=df.index).ffill()
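
As a small self-contained sketch of the same idea, the masking step can also be written with `Series.where`, which keeps 'open' where the condition holds and puts NaN elsewhere, followed by `ffill()`. The sample values below are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "open":  [10.0, 11.0, 12.0, 13.0, 14.0],
    "start": [1, 0, 0, 1, 0],
})

# Keep 'open' where 'start' == 1, set NaN elsewhere,
# then carry the last kept value forward into the NaN rows.
df["startopen"] = df["open"].where(df["start"].eq(1)).ffill()

print(df["startopen"].tolist())  # [10.0, 10.0, 10.0, 13.0, 13.0]
```

Two caveats with either spelling: rows that come before the first 'start' == 1 stay NaN, and if 'open' itself is NaN on a row where 'start' == 1, the forward fill carries the previous value into that row.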
Ynjxsjmh