The code below is extremely slow. I haven't been working with Python and pandas for long, so I don't know where to start optimizing it.
For each row, I want to determine its predecessor and successor within the same Case.
Currently I iterate over every row and select the rows that meet my conditions. From the earlier matches I take the maximum rowNow (the predecessor), and from the later matches I take the minimum rowNow (the successor).
I have the following data:
index Case Button Start                   rowNow
0     x    a      2017-12-06 10:17:43.227 0
1     x    b      2017-12-06 10:17:44.876 1
2     x    c      2017-12-06 10:17:45.719 2
3     y    a      2017-12-06 15:28:57.500 3
4     y    e      2017-12-06 15:29:19.079 4
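To make the example reproducible, the sample data can be built roughly like this. This is a minimal sketch: the values are copied from the table, and I am assuming Start should be a datetime column.

```python
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    "Case": ["x", "x", "x", "y", "y"],
    "Button": ["a", "b", "c", "a", "e"],
    # Assumption: Start is parsed as datetime64 rather than kept as strings.
    "Start": pd.to_datetime([
        "2017-12-06 10:17:43.227",
        "2017-12-06 10:17:44.876",
        "2017-12-06 10:17:45.719",
        "2017-12-06 15:28:57.500",
        "2017-12-06 15:29:19.079",
    ]),
    "rowNow": [0, 1, 2, 3, 4],
})
```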
And I want to get this:
index Case Button Start                   rowNow prevNum nextNum
0     x    a      2017-12-06 10:17:43.227 0      NaN     1
1     x    b      2017-12-06 10:17:44.876 1      0       2
2     x    c      2017-12-06 10:17:45.719 2      1       NaN
3     y    a      2017-12-06 15:28:57.500 3      NaN     4
4     y    e      2017-12-06 15:29:19.079 4      3       NaN
Could someone give me some tips on how to optimize the speed of this code? Can vectorization be used here at all?
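for index, row in df.iterrows():
    # Earlier rows of the same Case: smaller rowNow and Start no later than this row's
    x = df[(df['Case'] == row['Case']) & (df['rowNow'] < row['rowNow']) & (row['Start'] >= df['Start'])]
    df.loc[index, 'prevNum'] = x['rowNow'].max()
    # Later rows of the same Case: larger rowNow and Start no earlier than this row's
    y = df[(df['Case'] == row['Case']) & (df['rowNow'] > row['rowNow']) & (row['Start'] <= df['Start'])]
    df.loc[index, 'nextNum'] = y['rowNow'].min()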
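One possible vectorized direction, as a minimal sketch rather than a confirmed solution: if, within each Case, Start never decreases as rowNow increases (true for the sample data, but an assumption in general), then every earlier row passes the Start condition. The predecessor is then the previous row of the group and the successor is the next one. That would reduce the loop to a grouped shift:

```python
import pandas as pd

# Sample data from the question, rebuilt here so the sketch is self-contained.
df = pd.DataFrame({
    "Case": ["x", "x", "x", "y", "y"],
    "Button": ["a", "b", "c", "a", "e"],
    "Start": pd.to_datetime([
        "2017-12-06 10:17:43.227",
        "2017-12-06 10:17:44.876",
        "2017-12-06 10:17:45.719",
        "2017-12-06 15:28:57.500",
        "2017-12-06 15:29:19.079",
    ]),
    "rowNow": [0, 1, 2, 3, 4],
})

# Assumption: within each Case, Start is non-decreasing in rowNow order,
# so the Start comparisons in the loop never exclude a neighbouring row.
ordered = df.sort_values(["Case", "rowNow"])
grouped = ordered.groupby("Case")["rowNow"]

# shift(1) gives the previous rowNow in the group, shift(-1) the next one;
# the first/last row of each group gets NaN. Assignment aligns on the index.
df["prevNum"] = grouped.shift(1)
df["nextNum"] = grouped.shift(-1)
```

If Start can go backwards within a Case, this would not match the loop's result, and the Start comparison would still need to be handled separately.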