How to iterate over pandas data frame rows while referring to other rows?

Question

In my code I assign a value to a cell in a data frame, based on another value in the same data frame but in another row.

The code, using a for loop, is as follows:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
# Copy the previous row's B into the current row's C
for i in range(1, df.shape[0]):
    df.loc[i, 'C'] = df.loc[i-1, 'B']

Output:
    A   B   C
0   1   4   NaN
1   2   5   4.0
2   3   6   5.0

This code gives me the output I want, but it is rather slow. I read about df.iterrows and df.apply, but I cannot figure out how to use them here, since each row refers to another row. Does anyone know a faster way to iterate over rows while referring to other rows in the pandas data frame?

Refer [How to Create minimal-reproducible-example](https://stackoverflow.com/help/minimal-reproducible-example) — Sociopath, Feb 04 '20 at 10:08
looks like you're just doing `df['C'] = df['B'].shift()` ..? — Chris Adams, Feb 04 '20 at 10:23
The shift function indeed works very well and is much faster, thank you! — Sjoerd, Feb 04 '20 at 10:56
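
For reference, here is a minimal sketch of the `shift` suggestion from the comments, run next to the loop from the question on the same three-row frame. The names `looped` and `shifted` are only illustrative; `Series.shift()` with its default `periods=1` moves every value down one row and leaves NaN in the first row, which is what the loop produces.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Loop version from the question: one .loc assignment per row
looped = df.copy()
for i in range(1, looped.shape[0]):
    looped.loc[i, 'C'] = looped.loc[i - 1, 'B']

# Vectorized version: shift column B down by one row in a single call
shifted = df.copy()
shifted['C'] = shifted['B'].shift()

print(shifted)
```

Because the shift happens in one vectorized call instead of one assignment per row, it should scale much better on large frames, which matches what the asker reports.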

score 0 · Answer 1 · answered Feb 04 '20 at 10:21

Change your code this way

df.iloc[i]['C'] = df.iloc[i-1]['B']

ikibir


This does not seem to do anything to my data frame. However, as the comment from Chris A above points out, `df['C'] = df['B'].shift()` works well and is much faster than my previous code – Sjoerd Feb 04 '20 at 10:55
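
A likely explanation for why the chained form has no effect, sketched under the assumption that the frame is the one from the question: `df.iloc[i]` builds a separate Series for row i, so `['C'] = ...` writes to that temporary object and not to `df`. The variable `chained_has_c` is only illustrative, and the warning filter is there because some pandas versions warn about chained assignment.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

with warnings.catch_warnings():
    # Some pandas versions emit a chained-assignment warning here
    warnings.simplefilter("ignore")
    for i in range(1, df.shape[0]):
        # The left-hand side indexes twice: first a row Series, then key 'C'.
        # The assignment lands on the temporary row Series, not on df.
        df.iloc[i]['C'] = df.iloc[i - 1]['B']

# The frame never receives a column C
chained_has_c = 'C' in df.columns
print(chained_has_c)
```

A single indexing call such as `df.loc[i, 'C'] = ...`, as in the question, does write to the frame.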
