Dataframe looping in a specific way

Question

I have a dataframe object:

I would like to combine, for example, A and B at row n with C at row n-1 to create a new column, so let's say D_n = A_n * B_n + C_(n-1).

At first I used a plain Python loop, but I quickly realized this was very slow with large datasets. Then I started looking at NumPy vectorization (which is very fast), but I couldn't figure out a way to access previous entries.

What other alternatives do I have while keeping it nice and fast?

First, I think you need something like [this](https://stackoverflow.com/a/62261221/2901002); if not, `shift` is better here. – jezrael, Aug 16 '20 at 10:49

sushanth · Accepted Answer · 2020-08-16T10:44:36.460

4

Another solution, using `Series.shift`:

df['D'] = df.A * df.B + df.C.shift(1, fill_value=0)

   A  B  C   D
0  0  7  7   0
1  1  3  6  10
2  2  5  5  16
3  3  2  7  11
4  4  4  3  23
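
For anyone who wants to check this end to end, here is a minimal sketch. It assumes the input frame has the A, B and C values copied from the output table above (the original frame was not posted), and the loop is only a hypothetical slow reference to compare against, not something to use on real data.

```python
import pandas as pd

# Values copied from the output table above (an assumption about the input).
df = pd.DataFrame({
    "A": [0, 1, 2, 3, 4],
    "B": [7, 3, 5, 2, 4],
    "C": [7, 6, 5, 7, 3],
})

# Vectorized: C.shift(1) moves every value of C down one row, so row n
# sees C from row n-1. The first row has no predecessor, so 0 is used there.
df["D"] = df["A"] * df["B"] + df["C"].shift(1, fill_value=0)

# Slow row-by-row reference, for comparison only.
expected = []
for n in range(len(df)):
    prev_c = df["C"].iloc[n - 1] if n > 0 else 0
    expected.append(df["A"].iloc[n] * df["B"].iloc[n] + prev_c)

assert df["D"].tolist() == expected
print(df)
```

The choice of 0 for the first row is just what the answer uses; any other starting value can be passed as `fill_value` instead.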

edited Aug 16 '20 at 10:44

answered Aug 16 '20 at 10:30


1

Thank you. fillna(0.0) is redundant as shift() has a fill_value parameter, so shift(periods=1, fill_value=0.0). – Duco Aug 16 '20 at 10:40
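
A small sketch of the difference between the two spellings mentioned in this comment, assuming an integer column with the same values as C above. Both give the same numbers, but `shift()` followed by `fillna()` goes through NaN first, which turns an integer column into floats, while `fill_value` avoids that step.

```python
import pandas as pd

c = pd.Series([7, 6, 5, 7, 3])  # same values as column C above

# shift() alone puts NaN in the first row, which forces a float dtype;
# fillna() then replaces the NaN but the dtype stays float.
via_fillna = c.shift(1).fillna(0)

# With fill_value no NaN is ever created, so the integer dtype is kept.
via_fill_value = c.shift(1, fill_value=0)

print(via_fillna.dtype)
print(via_fill_value.dtype)
print((via_fillna == via_fill_value).all())
```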
