
I am trying to iterate over a dataframe and update the previous row on each iteration. My index is date-based.

for i, row in df.iterrows():
    # do my calculations here
    myValue = row['myCol']
    prev = row.shift()
    prev['myCol'] = prev['myCol'] - myValue

There is no error, but the change is not saved. How do I save it?

Bill Software Engineer
  • Does this answer your question? [Update a dataframe in pandas while iterating row by row](https://stackoverflow.com/questions/23330654/update-a-dataframe-in-pandas-while-iterating-row-by-row) – zoldxk Apr 28 '21 at 15:47
  • I saw that. The main problem is that I want to update the previous row, and I don't want to calculate the new time index because it's expensive; I was hoping to just refer to the previous row instead. – Bill Software Engineer Apr 28 '21 at 15:48
  • In general **don't** iterate; try to *do my calculation* as a vectorized function and shift. It's really hard to imagine what you are trying to do without sample data and expected output. I imagine something with `shift` and `cumsum` would do. – Quang Hoang Apr 28 '21 at 15:49
  • @BillSoftwareEngineer, it would make more sense if you could illustrate with a [`reproducible example`](https://stackoverflow.com/q/20109391/4985099) – sushanth Apr 28 '21 at 15:50
  • Something like this would do the job: `df["myCol"].shift().sub(df["myCol"])` – Corralien Apr 28 '21 at 17:05

1 Answer

Without example data, it's unclear what you're trying to do. But going by the operations in your for loop, it could probably be done like this instead, without any loop:

myValue = df['myCol']  # the column you wanted and other calculations
df['myCol'] = df['myCol'].shift() - myValue
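As a rough illustration of why the loop in the question has no effect: `iterrows()` yields each row as a new Series, and `row.shift()` shifts values within that single row (across columns) and returns yet another new object, so nothing is ever written back to `df`. The sketch below assumes a small, made-up frame with a daily date index; the data and the names `original` and `unchanged` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with a date-based index, as described in the question.
df = pd.DataFrame(
    {"myCol": [2, 9, 4, 2, 1]},
    index=pd.date_range("2021-04-01", periods=5, freq="D"),
)
original = df.copy()

# The loop from the question: `row` is a separate Series, and row.shift()
# returns another new Series, so the assignment never reaches df.
for i, row in df.iterrows():
    myValue = row["myCol"]
    prev = row.shift()
    prev["myCol"] = prev["myCol"] - myValue

unchanged = df.equals(original)  # df still holds the original values
print(unchanged)

# Vectorized replacement: previous row minus current row, assigned back.
df["myCol"] = df["myCol"].shift() - df["myCol"]
print(df)
```

Assigning the result back to `df['myCol']` is what actually "saves" the change.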

Depending on what you're trying to do, one of these should be what you want:

# starting with this df
   myCol  otherCol
0      2         6
1      9         3
2      4         8
3      2         8
4      1         7

# next row minus current row
df['myCol'] = df['myCol'].shift(-1) - df['myCol']
df
# result:
   myCol  otherCol
0    7.0         6
1   -5.0         3
2   -2.0         8
3   -1.0         8
4    NaN         7

or

# previous row minus current row
df['myCol'] = df['myCol'].shift() - df['myCol']
df
# result:
   myCol  otherCol
0    NaN         6
1   -7.0         3
2    5.0         8
3    2.0         8
4    1.0         7
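If the loop in the question is taken literally (subtract the current row's value from the previous row's value and store the result on the previous row), a sketch like the following may be closer. The sample frame, the dates, and the names `vectorized` and `looped` are assumptions for illustration. The loop works on integer positions via `iloc`, so no date labels have to be computed, and it is included only as a check on the vectorized form.

```python
import pandas as pd

# Hypothetical sample data; the date index mirrors the question's setup.
df = pd.DataFrame(
    {"myCol": [2, 9, 4, 2, 1], "otherCol": [6, 3, 8, 8, 7]},
    index=pd.date_range("2021-04-01", periods=5, freq="D"),
)

# Vectorized: each row's value minus the following row's value.
vectorized = df["myCol"] - df["myCol"].shift(-1)

# Reference loop: update the previous row by integer position.
looped = df["myCol"].astype("float64")  # astype returns a copy of the column
values = df["myCol"].to_numpy()         # original values, read by position
for pos in range(1, len(df)):
    looped.iloc[pos - 1] = values[pos - 1] - values[pos]

# Compare every row that has a following row.
print(vectorized.iloc[:-1].tolist() == looped.iloc[:-1].tolist())
```

The last row has no following row, so the vectorized form yields `NaN` there, while the loop leaves that row unchanged.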

And `myVal` can be anything, such as mathematical operations vectorised over an entire column:

myVal = df['myCol'] * 2 + 3
# myVal is:
0     7
1    21
2    11
3     7
4     5
Name: myCol, dtype: int32

df['myCol'] = df['myCol'].shift(-1) - myVal
df
   myCol  otherCol
0    2.0         6
1  -17.0         3
2   -9.0         8
3   -6.0         8
4    NaN         7
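For anyone who wants to reproduce the last example, here is a minimal self-contained sketch. It rebuilds the sample frame shown above by hand (the construction itself is an assumption; the values are the ones from the answer).

```python
import pandas as pd

# Rebuild the sample frame from the answer.
df = pd.DataFrame({"myCol": [2, 9, 4, 2, 1], "otherCol": [6, 3, 8, 8, 7]})

# Any vectorised expression over the column can serve as myVal.
myVal = df["myCol"] * 2 + 3

# Next row's value minus myVal, written back to the column.
df["myCol"] = df["myCol"].shift(-1) - myVal
print(df)
```

The column becomes float because the shifted-in `NaN` in the last row cannot be stored in an integer column.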
aneroid