
I have a pandas DataFrame in the form:

            A           B       K      S
2012-03-31  NaN         NaN     NaN    10
2012-04-30  62.74449    15.2    71.64   0
2012-05-31  2029.487    168.8   71.64   0
2012-06-30  170.7191    30.4    71.64   0

I am trying to create a function that replaces df['S'] using the value of df['S'][index-1].

For example:

for index, row in df.iterrows():
     if index == 1:
         pass
     else:
         df['S'] = min(df['A'] + df['S'][index-1]?? - df['B'], df['K'])

But I don't know how to get df['S'][index - 1].

For current/previous rows using `iterrows()`, see also, [iterrows pandas get next rows value](https://stackoverflow.com/a/23151722/343215). – xtian Feb 24 '18 at 23:34

3 Answers


It looks like your initial attempt is pretty close.

The following should work:

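for index, row in df.iterrows():
    # Skip the first row (index '0'), which has no previous row
    if int(index) != 0:
        # Index labels are strings here: convert, subtract 1, convert back
        df.loc[index, 'S'] = df.loc[str(int(index) - 1), 'S']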

Essentially, for all but the first index, i.e. 0, change the value in the 'S' column to the value in the row before it. Note: This assumes a dataframe with a sequential, ordered index.

The iterrows() method doesn't let you modify the DataFrame by assigning to the row it yields, so you need to use df.loc[] to identify the cell in the DataFrame and then change its value.

Also worth noting that the index is not an integer here, hence the use of the int() function to subtract 1. This is all wrapped in str() so that the final index label is a string, as expected.
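The frame in the question is indexed by dates rather than by numeric strings, so label arithmetic like the above would not apply to it directly. As a rough sketch under that assumption, the same "copy the previous row's value" idea can be written with positional access via df.iloc. The sample frame and the names s_col and pos are made up for illustration.

```python
import pandas as pd

# Illustrative frame with a date index, mirroring the 'S' column in the question
df = pd.DataFrame(
    {"S": [10, 0, 0, 0]},
    index=pd.to_datetime(["2012-03-31", "2012-04-30", "2012-05-31", "2012-06-30"]),
)

# Integer position of the 'S' column, needed for purely positional access
s_col = df.columns.get_loc("S")

# Start at position 1 because the first row has no previous row
for pos in range(1, len(df)):
    # Copy the value from the row just before this one
    df.iloc[pos, s_col] = df.iloc[pos - 1, s_col]
```

Because each row is updated in order, the value from the first row carries forward through every later row.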

Fab Dot

The point of iterrows is to operate on one row at a time, so the row it yields gives you no access to prior rows. Your function will be slow anyway, and there's a much faster way:

df['S_shifted'] = df.S.shift()

compared = pd.concat([df['A'] + df['S_shifted'] - df['B'], df['K']], axis=1)

df['S'] = compared.min(axis=1)

In [29]: df['S']
Out[29]: 
2012-03-31         NaN
2012-04-30    57.54449
2012-05-31    71.64000
2012-06-30    71.64000
Name: S, dtype: float64
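To make the role of shift() easier to see, here is a self-contained sketch that rebuilds the frame from the question (the construction code and the names shifted, candidate and result are assumptions for illustration) and keeps the shifted column separate. Note that shift() reads the original S values, so each row sees the previous row's S as it was before any update.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; S is stored as float for this sketch
df = pd.DataFrame(
    {
        "A": [np.nan, 62.74449, 2029.487, 170.7191],
        "B": [np.nan, 15.2, 168.8, 30.4],
        "K": [np.nan, 71.64, 71.64, 71.64],
        "S": [10.0, 0.0, 0.0, 0.0],
    },
    index=pd.to_datetime(["2012-03-31", "2012-04-30", "2012-05-31", "2012-06-30"]),
)

# shift() moves every value down one row; the first row becomes NaN
shifted = df["S"].shift()

# A + previous S - B, computed for all rows at once
candidate = df["A"] + shifted - df["B"]

# Row-wise minimum of the candidate value and the cap K
result = pd.concat([candidate, df["K"]], axis=1).min(axis=1)
```

The first row of result stays NaN because it has no previous row and its A, B and K are missing.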
TomAugspurger

Another approach can be:

for ii, (index, row) in enumerate(df.iterrows()):
    # index: current row index
    # row: current row
    # df.iloc[ii - 1]: previous row (only valid when ii > 0)
    # df.iloc[ii + 1]: next row (only valid when ii < len(df) - 1)
    pass  # replace with the per-row logic
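As one possible way to fill in that loop, the sketch below applies the formula from the question row by row. It assumes, without confirmation from the question, that each S should build on the previous row's newly computed S and that the first row keeps its starting value. The frame construction and the names s_values and prev_s are hypothetical.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; S is stored as float for this sketch
df = pd.DataFrame(
    {
        "A": [np.nan, 62.74449, 2029.487, 170.7191],
        "B": [np.nan, 15.2, 168.8, 30.4],
        "K": [np.nan, 71.64, 71.64, 71.64],
        "S": [10.0, 0.0, 0.0, 0.0],
    },
    index=pd.to_datetime(["2012-03-31", "2012-04-30", "2012-05-31", "2012-06-30"]),
)

# Assumption: the first row keeps its starting S value
s_values = [df["S"].iloc[0]]

for ii, (index, row) in enumerate(df.iterrows()):
    if ii == 0:
        continue  # no previous row to read from
    prev_s = s_values[ii - 1]  # previous row's already updated S
    # Formula from the question: min(A + previous S - B, K)
    s_values.append(min(row["A"] + prev_s - row["B"], row["K"]))

# Write the computed values back in one assignment
df["S"] = s_values
```

Collecting the results in a list and assigning once avoids writing to the frame while iterating over it.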
Dr.PB