2

Is it possible to do calculations in pandas dataframe based on trailing rows in a different column? Something like this.

frame = pd.DataFrame({'a' : [True, False, True, False],
                  'b' : [25, 22, 55, 35]})

I want the output to be this:

A     B     C
True  25    
False 22   44
True  55   55
False 35   70

Where column C is the same as column B when the trailing row in column A is False and where column C is column B * 2 when the trailing row in column A is True?

user2333196
  • 5,406
  • 7
  • 31
  • 35

1 Answers1

6

You could use the where Series method:

In [11]: frame['b'].where(frame['a'], 2 * frame['b'])
Out[11]:
0    25
1    44
2    55
3    70
Name: b, dtype: int64

In [12]: frame['c'] = frame['b'].where(frame['a'], 2 * frame['b'])

Alternatively you could use apply (but this will usually be slower):

In [21]: frame.apply(lambda x: 2 * x['b'] if x['a'] else x['b'], axis=1

Since you are using the "trailing row" you are going to need to use shift:

In [31]: frame['a'].shift()
Out[31]:
0      NaN
1     True
2    False
3     True
Name: a, dtype: object

In [32]: frame['a'].shift().fillna(False)  # actually this is not needed, but perhaps clearer
Out[32]:
0    False
1     True
2    False
3     True
Name: a, dtype: object

And use the where the other way around:

In [33]: c = (2 * frame['b']).where(frame['a'].shift().fillna(False), frame['b'])

In [34]: c
Out[34]:
0    25
1    44
2    55
3    70
Name: b, dtype: int64

and to change the first row (e.g. to NaN, in pandas we use NaN for missing data)

In [35]: c = c.astype(np.float)  # needs to accept NaN

In [36]: c.iloc[0] = np.nan

In [36]: frame['c'] = c

In [37]: frame
Out[37]:
       a   b   c
0   True  25 NaN
1  False  22  44
2   True  55  55
3  False  35  70
Community
  • 1
  • 1
Andy Hayden
  • 359,921
  • 101
  • 625
  • 535
  • I want to calculate based on the trailing row in column 'a'. So the value in the second row column 'c' is dependent on the value in the second row of column 'b' and the FIRST ROW in column 'a'. I don't how or if it is even possible to take the value for the trailing row. – user2333196 Jul 31 '13 at 09:54
  • the previous row row 2 being the previous row for row 3 – user2333196 Jul 31 '13 at 10:11
  • thanks shift is what i was looking for. now i can find examples in the Pandas book – user2333196 Jul 31 '13 at 10:36