I am trying to use some boolean logic in a function on a dataframe, but get an error:

In [4]:

from pandas import DataFrame
data={'level':[20,19,20,21,25,29,30,31,30,29,31]}
frame=DataFrame(data)
frame
Out[4]:
level
0   20
1   19
2   20
3   21
4   25
5   29
6   30
7   31
8   30
9   29
10  31

In [35]:

def calculate(x):
    baseline=max(frame['level'],frame['level'].shift(1))#doesnt work
    #baseline=x['level']+4#works
    difftobase=x['level']-baseline
    return baseline, difftobase
frame['baseline'], frame['difftobase'] = zip(*frame.apply(calculate, axis=1))#works

However, this throws the following error at:

baseline=max(frame['level'],frame['level'].shift(1))#doesnt work


ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index 0')

I read "How to look back at previous rows from within Pandas dataframe function call?" and http://pandas.pydata.org/pandas-docs/stable/gotchas.html, but I can't figure out how to apply them to my problem.

DISC-O
  • What you are asking for can be achieved by using [`masking and where()`](http://pandas.pydata.org/pandas-docs/stable/indexing.html#the-where-method-and-masking) and [`shift`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.shift.html#pandas.Series.shift) – EdChum Jan 31 '15 at 19:44
  • I suggest you have a look at those links in my comments try a few things and come back with data, code, expected output, any error tracebacks if you get stuck – EdChum Jan 31 '15 at 19:52
  • after receiving a downvote I am trying to improve the question by removing bigger picture part of it, unnecessary fluff and a second question (moved to another question). As a beginner on this site, constructive comments on any down vote would definitely be appreciated! – DISC-O Feb 07 '15 at 17:20
  • I didn't downvote but if you've posted a new updated question then you should probably delete this, don't worry about the votes it's all part of learning about how to use SO – EdChum Feb 07 '15 at 17:24
  • No, it's an edit (an improvement, hopefully) in place, together with separating out a piece into another question. But I am considering deleting this and reposting from scratch, because I doubt a downvoted question will get much traction. On the other hand, does deleting and reposting constitute "bad behavior"? Also, the posted comments don't jibe with the question anymore. Let's see. Perhaps I'll leave it there for a day and then repost it. – DISC-O Feb 07 '15 at 19:18
  • Just looking at your problem, does the following work `import numpy as np np.max(frame['level'],frame['level'].shift(1))`? so the reason it throws an error is the standard library max works on single scalar values and not array like variables – EdChum Feb 07 '15 at 19:21
  • Thanks Ed. At first it was giving unexpected results, but with "maximum" it works: `baseline=np.maximum(frame['level'],frame['level'].shift(1))`. From what I can tell, one can use either a.max or maximum. – DISC-O Feb 07 '15 at 20:23
  • Sure, unless you want to, since you gave the main direction? Then again, I think I could use the points more ;). I'll post the answer if you don't in the next couple of minutes. Thanks again... – DISC-O Feb 07 '15 at 20:28
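
The np.max versus np.maximum mix-up in the comments above comes down to reduction versus element-wise comparison. A minimal sketch on plain numpy arrays (the arrays a and b are made up for illustration and are not from the question):

```python
import numpy as np

# Two made-up arrays of equal length, for illustration only.
a = np.array([20, 19, 20, 21])
b = np.array([19, 25, 18, 21])

# np.max reduces ONE array to its single largest value.
largest = np.max(a)

# np.maximum compares TWO arrays position by position.
pairwise = np.maximum(a, b)

print(largest)   # 21
print(pairwise)  # [20 25 20 21]
```

The second positional argument of np.max is the axis, not another array, which is probably why passing two Series to it did not give the expected result.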

1 Answer

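The problem is the use of the built-in max. Called with two Series, it compares them with >, which yields a Series of booleans, and then needs a single True or False from that result. pandas refuses to guess one and raises the ValueError above. np.maximum compares the two Series element by element and works (perhaps np.ma.max as well, per the numpy documentation). Replacing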

baseline=max(frame['level'],frame['level'].shift(1))#doesnt work

with

baseline=np.maximum(frame['level'],frame['level'].shift(1))

does the trick. I removed the other part to make it easier to read:

In [23]:
import numpy as np
#q 1 analysis
def calculate_rowise(x):
    # x is not used; with the default axis=0, apply calls this once per column
    baseline=np.maximum(frame['level'],frame['level'].shift(1))#works
    return baseline
frame.apply(calculate_rowise)

Out[23]:
level
0   NaN
1   20
2   20
3   21
4   25
5   29
6   30
7   31
8   31
9   30
10  31
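
To see why the built-in max fails here, the following sketch reproduces the comparison it performs internally. It rebuilds the frame from the question; the names previous, comparison and raised are only for illustration:

```python
import pandas as pd

frame = pd.DataFrame({'level': [20, 19, 20, 21, 25, 29, 30, 31, 30, 29, 31]})
previous = frame['level'].shift(1)

# Comparing two Series gives one boolean per row, not a single True/False.
comparison = previous > frame['level']

# The built-in max needs a single truth value from that comparison,
# so pandas raises the "truth value of a Series is ambiguous" ValueError.
try:
    max(frame['level'], previous)
    raised = False
except ValueError:
    raised = True

print(comparison.tolist())
print(raised)
```

The first row compares against NaN (there is no previous row), which is also why the first baseline value above is NaN.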

PS: the original problem hides another issue, which shows up when the shift portion is taken out of the function: the return shape doesn't match. That's a separate problem; I mention it here only for full disclosure.
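
One way to sidestep the shape issue altogether, as a sketch and not necessarily what the original code was meant to grow into: since np.maximum already works on whole columns, both columns from the question can be assigned directly, without apply or zip. The column names baseline and difftobase are taken from the question.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({'level': [20, 19, 20, 21, 25, 29, 30, 31, 30, 29, 31]})

# Element-wise maximum of each level and the level one row earlier.
# The first row has no previous value, so its baseline is NaN.
frame['baseline'] = np.maximum(frame['level'], frame['level'].shift(1))

# Difference between each level and its baseline, again column-wise.
frame['difftobase'] = frame['level'] - frame['baseline']

print(frame)
```

Each assignment produces exactly one value per row, so there is nothing to unpack and no shape to mismatch.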

DISC-O