Consider this simple code (numpy.sum in loc_fun is a stand-in for a more complicated bivariate function using numpy):

import pandas
import numpy


def loc_fun(A, B):
    return numpy.sum(A[:-1] > B[-1])


df = pandas.DataFrame(numpy.random.normal(0, 1, [100000, 2]), columns=['size_A', 'size_B']).cumsum(axis=0)
df.expanding(2).apply(lambda x: loc_fun(x.size_A.values, x.size_B.values))

The last line in the code above results in an error I cannot make sense of. Basically, I would like to apply loc_fun to an expanding() window of the values in the columns.

user189035
  • I think the problem is that this is not implemented yet. `expanding`, like `rolling`, works on each column separately; check with a `print`: define `def f(x): print(x); return x.sum()` and run `df = df.expanding(2).apply(f)`. So unfortunately you need a custom function without `expanding`. See also [this](https://stackoverflow.com/q/37486502), maybe it helps a bit. – jezrael Feb 16 '18 at 07:20
  • Not sure; try contacting the author. `pir` is a really clever math guy, I hope he can help you ;) – jezrael Feb 16 '18 at 07:48

1 Answer

Inside the lambda, x is a numpy.ndarray, so you cannot refer to the columns via x.size_A or x.size_B.

df.expanding(2).apply(lambda x: print(type(x)))

>><class 'numpy.ndarray'>
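A minimal sketch of the same check with a function that returns a number, so that apply has something to store. It assumes pandas 0.23 or later, where apply accepts raw=True (that argument does not exist in pandas 0.22); probe and calls are names made up for this illustration.

```python
import numpy
import pandas

calls = []  # one (type name, ndim, length) entry per window handed to probe


def probe(x):
    # Record what apply actually passes in, then return a float,
    # because apply expects a number back rather than None.
    calls.append((type(x).__name__, x.ndim, len(x)))
    return float(x.sum())


# Small deterministic frame: size_A = 0, 2, 4, 6 and size_B = 1, 3, 5, 7.
df = pandas.DataFrame(numpy.arange(8.0).reshape(4, 2),
                      columns=['size_A', 'size_B'])

# raw=True (an assumption about the pandas version) hands over plain ndarrays.
out = df.expanding(2).apply(probe, raw=True)
print(calls)
print(out)
```

Every recorded window is a one-dimensional array taken from a single column, which is why x.size_A is not available inside the function.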
CezarySzulc
  • Check my comment under the question; unfortunately it works on each column separately :( – jezrael Feb 16 '18 at 07:21
  • @Cezary.Sz: What version of pandas are you using? The line `df.expanding(2).apply(lambda x: print(type(x)))` itself gives an error here (pandas 0.22) – user189035 Feb 16 '18 at 07:35
  • The error occurs on the second iteration, because the lambda returns NoneType but it must return a real number. – CezarySzulc Feb 16 '18 at 07:40
  • OK, in any case, jezrael's code in the comments does it. Darn: it will be more complicated than I thought. – user189035 Feb 16 '18 at 07:43
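
One possible shape for such a custom function, as a rough sketch rather than a tested solution: loop over the growing prefixes by hand and pass both columns to loc_fun. The helper expanding_pair_apply and its parameters are hypothetical names, not part of pandas, and the small frame is made-up data for illustration.

```python
import numpy
import pandas


def loc_fun(A, B):
    return numpy.sum(A[:-1] > B[-1])


def expanding_pair_apply(df, func, col_a, col_b, min_periods=2):
    """Hypothetical helper: call func on growing prefixes of two columns."""
    a = df[col_a].values
    b = df[col_b].values
    out = numpy.full(len(df), numpy.nan)  # rows before min_periods stay NaN
    for end in range(min_periods, len(df) + 1):
        out[end - 1] = func(a[:end], b[:end])
    return pandas.Series(out, index=df.index)


# Made-up data, small enough to check by hand.
df = pandas.DataFrame({'size_A': [1.0, 3.0, 2.0, 5.0],
                       'size_B': [0.0, 2.0, 4.0, 1.0]})
result = expanding_pair_apply(df, loc_fun, 'size_A', 'size_B')
print(result)
```

Each step re-reads the whole prefix, so the work grows quadratically with the number of rows; for the 100000-row frame in the question this may be slow, and a vectorized or incremental version of the real function would likely be needed.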