I have a vector, and I want to compute its Pearson correlation with every row of a pandas DataFrame. I am trying the following:
df.apply(apply_func, axis=1, args=(np.array([1,2,3])), raw=True)
apply_func simply takes two NumPy arrays and calculates the correlation:
def apply_func(v1, v2):
    # do stuff
However, I get the following error when I try to run this:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I've set breakpoints in apply_func and execution never gets inside it. I'm sure I'm using this structure incorrectly, but I'm not sure how. I would think that each row of df would be passed to apply_func as the first positional argument, and whatever is in args would fill the remaining arguments. Is this not correct?
EDIT: I have created a simple example below. In this example, apply_func should just add the two vectors. It still produces the same error:
import numpy as np
import pandas as pd

data = {'k1': [1, 2, 3], 'k2': [4, 5, 6], 'k3': [7, 8, 9]}
df = pd.DataFrame(data)
def apply_func(v1, v2):
    return v1 + v2
# This call raises the ValueError shown above
df.apply(apply_func, axis=1, args=(np.array([1,2,3])), raw=True)
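One likely cause, offered as a sketch rather than a confirmed diagnosis: in Python, `(np.array([1,2,3]))` is just a parenthesised array, not a one-element tuple, so `args` receives the array itself and pandas may end up evaluating its truth value before `apply_func` is ever called. Adding a trailing comma makes it a tuple. The names `vec` and `pearson` below are hypothetical, and `np.corrcoef` is only an assumed stand-in for the real correlation code.

```python
import numpy as np
import pandas as pd

data = {'k1': [1, 2, 3], 'k2': [4, 5, 6], 'k3': [7, 8, 9]}
df = pd.DataFrame(data)
vec = np.array([1, 2, 3])  # hypothetical name for the vector

# Parentheses alone do not create a tuple; the trailing comma does.
without_comma = isinstance((vec), tuple)   # False: still an ndarray
with_comma = isinstance((vec,), tuple)     # True: one-element tuple

def apply_func(v1, v2):
    return v1 + v2

# args is now a one-element tuple, so vec is passed as v2 for every row.
result = df.apply(apply_func, axis=1, args=(vec,), raw=True)

def pearson(v1, v2):
    # Assumed stand-in for the real correlation function:
    # off-diagonal entry of the 2x2 correlation matrix.
    return np.corrcoef(v1, v2)[0, 1]

corr = df.apply(pearson, axis=1, args=(vec,), raw=True)
```

With `raw=True`, each row reaches the function as a plain NumPy array, which matches the expectation that the row is the first positional argument and the contents of `args` follow it.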