I need to iterate over a pandas DataFrame in order to pass each row as the arguments of a function (actually, a class constructor) with **kwargs. This means that each row should behave as a dictionary whose keys are the column names and whose values are the corresponding values for that row.
This works, but it performs very badly:
    import pandas as pd

    def myfunc(**kwargs):
        try:
            area = kwargs.get('length', 0) * kwargs.get('width', 0)
            return area
        except TypeError:
            return 'Error : length and width should be int or float'

    df = pd.DataFrame({'length': [1, 2, 3], 'width': [10, 20, 30]})

    for i in range(len(df)):
        print myfunc(**df.iloc[i])
Any suggestions on how to make this faster? I have tried iterating with df.iterrows(), but I get the following error:
TypeError: myfunc() argument after ** must be a mapping, not tuple
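A minimal sketch that reproduces this error, assuming each item yielded by df.iterrows() is passed straight to the function. iterrows() yields (index, row) pairs rather than bare rows, which would explain the message. The simplified myfunc and the variable names below are only for illustration:

```python
import pandas as pd

def myfunc(**kwargs):
    # Simplified stand-in for the function in the question.
    return kwargs.get('length', 0) * kwargs.get('width', 0)

df = pd.DataFrame({'length': [1, 2, 3], 'width': [10, 20, 30]})

# Each item from iterrows() is an (index, Series) tuple, not a mapping.
first_item = next(df.iterrows())
item_type = type(first_item).__name__

try:
    myfunc(**first_item)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Unpacking the pair first gives a Series, which ** accepts
# (the same kind of object that df.iloc[i] returns).
areas = [myfunc(**row) for _, row in df.iterrows()]
```

Unpacking the pair avoids the TypeError, but each row is still a pd.Series, so this does not by itself say anything about speed.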
I have also tried df.itertuples() and df.values, but either I am missing something, or it means that I have to convert each tuple / np.array to a pd.Series or dict, which will also be slow.
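To make the conversion concrete, here is a minimal sketch of the kind of per-row dict building meant above, assuming df.values combined with dict(zip(...)). The helper name rows_as_kwargs is hypothetical, and whether this is any faster than df.iloc[i] is not measured here:

```python
import pandas as pd

def myfunc(**kwargs):
    # Simplified stand-in for the function in the question.
    return kwargs.get('length', 0) * kwargs.get('width', 0)

def rows_as_kwargs(frame):
    # Hypothetical helper: yield one plain dict per row, keyed by column name.
    # Note: frame.values is a single array, so columns of mixed dtypes
    # are upcast to a common dtype.
    columns = list(frame.columns)
    for values in frame.values:
        yield dict(zip(columns, values))

df = pd.DataFrame({'length': [1, 2, 3], 'width': [10, 20, 30]})

# One call per row, each receiving the row as keyword arguments.
areas = [myfunc(**kw) for kw in rows_as_kwargs(df)]
```

This only works with ** when the column names are strings, as they are here.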
My constraint is that the script has to work with Python 2.7 and pandas 0.14.1.