How do I reference the dataframe that a function is being applied to from inside that function?
For example, I have a dataframe named name_df. It has 4 columns (no specified index).
I have a function called calculate_stats that takes in several arguments (mixture of integer values and a df).
Inside calculate_stats I want to refer to name_df['name1'] and name_df['name2'].
I did:
name_df.apply(calculate_stats, axis=1, args=(r, df,x,y,z))
And inside calculate_stats I use r['name1'] and r['name2'].
But I got this error: NameError: name 'r' is not defined
In the following link, they apply a function func1 to a dataframe df. The argument that references each row of df is named r, so inside func1 the columns of df can be accessed with r['colname']. How do I do the same with my function?
In [37]: df
Out[37]:
   X  Y  Count
0  0  1      2
1  0  1      2
2  1  1      2
3  1  0      1
4  1  1      2
5  0  0      1
In [38]: def func1(r):
   ....:     print(r['X'])
   ....:     print(r['Y'])
   ....:     return r
   ....:
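For comparison, here is a minimal sketch of how the same pattern might look with calculate_stats. With axis=1, apply passes each row to the function as its first positional argument, so the row should not be listed in args. Writing args=(r, df, x, y, z) makes Python evaluate r at the call site, where no such name exists, which would explain the NameError. The column values, the other_df frame, its scale column, and the body of calculate_stats are all made up for illustration; only the calling convention is the point.

```python
import pandas as pd

# Hypothetical stand-ins for the objects described in the question.
name_df = pd.DataFrame({
    "name1": [1, 2, 3],
    "name2": [10, 20, 30],
    "name3": [0, 0, 0],
    "name4": [5, 5, 5],
})
other_df = pd.DataFrame({"scale": [2]})  # plays the role of the df argument
x, y, z = 1, 2, 3

def calculate_stats(r, df, x, y, z):
    # r is supplied by apply: with axis=1 it is one row of name_df (a Series).
    # The arithmetic below is an arbitrary placeholder computation.
    return (r["name1"] + r["name2"]) * df["scale"].iloc[0] + x + y + z

# Only the extra arguments go in args; the row is passed automatically.
result = name_df.apply(calculate_stats, axis=1, args=(other_df, x, y, z))
print(result.tolist())  # [28, 50, 72]
```

The first parameter can have any name; r is just the convention used in the linked example.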