Pandas DataFrame: Apply function cell-wise with index and column values as arguments

Question

I am trying to prepare some data for a heatmap or 3D plot. The general idea is that I have some function z=f(x,y) where z is the value of a specific cell with x as its column value and y as its index value.

My current approach is to loop over the dataframe which already shows the desired result:

import numpy as np
import pandas as pd


def my_fun(a, b):
    return a**2 + b**3

index = [i for i in np.arange(25.0, 100.0, 25.0)]
columns = [i for i in np.arange(150.0, 600.0, 150.0)]
df = pd.DataFrame(np.zeros((3, 3)), index=index, columns=columns)

for idx in index:
    for col in columns:
        df.loc[idx, col] = my_fun(idx, col)

print(df)

and yields:

      150.0       300.0       450.0
25.0  3375625.0  27000625.0  91125625.0
50.0  3377500.0  27002500.0  91127500.0
75.0  3380625.0  27005625.0  91130625.0

But looping over the dataframe is probably not the right (vectorized) way to deal with this problem and I was looking for some pretty combination of apply/applymap/map.

Is there any way to get the same result in a smarter/vectorized way?

Thanks in advance!

score 4 · Accepted Answer · answered Feb 20 '17 at 15:44

You can use:

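# if you only need a simple arithmetic operation such as a sum
# (on pandas 0.23+, add result_type='broadcast' to get a DataFrame back instead of a Series of Index objects)
print(df.apply(lambda x: x.index + x.name, axis=1))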
   1  2  3
1  2  3  4
2  3  4  5
3  4  5  6
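For a function built purely from arithmetic, such as my_fun in the question, one possible alternative is to skip apply and let NumPy broadcast the index against the columns. The sketch below assumes the function accepts arrays as well as scalars; the name values is only illustrative.

```python
import numpy as np
import pandas as pd


def my_fun(a, b):
    return a**2 + b**3


index = np.arange(25.0, 100.0, 25.0)
columns = np.arange(150.0, 600.0, 150.0)

# index[:, None] has shape (3, 1) and columns[None, :] has shape (1, 3),
# so NumPy broadcasts the two into a (3, 3) grid of results.
values = my_fun(index[:, None], columns[None, :])
df = pd.DataFrame(values, index=index, columns=columns)
print(df)
```

This only works when every operation inside the function is itself vectorized; a function that branches on scalar values would still need one of the apply-based approaches.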

If your function only works with scalars, you can stack the frame into a Series, convert it to a DataFrame, apply the function row by row, and finally unstack:

df1 = df.stack().to_frame().apply(lambda x: my_fun(x.name[0], x.name[1]), axis=1).unstack()
print (df1)
   1  2  3
1  2  3  4
2  3  4  5
3  4  5  6
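As a rough sketch of how this stack/apply/unstack chain behaves on the frame from the question (same my_fun, index and columns), the steps can be spelled out as follows. The intermediate name stacked is only for illustration.

```python
import numpy as np
import pandas as pd


def my_fun(a, b):
    return a**2 + b**3


index = list(np.arange(25.0, 100.0, 25.0))
columns = list(np.arange(150.0, 600.0, 150.0))
df = pd.DataFrame(np.zeros((3, 3)), index=index, columns=columns)

# stack() turns the frame into a Series whose MultiIndex entries are
# (index label, column label) pairs, one per cell.
stacked = df.stack()

# After to_frame(), each row's .name is that (index label, column label)
# tuple, so the scalar function can be called once per cell.
df1 = stacked.to_frame().apply(lambda x: my_fun(x.name[0], x.name[1]), axis=1).unstack()
print(df1)
```

Because the function is called once per cell in Python, this is convenient rather than fast.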

For testing, it is best to replace the lambda with a custom function that prints what it receives, for example:

def f(x):
    # x is one row: x.name is its index label, x.index holds the column labels
    print (x.name)
    print (x.index)
    return x.index + x.name

print (df.apply(f, axis=1))
1
Int64Index([1, 2, 3], dtype='int64')
1
Int64Index([1, 2, 3], dtype='int64')
2
Int64Index([1, 2, 3], dtype='int64')
3
Int64Index([1, 2, 3], dtype='int64')
   1  2  3
1  2  3  4
2  3  4  5
3  4  5  6

jezrael


Works like a charm! Thanks. I have totally missed that I could use the .name and .index attributes of the cell! – Cord Kaldemeyer Feb 20 '17 at 15:59
Thanks. If you find it useful, you can also upvote the question since I haven't found anything on this in my search for an answer – Cord Kaldemeyer Feb 20 '17 at 16:02
One more question: Is it also possible to pass additional arguments with default arguments to f() such as f(x, y=0.5, z=4)? – Cord Kaldemeyer Feb 20 '17 at 16:16
Unfortunately I don't know... :( – jezrael Feb 20 '17 at 16:20
I found it here: http://stackoverflow.com/questions/12182744/python-pandas-apply-a-function-with-arguments-to-a-series – Cord Kaldemeyer Feb 22 '17 at 12:48
@CordKaldemeyer - Thanks, wow, it is really interesting. – jezrael Feb 22 '17 at 13:08
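For the follow-up about extra arguments: DataFrame.apply forwards additional keyword arguments, and positional ones given through args, to the applied function. A minimal sketch, assuming a hypothetical f whose parameters y and z are simple weights (they are not taken from the thread beyond their names and defaults):

```python
import pandas as pd


# Hypothetical function for illustration: y weights the index label,
# z weights the column labels.
def f(x, y=0.5, z=4):
    # x is one row; x.name is its index label, x.index holds the column labels.
    # Returning a Series keeps the column labels on the result.
    return pd.Series(x.name * y + x.index.to_numpy() * z, index=x.index)


df = pd.DataFrame(0, index=[1, 2, 3], columns=[1, 2, 3])

default = df.apply(f, axis=1)                    # uses y=0.5, z=4
custom = df.apply(f, axis=1, y=2, z=10)          # keywords are forwarded to f
positional = df.apply(f, axis=1, args=(2, 10))   # same values, passed positionally
print(custom)
```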

score 0 · Answer 2 · answered Feb 20 '17 at 13:07

You can use apply to operate column-wise: each column is passed to the function as a pandas.Series, so its row index is always available:

import numpy as np
import pandas as pd


def my_fun(col):
    # col is a pandas.Series; col.index.values gives its row labels as a NumPy array.
    # Adding to the Series itself keeps the row labels on the result,
    # and the addition uses the fast NumPy primitives.
    return col + col.index.values

index = list(range(1, 4))
columns = ['col' + str(i) for i in range(1, 4)]
df = pd.DataFrame(np.random.randint(1, 10, (3, 3)), index=index, columns=columns)

col_names = ['col1', 'col2']  # alternatively, select columns by position with df.iloc[:, [0, 1]]
# apply returns a new object, so assign the result back to keep it
df[col_names] = df[col_names].apply(my_fun)
print(df)

I think this only works if I want to calculate the cell values based on the former value and the index value but not based on the column value. Maybe my question wasn't formulated clearly. I have adapted the code! — Cord Kaldemeyer, Feb 20 '17 at 15:51
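The column value is in fact reachable in a column-wise apply as well, through the .name attribute of each column Series. A sketch under that assumption, reusing my_fun and the frame from the question (the name result is illustrative):

```python
import numpy as np
import pandas as pd


def my_fun(a, b):
    return a**2 + b**3


index = list(np.arange(25.0, 100.0, 25.0))
columns = list(np.arange(150.0, 600.0, 150.0))
df = pd.DataFrame(np.zeros((3, 3)), index=index, columns=columns)

# Each column arrives as a Series: col.index holds the row labels and
# col.name the column label. Wrapping the result in a Series with the
# same index keeps the row labels on the output frame.
result = df.apply(
    lambda col: pd.Series(my_fun(col.index.to_numpy(), col.name), index=col.index)
)
print(result)
```

Here the function runs once per column on a whole array of index labels, so it needs far fewer Python-level calls than a cell-by-cell loop.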
