I am new to pandas. I would like to know how to apply a function to two columns of a DataFrame and store the function's output in a new column of the same DataFrame. Is this possible with pandas syntax, or should I fall back to plain Python and iterate over the rows to build the new column?

a  b
1  2
3  1
2  9

The question is how to get, for example, the product of the two numbers in a new column c:

a  b  c
1  2  2
3  1  3
2  9  18
c00der

3 Answers

You can do this with pandas.

For example:

def funcMul(row):
    return row['a']*row['b']

Then,

df['c'] = df.apply(funcMul, axis=1)

Output:

    a   b   c
0   1   2   2
1   3   1   3
2   2   9   18
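
For plain arithmetic like this, apply is not strictly needed, since pandas can multiply the two columns element-wise. A minimal sketch, assuming the same data as in the question:

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({"a": [1, 3, 2], "b": [2, 1, 9]})

# Element-wise multiplication of two columns; no row-by-row function call
df["c"] = df["a"] * df["b"]

print(df)
```

The apply version above is still the one to reach for when the per-row logic cannot be written as column arithmetic.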
harvpan

The answer by harvpan shows the simplest way to handle your specific example, but here is a more general way to do what you asked:

def functionUsedInApply(row):
    """ The logic for the apply step goes here.

    row: A pandas Series containing one row of df.
    """
    return row['a'] * row['b']

def functionUsedInMap(value):
    """ This function is used in the map after the apply.
    For this example, if the value is larger than 5, 
    return the cube, otherwise, return the square.     

    value: a value of whatever type is returned by functionUsedInApply.
    """
    if value > 5:
        return value**3
    else:
        return value**2

df['new_column_name'] = df.apply(functionUsedInApply, axis=1).map(functionUsedInMap)

The line above first multiplies columns a and b row by row, and then returns the square of that product when a*b <= 5 and the cube of that product when a*b > 5.
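
To make the two steps concrete, here is a small worked sketch on the data from the question. The function names product_of_row and square_or_cube are just illustrative stand-ins for the two functions in this answer:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 3, 2], "b": [2, 1, 9]})

def product_of_row(row):
    # Step 1 (apply, axis=1): called once per row, receives the row as a Series
    return row["a"] * row["b"]

def square_or_cube(value):
    # Step 2 (map): called once per value produced by step 1
    return value ** 3 if value > 5 else value ** 2

products = df.apply(product_of_row, axis=1)   # 2, 3, 18
df["new_column_name"] = products.map(square_or_cube)

print(df)
```

With this data the products are 2, 3 and 18, so the new column holds 4, 9 and 5832 (only 18 is larger than 5 and gets cubed).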

You can do the following with pandas:

import pandas as pd

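def func(r):
    # Select by label; integer keys such as r[0] on a string-labelled row are deprecated in recent pandas
    return r['a'] * r['b']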

df = pd.DataFrame({'a':[1,2,3], 'b':[4,5,6]})
df['c'] = df.apply(func, axis=1)

Also, here is the official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html
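
If the function takes the two values as separate arguments rather than a whole row, one possible alternative is to iterate over the two columns in parallel with zip. This is only a sketch; combine is a hypothetical stand-in for whatever two-argument function you have:

```python
import pandas as pd

def combine(a, b):
    # Hypothetical two-argument function; replace with your own logic
    return a * b

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Pair up the values of the two columns and call the function on each pair
df["c"] = [combine(a, b) for a, b in zip(df["a"], df["b"])]

print(df)
```

This avoids building a Series for every row, which apply with axis=1 does.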