I have a dataframe as below:

df = pd.DataFrame([[1,5,'Dog'],[2,6,'Dog'],[3,7,'Cat'],[4,8,'Cat']],columns=['A','B','Type'])
Index  A  B  Type
0      1  5  Dog
1      2  6  Dog
2      3  7  Cat
3      4  8  Cat

Based on the value in the 'Type' column, I need to call a type-specific function (for example, the Dog function for Dog rows and the Cat function for Cat rows) and store the two values it returns in two new columns, C and D.

In the end, my dataframe should look like this:

Index  A  B  Type  C  D
0      1  5  Dog   c  d
1      2  6  Dog   c  d
2      3  7  Cat   c  d
3      4  8  Cat   c  d

The values in columns C and D are returned by the functions; the 'c' and 'd' shown here are only placeholders.

The problem I face is this:

For each distinct value in the 'Type' column, I filter the matching rows, call that type's function, and get the C and D columns. But when I merge the result back into the original dataframe with left_index=True and right_index=True, it creates suffixed copies (Column_x and Column_y) of all the columns, which breaks the next iteration for the 'Cat' rows. Please advise how I should approach this problem.

Code

def ext_fun(x1,x2,i):
    if i=='Dog':
        #Do some calc to find c and d value and return back

        return ['c','d']
    if i=='Cat':
        #do some calc to find c and d value and return back
        return ['c','d']
    
for i in df['Type'].unique():
    df1 = df[df.Type==i]
    df1[['C','D']] = df1.apply(lambda x: ext_fun(x['A'],x['B'],i),result_type='expand',axis=1)
    df = pd.merge(df,df1,left_index = True,right_index=True)
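
The suffixing described above can be reproduced in isolation. The following is a minimal sketch, not the asker's real calculation: the placeholder values and the names dog, cat, merged and result are assumptions made for illustration. It shows that merging two frames that share column names produces the _x/_y copies, and that stacking the per-type pieces with pd.concat is one way to avoid them.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 5, 'Dog'], [2, 6, 'Dog'], [3, 7, 'Cat'], [4, 8, 'Cat']],
    columns=['A', 'B', 'Type'],
)

# Hypothetical per-type results: each subset gets its own C and D values.
dog = df[df['Type'] == 'Dog'].copy()
dog['C'] = 'c'
dog['D'] = 'd'

cat = df[df['Type'] == 'Cat'].copy()
cat['C'] = 'c'
cat['D'] = 'd'

# Both frames contain A, B and Type, so merge keeps both copies and
# disambiguates them with its default suffixes "_x" and "_y".
# The default inner join also drops every row that is not in `dog`.
merged = pd.merge(df, dog, left_index=True, right_index=True)
merged_columns = list(merged.columns)

# Stacking the per-type pieces keeps a single copy of every column;
# sort_index restores the original row order.
result = pd.concat([dog, cat]).sort_index()
```

Because the pieces are stacked rather than joined column-wise, no column name ever appears twice, so no suffixes are needed.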

Note: I have 10 to 15 types in the 'Type' column, with hundreds of records for each type. The values for columns C and D are computed dynamically, so a function call that depends on the 'Type' value is required.

Dani Mesejo
John

1 Answer

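Hope this solves your issue. You do not need to filter and merge per type: apply the function once over the whole dataframe, passing each row's own Type, and unpack the returned pairs into the two new columns with zip (a suggestion from here).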

import pandas as pd

df = pd.DataFrame([[1,5,'Dog'],[2,6,'Dog'],[3,7,'Cat'],[4,8,'Cat']],columns=['A','B','Type'])

def ext_fun(x1,x2,i):
    if i=='Dog':
        # Do some calculation to find the c and d values and return them
        return ['c','d']
    if i=='Cat':
        # Do some calculation to find the c and d values and return them
        return ['c','d']

# Call ext_fun once per row with that row's own Type, then transpose the
# per-row [c, d] results into one sequence per new column.
df['C'], df['D'] = zip(*df.apply(lambda x: ext_fun(x['A'],x['B'],x['Type']), axis=1))
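
With 10 to 15 types, the if chain inside ext_fun may become hard to maintain. One possible variation is sketched below under stated assumptions: dog_fun, cat_fun and HANDLERS are hypothetical names, and the arithmetic inside the two functions is a stand-in for the real calculations. It looks up each row's function in a dictionary and lets apply expand the returned pair into two columns via result_type='expand'.

```python
import pandas as pd

def dog_fun(a, b):
    # Hypothetical placeholder calculation for Dog rows.
    return a + b, a * b

def cat_fun(a, b):
    # Hypothetical placeholder calculation for Cat rows.
    return b - a, b * 2

# Map each Type value to its own function (assumed names, for illustration).
HANDLERS = {'Dog': dog_fun, 'Cat': cat_fun}

df = pd.DataFrame(
    [[1, 5, 'Dog'], [2, 6, 'Dog'], [3, 7, 'Cat'], [4, 8, 'Cat']],
    columns=['A', 'B', 'Type'],
)

# One pass over all rows: pick the function by the row's Type and let
# result_type='expand' turn each returned pair into two columns.
out = df.apply(
    lambda row: HANDLERS[row['Type']](row['A'], row['B']),
    axis=1,
    result_type='expand',
)
out.columns = ['C', 'D']

# `out` shares the index of `df` and has no overlapping column names,
# so join adds C and D without creating any suffixes.
df = df.join(out)
```

Adding a new type then only means writing its function and registering it in the dictionary; an unregistered Type value raises a KeyError rather than silently returning nothing.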
Nikolay Zakirov