
I have two dataframes, df1 and df2, shown below. Both have many more columns; I have reduced them here for readability.

        df1                   df2 
   id   A    B    C        ID   A        
    1   x    y    z         1   m1      
    2   x1   y1   z1        2   m2     
    3   x2   y2   z2

The requirement is to fill column A of df2 by calling a function, function1, for the rows where df1.id == df2.ID. The function takes the values of the three df1 columns A, B, and C for a row and returns the value for df2.A:

function1(x, y, z)       returns m1
function1(x1, y1, z1)    returns m2

I tried the following:

df2['A'] = df1.loc[df1['id'] == df2['ID'],function1(df1['A'],df1['B'],df1['C'])]

but it does not work, presumably because the function is not designed to take whole columns as input. Any suggestions?

tripathy
  • Your question is confusing. Can you please provide a sample of your expected output? This sounds like a merge or a join, or possibly `.apply()`, but it's not clear. – G. Anderson Apr 15 '20 at 17:39
  • Basically, I have to use the function to fill column A in df2 where df1.id == df2.ID; the function takes the values of the three df1 columns as input. – tripathy Apr 15 '20 at 17:44

2 Answers


Taking my best shot based on my understanding of your problem:

import pandas as pd

df1 = pd.DataFrame({'id': {0: 1, 1: 2, 2: 3},
 'A': {0: 'x', 1: 'x1', 2: 'x2'},
 'B': {0: 'y', 1: 'y1', 2: 'y2'},
 'C': {0: 'z', 1: 'z1', 2: 'z2'}})

df2 = pd.DataFrame({'ID': [1, 2]})

def my_func(a, b, c):  # in your case, function1
    if all('1' in i for i in (a, b, c)):
        return 'm2'
    else:
        return 'm1'

# Parentheses let the chained call span lines with inline comments.
# The assignment aligns on the row index, which matches here because
# both frames use the default index.
df2['A'] = (
    df1[df1['id'].isin(df2['ID'])]  # find where the id columns match
    .apply(lambda x: my_func(*x[1:]), axis=1)  # apply the function on the A, B, C columns of df1
)

df2

    ID  A
0   1   m1
1   2   m2
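
One caveat with the assignment above is that it lines rows up by index rather than by ID. The following is a minimal sketch of that pitfall, assuming a hypothetical df2 whose IDs are in a different order; `my_func` is the same stand-in as above, and `matched`, `result`, `by_id`, and `by_index` are names introduced only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "id": [1, 2, 3],
    "A": ["x", "x1", "x2"],
    "B": ["y", "y1", "y2"],
    "C": ["z", "z1", "z2"],
})
# Assumption for illustration: same IDs as before, but in a different order.
df2 = pd.DataFrame({"ID": [2, 1]})

def my_func(a, b, c):
    # Stand-in for function1, copied from the answer above.
    return "m2" if all("1" in v for v in (a, b, c)) else "m1"

# Rows of df1 whose id appears in df2, with the function applied row by row.
matched = df1[df1["id"].isin(df2["ID"])]
result = matched.apply(lambda row: my_func(row["A"], row["B"], row["C"]), axis=1)

# Plain assignment aligns on the row index, so the values land on the wrong IDs.
df2["by_index"] = result

# Keying the results by id and mapping df2['ID'] onto them follows the IDs instead.
by_id = pd.Series(result.to_numpy(), index=matched["id"])
df2["A"] = df2["ID"].map(by_id)
```

With the default index and IDs in the same order, as in the example data, both columns would agree; they differ only when the row order does.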
G. Anderson
  • What if df1 has more than three columns and the other columns do not matter? What would change in the `my_func(*x[1:])` part? – tripathy Apr 15 '20 at 18:12
  • Whatever the slice of columns is that your function uses, that would be the slice `x[2:]`, `x[1:4]` etc. Or you can subset the dataframe first, like `df1[['A','B','C']]` – G. Anderson Apr 15 '20 at 19:00
  • It's not working: the function does not accept the input and reports missing 2 required positional arguments. – tripathy Apr 15 '20 at 20:41
  • It seems like you need to [edit] your original post with an actual [mcve] showing accurate sample data of your input dataframe(s), the code (or approximate code) of your `function1`, and a sample of your expected output according to [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). It's very difficult to know how to help if we have to guess. – G. Anderson Apr 15 '20 at 20:45
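
Without the actual code it is only a guess, but one common way to get "missing 2 required positional arguments" is to pass the whole row as a single argument instead of unpacking it. A minimal sketch, assuming a hypothetical three-argument `function1` and an extra column `D` that the function does not need:

```python
import pandas as pd

def function1(a, b, c):
    # Hypothetical stand-in; the real function1 was never posted.
    return f"{a}-{b}-{c}"

df1 = pd.DataFrame({
    "id": [1, 2],
    "A": ["x", "x1"],
    "B": ["y", "y1"],
    "C": ["z", "z1"],
    "D": ["p", "q"],  # extra column the function should ignore
})

cols = ["A", "B", "C"]  # subset first, as suggested in the comment above

# Passing the row as one object gives the function a single argument.
try:
    df1[cols].apply(lambda row: function1(row), axis=1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Unpacking with * passes the three values as separate arguments.
results = df1[cols].apply(lambda row: function1(*row), axis=1)
```

Subsetting to exactly the columns the function expects also means the slice does not have to change when df1 gains more columns.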

I solved it. I created a new column in df1 by applying the function row by row:

df1["NewCol"] = df1.apply(lambda row: function1(row['A'], row['B'], row['C']), axis=1)

and then filled df2 by looking up each ID in df1:

df2['A'] = df2['ID'].map(df1.set_index('id')['NewCol'])
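
As the first comment on the question hinted, the second step can also be written as a merge. This is a sketch under assumptions, not the code actually used: `function1` here is a hypothetical stand-in, and `merged` is a name introduced for illustration.

```python
import pandas as pd

def function1(a, b, c):
    # Hypothetical stand-in for the real function1.
    return f"{a}{b}{c}"

df1 = pd.DataFrame({
    "id": [1, 2, 3],
    "A": ["x", "x1", "x2"],
    "B": ["y", "y1", "y2"],
    "C": ["z", "z1", "z2"],
})
df2 = pd.DataFrame({"ID": [1, 2]})

# Compute the function once per df1 row from the three columns it needs.
df1["NewCol"] = [
    function1(a, b, c) for a, b, c in zip(df1["A"], df1["B"], df1["C"])
]

# A left merge keeps every df2 row in its original order;
# validate raises if an id appears more than once in df1.
merged = df2.merge(
    df1[["id", "NewCol"]],
    left_on="ID",
    right_on="id",
    how="left",
    validate="many_to_one",
)
df2["A"] = merged["NewCol"].to_numpy()
```

IDs in df2 that have no match in df1 would end up as NaN in column A.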
tripathy