0

I have two pandas dataframes as below where col1, col2 and col3 is same in both the dataframes. I want to concatenate the two dataframes in such a way that the new dataframes has col4 and col5 added without duplicating the rows.

df1 = pd.DataFrame(data = {'col1' : [1, 2, 3],
                           'col2' : [10, 11, 12], 'col3' : [1, 1, 2], 'col4' : [100, 200, 300]})

df1

   col1  col2  col3  col4
0     1    10     1   100
1     2    11     1   200
2     3    12     2   300

df2 = pd.DataFrame(data = {'col1' : [1, 2, 3],
                           'col2' : [10, 11, 12], 'col3' : [1, 1, 2], 'col5' : [20, 40, 60]})

   col1  col2  col3  col5
0     1    10     1    20
1     2    11     1    40
2     3    12     2    60

My expected output dataframe:

   col1  col2  col3   col4  col5
0     1    10     1  100.0   20
1     2    11     1  200.0   40
2     3    12     2  300.0   60

I tried the below code, but it duplicates the rows as well as shown below.

merge = pd.concat([df1, df2], axis=0, ignore_index=True)

   col1  col2  col3   col4  col5
0     1    10     1  100.0   NaN
1     2    11     1  200.0   NaN
2     3    12     2  300.0   NaN
3     1    10     1    NaN  20.0
4     2    11     1    NaN  40.0
5     3    12     2    NaN  60.0
Shanoo
  • 1,185
  • 1
  • 11
  • 38

1 Answers1

1

Use DataFrame.merge:

df_merge=df1.merge(df2)
print(df_merge)

   col1  col2  col3  col4  col5
0     1    10     1   100    20
1     2    11     1   200    40
2     3    12     2   300    60
ansev
  • 30,322
  • 5
  • 17
  • 31