I am trying to merge two pandas DataFrames on the column "id" so that each "id" appears only once in the result, while at the same time keeping the values from columns "A" and "B" on the same row when both frames share that "id".

df1 =
   id  A
0  1   valA1
1  4   valA2
2  6   valA3
3  7   valA4
4  9   valA5

df2 =
   id  B
0  1   valB1
1  5   valB2
2  6   valB3
3  8   valB4

I would like to get the following after merging:

res =
   id  A      B
0  1   valA1  valB1
1  4   valA2  NaN
2  5   NaN    valB2  
3  6   valA3  valB3
4  7   valA4  NaN
5  8   NaN    valB4
6  9   valA5  NaN

I tried using something like:

partlist = pd.concat([df1.set_index('id'), df2.set_index('id')], axis=0).reset_index()

But the frames don't merge; their rows are just appended to each other. How can I accomplish this using "pd.merge"?
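
For reference, here is a minimal sketch of how the desired result could probably be obtained with an outer merge on "id". The frames are rebuilt from the tables above; the explicit sort_values and reset_index calls are my own addition so that the row order matches the expected output, and they may be unnecessary depending on the pandas version.

```python
import pandas as pd

# Rebuild the two example frames from the tables in the question.
df1 = pd.DataFrame({"id": [1, 4, 6, 7, 9],
                    "A": ["valA1", "valA2", "valA3", "valA4", "valA5"]})
df2 = pd.DataFrame({"id": [1, 5, 6, 8],
                    "B": ["valB1", "valB2", "valB3", "valB4"]})

# how="outer" keeps every id found in either frame; ids present in only
# one frame get NaN in the column that comes from the other frame.
res = pd.merge(df1, df2, on="id", how="outer")

# Sort by id and renumber the rows to match the layout asked for.
res = res.sort_values("id").reset_index(drop=True)

print(res)
```

The concat call with axis=0 stacks the rows of the two frames on top of each other, which is why a shared id such as 1 or 6 still shows up twice there, whereas the merge aligns rows on the "id" key.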

Frederik Petri