Merging two dataframes while skipping duplicates in one column but keeping them in another

Asked Mar 13 '22 at 08:11

Active Mar 13 '22 at 08:25

I am trying to merge two pandas DataFrames so that each "id" appears only once in the result, while at the same time keeping the values from columns "A" and "B" on the same row when both frames share that "id".

df1 =
   id  A
0  1   valA1
1  4   valA2
2  6   valA3
3  7   valA4
4  9   valA5

df2 =
   id  B
0  1   valB1
1  5   valB2
2  6   valB3
3  8   valB4

I would like to get the following after merging:

res =
   id  A      B
0  1   valA1  valB1
1  4   valA2  NaN
2  5   NaN    valB2  
3  6   valA3  valB3
4  7   valA4  NaN
5  8   NaN    valB4
6  9   valA5  NaN

I tried using something like:

partlist = pd.concat([df1.set_index('id'), df2.set_index('id')], axis=0).reset_index()

But the frames don't merge; the rows of one are just appended to the other. How can I accomplish this using "pd.merge"?
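For reference, here is a small reproduction of the attempt above, as a sketch that assumes the two frames are built exactly as printed in the question. With `axis=0`, `pd.concat` stacks the rows, so ids 1 and 6 show up twice; with `axis=1` it aligns on the "id" index instead, which is closer to the desired result.

```python
import pandas as pd

# Frames reconstructed from the tables in the question.
df1 = pd.DataFrame({"id": [1, 4, 6, 7, 9],
                    "A": ["valA1", "valA2", "valA3", "valA4", "valA5"]})
df2 = pd.DataFrame({"id": [1, 5, 6, 8],
                    "B": ["valB1", "valB2", "valB3", "valB4"]})

# axis=0 appends df2's rows below df1's rows: 5 + 4 = 9 rows, with ids 1 and 6 repeated.
stacked = pd.concat([df1.set_index("id"), df2.set_index("id")], axis=0).reset_index()

# axis=1 places the frames side by side, aligned on the shared "id" index.
aligned = (
    pd.concat([df1.set_index("id"), df2.set_index("id")], axis=1)
    .sort_index()
    .rename_axis("id")   # make sure the index keeps its name before resetting
    .reset_index()
)

print(stacked)
print(aligned)
```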

edited Mar 13 '22 at 08:16

asked Mar 13 '22 at 08:11

Frederik Petri

Have you tried `df1.merge(df2, on='id', how='outer')` ? – Jon Clements Mar 13 '22 at 08:17
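A minimal sketch of the outer merge suggested in the comment above, assuming the frames are built exactly as shown in the question. `how='outer'` keeps every "id" found in either frame and fills the missing side with NaN. The explicit `sort_values` and `reset_index` are an addition here, so that the row order does not depend on the pandas version.

```python
import pandas as pd

# Frames reconstructed from the tables in the question.
df1 = pd.DataFrame({"id": [1, 4, 6, 7, 9],
                    "A": ["valA1", "valA2", "valA3", "valA4", "valA5"]})
df2 = pd.DataFrame({"id": [1, 5, 6, 8],
                    "B": ["valB1", "valB2", "valB3", "valB4"]})

# Outer join on "id": one row per distinct id, NaN where a frame has no match.
res = (
    df1.merge(df2, on="id", how="outer")
    .sort_values("id")
    .reset_index(drop=True)
)

print(res)
```

If it matters which frame each row came from, `merge` also accepts `indicator=True`, which adds a `_merge` column with the values `left_only`, `right_only`, or `both`.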
