0

Let's say df1 looks like:

id  x
a   1
b   2
b   3
c   4

and df2 looks like:

id  y
b   9
b   8

How do I merge them so that the output is:

id  x  y
b   2  9
b   3  8

I've tried pd.merge(df1, df2, on='id') but it is giving me:

id  x   y
b   2   9
b   2   8
b   3   9
b   3   8

which is not what I want.

Chiliboo
  • 57
  • 1
  • 4

1 Answers1

2

IIUC, GroupBy.cumcount + merge

new_df = (df1.assign(count=df1.groupby('id').cumcount())
             .merge(df2.assign(count=df2.groupby('id').cumcount()),
                     on=['id', 'count'], how='inner')
             .drop(columns='count'))

  id  x  y
0  b  2  9
1  b  3  8
ansev
  • 30,322
  • 5
  • 17
  • 31