I have 2 DataFrames:

import pandas as pd

df1 = pd.DataFrame({'code': ['11', '12', '13', '14'],
                    'name': ['a', 'a', 'b', 'c']})

df2 = pd.DataFrame({'code': ['15', '16', '17', '18', '19', '20'],
                    'name': ['a',   'a', 'b',  'c',  'c',  'c']})

I need to build a matrix containing every pair of codes, one from each DataFrame, that share the same name. The matrix should look like this:

pairs  value from df1     value from df2
a-a       11                15
a-a       11                16
a-a       12                15
a-a       12                16
b-b       13                17
c-c       14                18
c-c       14                19
c-c       14                20

I'd appreciate any help with this.

PasDeSence

1 Answer

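Use DataFrame.merge on name to pair every row of df1 with every row of df2 that has the same name (the default inner join drops names found in only one DataFrame), then DataFrame.insert to put the pairs label in the first column: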

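# Join on 'name'; the suffixes rename the two overlapping 'code' columns.
df = df1.merge(df2, on='name', suffixes=(' from df1',' from df2'))
# pop() removes 'name' and returns it, so it is used to build 'a-a' and dropped in one step.
df.insert(0, 'pairs', df['name'] + '-' + df.pop('name'))
print(df)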
  pairs code from df1 code from df2
0   a-a            11            15
1   a-a            11            16
2   a-a            12            15
3   a-a            12            16
4   b-b            13            17
5   c-c            14            18
6   c-c            14            19
7   c-c            14            20
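
If the column headers need to match the question exactly ("value from df1" and "value from df2"), one option is to rename code before merging instead of relying on suffixes. This is only a sketch of that variation: the names left, right and out are made up for illustration, and how='inner' is spelled out even though it is already the default.

```python
import pandas as pd

df1 = pd.DataFrame({'code': ['11', '12', '13', '14'],
                    'name': ['a', 'a', 'b', 'c']})
df2 = pd.DataFrame({'code': ['15', '16', '17', '18', '19', '20'],
                    'name': ['a', 'a', 'b', 'c', 'c', 'c']})

# Rename 'code' up front so no suffixes are needed after the merge.
# (left / right / out are illustrative names, not from the original answer.)
left = df1.rename(columns={'code': 'value from df1'})
right = df2.rename(columns={'code': 'value from df2'})

# Inner join on 'name': each df1 row is paired with every df2 row of the same name.
out = left.merge(right, on='name', how='inner')

# Build the 'a-a' style label as the first column, then drop the shared 'name' column.
out.insert(0, 'pairs', out['name'] + '-' + out['name'])
out = out.drop(columns='name')
print(out)
```

Because the join is many-to-many, the number of rows per name is the product of its counts in the two DataFrames (2 x 2 for a, 1 x 1 for b, 1 x 3 for c), which gives the 8 rows shown above.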
jezrael