0

I'm running into a wall in terms of how to do this with Pandas. Given a dataframe (df1) with an ID column, and a separate dataframe (df2), how can I combine the two to make a third dataframe that preserves the ID column with all the possible combinations it could have?

df1

ID   name.x
1    a
2    b
3    c

df2

name.y
l
m

dataframe creation:

df1 = pd.DataFrame({'ID':[1,2,3],'name.x':['a','b','c']})
df2 = pd.DataFrame({'name.y':['l','m']})

combined df

ID   name.x name.y
1    a      l 
1    a      m    
2    b      l
2    b      m
3    c      l 
3    c      m  
Jacob Kearney
  • 391
  • 11
Benjamin
  • 683
  • 1
  • 8
  • 22

1 Answers1

1

create a col on each that is the same, do a full outer join, then keep the cols you want:

df1 = pd.DataFrame({'ID':[1,2,3],'name.x':['a','b','c']})
df2 = pd.DataFrame({'name.y':['l','m']})

df1['join_col'] = True
df2['join_col'] = True

df3 = pd.merge(df1,df2, how='outer',on = 'join_col')

print(df3[['ID','name.x','name.y']])

will output:

   ID name.x name.y
0   1      a      l
1   1      a      m
2   2      b      l
3   2      b      m
4   3      c      l
5   3      c      m
Jacob Kearney
  • 391
  • 11