
I have two dataframes that I'd like to denormalize into one. For example, given:

>>> import pandas as pd
>>> data1 = [['Akbar', 42], ['Akbar', 99], ['Jeff', 1]]
>>> df1 = pd.DataFrame(data1, columns=["who", "number"])
>>> data2 = [['Akbar', "SF"], ["Jeff", "NYC"]]
>>> df2 = pd.DataFrame(data2, columns=["who", "where"])
>>> df1
     who  number
0  Akbar      42
1  Akbar      99
2   Jeff       1
>>> df2
     who where
0  Akbar    SF
1   Jeff   NYC

I want to end up with each row of df1 carrying the matching "where" value from df2:

>>> df1
     who  number  where
0  Akbar      42     SF
1  Akbar      99     SF
2   Jeff       1    NYC

df1 has ~500k records and df2 has ~20k records, so I'm looking for an efficient approach.
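
For reference, here is a minimal sketch of two approaches that seem like they should produce the result above. It assumes "who" is unique in df2 (true for the sample data, not verified for the real 20k rows); the names `merged`, `lookup` and `mapped` are just illustrative. I haven't benchmarked either at the 500k-row scale.

```python
import pandas as pd

df1 = pd.DataFrame([["Akbar", 42], ["Akbar", 99], ["Jeff", 1]],
                   columns=["who", "number"])
df2 = pd.DataFrame([["Akbar", "SF"], ["Jeff", "NYC"]],
                   columns=["who", "where"])

# Option 1: left join on "who". how="left" keeps every row of df1;
# validate="many_to_one" raises if "who" is duplicated in df2 (assumption check).
merged = df1.merge(df2, on="who", how="left", validate="many_to_one")

# Option 2: build a who -> where lookup Series and map it onto df1's key column.
lookup = df2.set_index("who")["where"]
mapped = df1.copy()
mapped["where"] = mapped["who"].map(lookup)

print(merged)
```

In both cases, a "who" value in df1 that has no match in df2 would end up with a missing value in "where" rather than being dropped.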

Any help greatly appreciated!

Kylo