I have two dataframes that I'd like to denormalize into one, e.g., given:
>>> import pandas as pd
>>> data1 = [["Akbar", 42], ["Akbar", 99], ["Jeff", 1]]
>>> df1 = pd.DataFrame(data1, columns=["who", "number"])
>>> data2 = [["Akbar", "SF"], ["Jeff", "NYC"]]
>>> df2 = pd.DataFrame(data2, columns=["who", "where"])
>>> df1
     who  number
0  Akbar      42
1  Akbar      99
2   Jeff       1
>>> df2
     who where
0  Akbar    SF
1   Jeff   NYC
I want to end up with:
>>> df1
     who  number where
0  Akbar      42    SF
1  Akbar      99    SF
2   Jeff       1   NYC
df1 has ~500k records and df2 has ~20k records, so I'm looking for an efficient approach.
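For reference, here is a minimal sketch of the result I'm describing, written as a left join on `who`. It assumes each `who` appears at most once in df2 (true in the sample, not verified for the real data), and the names `merged`, `lookup`, and `mapped` are just illustrative. I have not benchmarked either variant at the sizes above.

```python
import pandas as pd

df1 = pd.DataFrame([["Akbar", 42], ["Akbar", 99], ["Jeff", 1]],
                   columns=["who", "number"])
df2 = pd.DataFrame([["Akbar", "SF"], ["Jeff", "NYC"]],
                   columns=["who", "where"])

# Left join: keeps every row of df1, in df1's order, and attaches the
# matching "where" from df2 (NaN where df2 has no matching "who").
# validate="many_to_one" makes pandas raise if "who" is duplicated in df2
# (assumption: df2 is meant to hold one row per person).
merged = df1.merge(df2, on="who", how="left", validate="many_to_one")

# Alternative sketch for a single looked-up column: build a Series indexed
# by "who" and map df1's keys through it. Also assumes unique keys in df2.
lookup = df2.set_index("who")["where"]
mapped = df1.assign(where=df1["who"].map(lookup))

print(merged)
```

Both variants leave df1 itself unchanged, so the result would need to be assigned back (e.g. `df1 = merged`) to match the output shown above.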
Any help greatly appreciated!