0

i have 2 dataframes with 10k records. both have a column ['TITLE ID'] in simillar in both, i want to merge both based on Title column without losing any information from both dataframes

Paul H
  • 65,268
  • 20
  • 159
  • 136
  • Show us sample of the dataframes, and give us MRE with `df1.head(10).to_dict()` and same on other. – Wasif Nov 30 '20 at 16:53

2 Answers2

0

Use pandas.merge

import pandas as pd

df1 = pd.DataFrame({'A':[1,2,4],
                   'B':[2,5,2],
                    'TitleID':[1,2,3]})
df2 = pd.DataFrame({'C':[7,3,6],
                    'D':[2,6,3],
                    'TitleID':[3,2,1]})
df = pd.merge(df1, df2, left_on='TitleID', right_on='TitleID')
print(df)

Output:

   A  B  TitleID  C  D
0  1  2        1  6  3
1  2  5        2  3  6
2  4  2        3  7  2

For more extensive explanation on merging data frames, look into this: https://stackoverflow.com/a/53645883/7623492

Mark
  • 532
  • 2
  • 6
0

This will get all the data from both dataframes. You can replace NaN using df.fillna(0)

Code:

import pandas as pd

df1 = pd.DataFrame({'A':[1,2,4,6],
                   'B':[2,5,2,8],
                    'TitleID':[1,2,3,7]})
df2 = pd.DataFrame({'C':[7,3,6,5],
                    'D':[2,6,3,7],
                    'TitleID':[3,2,1,4]})
df = pd.merge(df1, df2, left_on='TitleID', right_on='TitleID', how='outer')
print(df)

Output:

     A    B  TitleID    C    D
0  1.0  2.0        1  6.0  3.0
1  2.0  5.0        2  3.0  6.0
2  4.0  2.0        3  7.0  2.0
3  6.0  8.0        7  NaN  NaN
4  NaN  NaN        4  5.0  7.0
Aaj Kaal
  • 1,205
  • 1
  • 9
  • 8