0

I have 3 pandas dataframe, each having different number of rows and some similar columns, I need to merge all with all data

mydata = [0]*3
dataA = {'First':  [500],'Second': ['Sone']}
mydata[0] = pd.DataFrame(dataA,columns=['First','Second'])
dataB = {'First':  [500,500],'Third': [0.5,0.6]}
mydata[1] = pd.DataFrame(dataB,columns=['First','Third'])
dataC = {'First':  [500,500,500],'Fourth': ['Fone', 'Ftwo','Fthree'],'Fifth': [23, 24, 25]}
mydata[2] = pd.DataFrame(dataC,columns=['First','Fourth','Fifth'])

Merged data looks like

merge_data = {'First':  [500,500,500,500,500,500],'Second': ['Sone','Sone','Sone','Sone','Sone','Sone'],'Third': [0.5,0.6,0.5,0.6,0.5,0.6],'Fourth': ['Fone', 'Fone', 'Ftwo', 'Ftwo', 'Fthree','Fthree'],'Fifth': [23, 23, 24, 24, 25, 25]}
merge_df = pd.DataFrame(merge_data,columns=['First','Second','Third','Fourth','Fifth'])

data append produces Nan rows

merge_data = mydata[0].copy()
for i in np.arange(1, len(mydata)):
    merge_data = merge_data.append(mydata[i], sort=False)

and merge losing the rows

merge_data = pd.merge(mydata[0], mydata[1], left_index=True, right_index=True) 

Is it possible to combine as merged_df

hanzgs
  • 1,498
  • 17
  • 44
  • 1
    Use `from functools import reduce merged_df = reduce(lambda left,right: pd.merge(left,right,on=['First'], how='outer'), mydata) print (merged_df)` – jezrael May 06 '20 at 07:19

1 Answers1

1

You must merge on the 'First' column:

pd.merge(mydata[0], mydata[1], on='First').merge(mydata[2], on='First')

to get:

   First Second  Third  Fourth  Fifth
0    500   Sone    0.5    Fone     23
1    500   Sone    0.5    Ftwo     24
2    500   Sone    0.5  Fthree     25
3    500   Sone    0.6    Fone     23
4    500   Sone    0.6    Ftwo     24
5    500   Sone    0.6  Fthree     25

Simply the Fourth and Fifth columns are still aligned here while there are not in your merge_df dataframe...

Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252
  • 1
    Or check dupe for `from functools import reduce merged_df = reduce(lambda left,right: pd.merge(left,right,on=['First'], how='outer'), mydata) print (merged_df)` – jezrael May 06 '20 at 07:19
  • These both are awesome solutions, Great, Thanks – hanzgs May 06 '20 at 07:21