1

I want to combine 2 similar dataframes. I I have checked several websites but couldn't find an answer to my question.

df1 = DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                 'B': ['B0', 'B1', 'B2', 'B3'],
                 'C': ['C0', 'C1', 'C2', 'C3'],
               index=[0, 1, 2])   
df2 = DataFrame({'A': ['A0', 'A1', 'A4', 'A3'],
                 'B': ['B0', 'B1', 'B4', 'B3'],
                 'D': ['D0', 'D1', 'D4', 'D3']},
               index=[0, 1, 2])

I want to have

    df3 = DataFrame({'A': ['A0', 'A1', 'A3'],
                     'B': ['B0', 'B1', 'B3'],
                     'C': ['C0', 'C1', 'C3'],
                     'D': ['D0', 'D1', 'D3'].
                     index=[0, 1, 2, 3])   

Essentially I combine 2 dataframes, adding column D to the first dataframe. But I omit any rows that won't have values for both C and D, like row 2 and 4. I have tried append and concat but it just gives me all the columns and all the rows stacked on top of each other.

Thanks!

AK9309
  • 761
  • 3
  • 13
  • 33

1 Answers1

1

Just do a default merge this will perform an inner join on common columns:

In [80]:

df1.merge(df2)
Out[80]:
    A   B   C   D
0  A0  B0  C0  D0
1  A1  B1  C1  D1
2  A3  B3  C3  D3
EdChum
  • 376,765
  • 198
  • 813
  • 562
  • Thank you. That's exactly what I needed. One more question: when I input actual column names versus ABCD the columns in the output data frame is always in the alphabetical order, no matter I what order I input them. So a column named apple will always be before column banana even if I input apple after banana – AK9309 Sep 01 '15 at 15:14
  • If you're asking how to reorder the columns you can pass a list of the desired order: `df1.merge(df2).ix[:, ['A','D','B','C']]` will reorder the columns, generically doing `df.ix[:, col_list]` where `col_list` is a list of columns will reorder the column order – EdChum Sep 01 '15 at 15:18