I have two dataframes that I want to merge into a single dataframe based on matching rows. My dataframes look like this:

df_1

Set_1   Fax_1   Fax_2
Abc_1   45  76
Abc_2   46  77
Abc_3   47  78
Abc_4   48  79
Abc_5   49  80
Abc_6   50  81
Abc_7   51  82
Abc_8   52  83
Abc_9   53  84
Abc_10  54  85

df_2

Set_1   Fax_3   Fax_4
Abc_1   69  42
Abc_2   70  43
Abc_3   71  44
Abc_6   72  45
Abc_5   73  46
Abc_6   74  47
Abc_7   75  48
Abc_8   76  49
Abc_9   77  50
Abc_10  78  51
Abc_11  55  86
Abc_12  56  87
Abc_13  57  88
Abc_14  58  89
Abc_15  59  90
Abc_16  60  91

The second one is the bigger dataframe. What I need in my output file is:

Set_1   Fax_1   Fax_2   Fax_3   Fax_4
Abc_1   45  76  69  42
Abc_2   46  77  70  43
Abc_3   47  78  71  44
Abc_4   48  79  72  45
Abc_5   49  80  73  46
Abc_6   50  81  74  47
Abc_7   51  82  75  48
Abc_8   52  83  76  49
Abc_9   53  84  77  50
Abc_10  54  85  78  51

This is what I tried with merge:

merged = df.merge(df_annon, on='Set_1')
merged.head()

But it only gives me the header as output. Any help or guidance is much appreciated.

ARJ
  • `pd.merge` takes two keyword arguments `left_index=True` and `right_index=True`, this merges on the index, is this what you want perhaps? – firelynx Aug 05 '15 at 16:23
  • Thank you, that worked: merged = df.merge(df_annon, how='inner', left_index=True, right_index=True, on='Set_1') – ARJ Aug 05 '15 at 16:27
  • You can also use `join`, which is basically the same as `merge` but automatically uses the index. – JohnE Aug 05 '15 at 16:59
  • `df_2` has two values for `Abc_6`. How do you want to choose between them? You could either have `72 45` or `74 47`. The example you gave gives the latter but why? – LondonRob Aug 05 '15 at 17:40
  • Also, your example DataFrames don't have `Set_1` as the index, and if you run your example `merge` statement, you raise an exception, so something's odd.... – LondonRob Aug 05 '15 at 17:44
  • possible duplicate of [pandas joining multiple dataframes on columns](http://stackoverflow.com/questions/23668427/pandas-joining-multiple-dataframes-on-columns) – LondonRob Aug 05 '15 at 17:44
  • @LondonRob At least half of the pandas questions are duplicates nowadays. ;-) – JohnE Aug 05 '15 at 17:47
  • This is a dummy dataframe I made quickly to post the question. Sorry about that. – ARJ Aug 06 '15 at 11:06
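
The duplicate-key point raised in the comments can be illustrated with a small sketch. The data is a hypothetical cut-down version of the question's tables, and keeping the last occurrence of each key is only an assumption chosen because it reproduces the `74` row shown in the desired output; the question does not say which row should win.

```python
import pandas as pd

# Hypothetical cut-down data: Abc_6 appears twice in df_2, as in the question.
df_1 = pd.DataFrame({"Set_1": ["Abc_5", "Abc_6"], "Fax_1": [49, 50]})
df_2 = pd.DataFrame({
    "Set_1": ["Abc_6", "Abc_5", "Abc_6"],
    "Fax_3": [72, 73, 74],
})

# A plain column merge keeps every matching pair, so Abc_6 shows up twice.
with_dupes = df_1.merge(df_2, on="Set_1", how="inner")

# Assumption: keep only the last row seen for each key before merging.
deduped = df_2.drop_duplicates(subset="Set_1", keep="last")
one_per_key = df_1.merge(deduped, on="Set_1", how="inner")

print(with_dupes)
print(one_per_key)
```

Passing `keep="first"` instead would pick the `72` row, so the choice has to come from what the data means.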

2 Answers

Try this

merged = df.merge(df_annon, left_index=True, right_index=True, how='inner')
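
One caveat, sketched below: an index-based merge only matches on `Set_1` if `Set_1` is actually the index. The example dataframes in the question have it as an ordinary column, so with the default integer index the rows are paired by position instead. The data here is a hypothetical cut-down version of the question's tables; `df` and `df_annon` are the names used in the question.

```python
import pandas as pd

# Hypothetical cut-down versions of the question's dataframes.
df = pd.DataFrame({
    "Set_1": ["Abc_1", "Abc_2", "Abc_3"],
    "Fax_1": [45, 46, 47],
    "Fax_2": [76, 77, 78],
})
df_annon = pd.DataFrame({
    "Set_1": ["Abc_3", "Abc_1", "Abc_11"],
    "Fax_3": [71, 69, 55],
    "Fax_4": [44, 42, 86],
})

# With the default integer index, rows are paired by position and the
# overlapping Set_1 column is duplicated with _x / _y suffixes.
by_position = df.merge(df_annon, left_index=True, right_index=True, how="inner")

# Making Set_1 the index first lets the same call match on the labels.
left = df.set_index("Set_1")
right = df_annon.set_index("Set_1")
merged = left.merge(right, left_index=True, right_index=True, how="inner")

print(by_position)
print(merged)
```

In this sketch only `Abc_1` and `Abc_3` survive the label-based merge, because they are the only keys present in both frames.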
Pawel Gradecki
Tanmoy

I guess I wrote the answer in a comment already, but let me elaborate.

The pandas merge function takes the keyword arguments left_index= and right_index=. When they are set to True, merge uses the index (or indexes) of the dataframes as the join keys.

Like this:

merged = pd.merge(left=df, left_index=True,
                  right=df_annon, right_index=True,
                  how='inner')
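
As a rough sketch of how this relates to the `join` suggestion from the comments, the two calls below should give the same result when both frames are indexed by the key. The tiny frames are hypothetical and already have `Set_1` as their index, which is an assumption about how the real data is set up.

```python
import pandas as pd

# Hypothetical frames that already use Set_1 as the index.
left = pd.DataFrame(
    {"Fax_1": [45, 46]},
    index=pd.Index(["Abc_1", "Abc_2"], name="Set_1"),
)
right = pd.DataFrame(
    {"Fax_3": [69, 55]},
    index=pd.Index(["Abc_1", "Abc_11"], name="Set_1"),
)

# Index-on-index merge, as in the answer above.
via_merge = pd.merge(left=left, right=right,
                     left_index=True, right_index=True,
                     how="inner")

# join uses the index by default; how="inner" keeps only shared keys
# (its own default is how="left").
via_join = left.join(right, how="inner")

print(via_merge)
print(via_join)
```

Note that `join` defaults to a left join, so `how="inner"` has to be passed explicitly to drop keys missing from the right-hand frame.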
firelynx
  • Sorry, but with this solution it prints all the columns, not based on the matching ones. – ARJ Aug 07 '15 at 11:13