5
In [88]: c
Out[88]: 
                       Address    Name
CustomerID                            
10            Address for Mike    Mike
11          Address for Marcia  Marcia

In [89]: c.index
Out[89]: Int64Index([10, 11], dtype='int64', name='CustomerID')

In [90]: orders
Out[90]: 
   CustomerID   OrderDate
0          10  2014-12-01
1          11  2014-12-01
2          10  2014-12-01

In [91]: orders.index
Out[91]: RangeIndex(start=0, stop=3, step=1)

In [92]: c.merge(orders)
---------------------------
MergeError: No common columns to perform merge on

So panda can't merge if index column in one dataframe has the same name as another column in a second dataframe?

DmitrySemenov
  • 9,204
  • 15
  • 76
  • 121

3 Answers3

12

You need to explicitly specify how to join the table. By default, merge will choose common column name as merge key. For your case,

c.merge(orders, left_index=True, right_on='CustomID')

Also, read the docs of pandas.DataFrame.merge please. Hope this would be helpful.

rojeeer
  • 1,991
  • 1
  • 11
  • 13
3

The join method does a left join by default (how='left') and joins on the indices of the dataframes. So set the index of the orders dataframe to CustomerId and then join.

# Create sample data.
orders = pd.DataFrame(
    {'CustomerID': [10, 11, 10],
     'OrderDate': ['2014-12-01', '2014-12-01', '2014-12-01']})    
c = pd.DataFrame(
    {'Address': ['Address for Mike', 'Address for Marcia'], 
     'Name': ['Mike', 'Marcia']},
    index=pd.Index([10, 11], dtype='int64', name='CustomerID'))

# Join.
>>> c.join(orders.set_index('CustomerID'))
                       Address    Name   OrderDate
CustomerID                                        
10            Address for Mike    Mike  2014-12-01
10            Address for Mike    Mike  2014-12-01
11          Address for Marcia  Marcia  2014-12-01

Alternatively, this merge will give you the same result. Here, you are joining on the index of c (the left dataframe) and on the CustomerID column in the right dataframe. Ensure to specify how='left' to only join items from the right dataframe to all of the records on the left (leaving an equivalent number of rows matching the length of c). The default behavior for merge is an inner join, wherebe the result only includes those records from c that find a match in orders (although this could be your desired result).

c.merge(orders, left_index=True, right_on='CustomerID', how='left')
Alexander
  • 105,104
  • 32
  • 201
  • 196
1

Try resetting the index:

c.reset_index().merge(orders)
Kyle
  • 2,814
  • 2
  • 17
  • 30