Possible duplicate: pandas - Merging on string columns not working (bug?)

I get a different error message on each of two pandas versions when I try to join the following two dataframes. Does anyone know what's going on? Removing the E from the values makes the error go away.

newvi = pd.DataFrame({'a':['E1000305', 'E1000505', 'E1001071', 'E99836', 'E1003822']})
newsm = pd.DataFrame({'a':['E1000305', 'E1000305', 'E1000305', 'E1000305', 'E1000305']})

newvi.a
#0    E1000305
#1    E1000505
#2    E1001071
#3      E99836
#4    E1003822
#Name: a, dtype: object
newsm.a
#0    E1000305
#1    E1000305
#2    E1000305
#3    E1000305
#4    E1000305
#Name: a, dtype: object

newvi.join(newsm, on='a')

#my computer
#ValueError: columns overlap but no suffix specified: Index([u'a'], dtype='object')

>>> pd.__version__
u'0.18.1'

#server

#ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

>>> pd.__version__
'0.24.2'
Sam Weisenthal

1 Answer

newvi.join(newsm.set_index('a'), on='a')

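Setting the index of newsm to 'a' seems to resolve your error. With on='a', join matches the 'a' column of newvi against the index of the other frame, and newsm still has its default integer index, which would explain the "object and int64" message on 0.24.2. Does the result match your expectation?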
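For reference, here is a minimal sketch of the two approaches I would try, rebuilt from the frames in your question. The merge variant is my own suggestion rather than something from your post, and I am assuming a left join is what you want, since that is join's default.

```python
import pandas as pd

newvi = pd.DataFrame({'a': ['E1000305', 'E1000505', 'E1001071', 'E99836', 'E1003822']})
newsm = pd.DataFrame({'a': ['E1000305'] * 5})

# join(on='a') compares newvi['a'] with the *index* of the other frame.
# newsm still has its default integer index at this point.
print(newsm.index.dtype)  # int64

# Option 1: move 'a' into the index of the right-hand frame, then join.
joined = newvi.join(newsm.set_index('a'), on='a')

# Option 2 (assumed alternative): merge column-to-column.
# how='left' mirrors the default behaviour of join.
merged = newvi.merge(newsm, on='a', how='left')

print(len(joined), len(merged))
```

Note that 'E1000305' appears five times in newsm, so the matching row of newvi is repeated once per match in both results. If you only want to know which rows of newvi have a match, dropping the duplicates from newsm first may be closer to what you expect.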

PacketLoss