10

I have two DataFrame objects whose columns are as follows:

Dataframe 1:

df.dtypes

Output:

ImageID       object
Source        object
LabelName     object
Confidence     int64
dtype: object

Dataframe 2:

a.dtypes

Output:

LabelName       object
ReadableName    object
dtype: object

I am trying to combine these two DataFrames as follows:

combined =  df.join(a,on='LabelName')

But I am getting the following error:

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

However, I am joining them on LabelName, which contains only strings (object dtype) in both DataFrames.

Am I missing something here?

rawwar

3 Answers

24

About the on parameter, the documentation says:

Column or index level name(s) in the caller to join on the index in other, otherwise joins index-on-index.

Note that join() always uses other.index. You can try this:

df.join(a.set_index('LabelName'), on='LabelName')

Or use df.merge() instead.
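As a minimal sketch of both options, assuming made-up sample rows that mirror the dtypes in the question: the default index of `a` is an integer range, which is why joining the string column LabelName against it fails. Moving LabelName into the index, or using merge(), matches strings to strings.

```python
import pandas as pd

# Hypothetical sample data with the same columns and dtypes as in the question.
df = pd.DataFrame({
    "ImageID": ["img1", "img2", "img3"],
    "Source": ["verification", "verification", "crowdsource"],
    "LabelName": ["/m/01", "/m/02", "/m/01"],
    "Confidence": [1, 0, 1],
})
a = pd.DataFrame({
    "LabelName": ["/m/01", "/m/02"],
    "ReadableName": ["Cat", "Dog"],
})

# The default index of `a` is an integer range, not the LabelName strings.
print(a.index.dtype)  # int64

# join() compares df['LabelName'] with a.index, so the dtypes do not match.
try:
    df.join(a, on="LabelName")
except ValueError as exc:
    print("join failed:", exc)

# Option 1: make LabelName the index of `a`, then join on it.
joined = df.join(a.set_index("LabelName"), on="LabelName")

# Option 2: merge() matches the LabelName columns directly.
merged = df.merge(a, on="LabelName", how="left")

print(joined)
print(merged)
```

Both results keep the three rows of `df` and add a ReadableName column looked up by LabelName.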

John Zwinck
2

The problem is that DataFrame 1 contains an integer column alongside its string columns, while all columns in DataFrame 2 are strings.

The simplest solution is to cast all columns to strings:

pd.merge(df1.astype(str),df2.astype(str), how='outer')

Alternatively, as the ValueError itself suggests, use concat:

pd.concat([df1, df2])
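For comparison, here is a small sketch of what concat does with frames shaped like those in the question (the sample rows are made up for illustration). It stacks the two frames on top of each other and fills the non-shared columns with NaN, so it may not be what is wanted if the goal is to attach ReadableName to the matching rows.

```python
import pandas as pd

# Hypothetical sample data with the same columns as in the question.
df = pd.DataFrame({
    "ImageID": ["img1", "img2", "img3"],
    "Source": ["verification", "verification", "crowdsource"],
    "LabelName": ["/m/01", "/m/02", "/m/01"],
    "Confidence": [1, 0, 1],
})
a = pd.DataFrame({
    "LabelName": ["/m/01", "/m/02"],
    "ReadableName": ["Cat", "Dog"],
})

# concat appends the rows of `a` below the rows of `df`;
# it does not match rows on LabelName.
stacked = pd.concat([df, a])

print(stacked.shape)  # (5, 5)
print(stacked)
```

The rows that came from `df` have NaN in ReadableName, and the rows that came from `a` have NaN in ImageID, Source, and Confidence.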
Karn Kumar
0

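Try converting the Confidence column to strings first, in case a dtype mismatch is the cause. Assign the result back, because the conversion returns a new Series rather than changing the column in place:

df['Confidence'] = df['Confidence'].astype(str)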