0

While merging 2 DataFrames with following command:

df = pd.merge(df,LFA1, left_on='Vendor', right_index=True, how='left')

... and got the infamous message: "You are trying to merge on object and int64 columns.". It appears that the most probable cause is that the left or right key type is 'int'.

I found two methods to force the types of the keys to str: a. when creating the DataFrame:

LFA1= pd.read_excel(r'G:\FAIA 2018\Extracts\LFA1(Full).xlsx',converters={'Vendor':str})

b. after the creation of the DataFrame:

LFA1['Vendor']=LFA1['Vendor'].astype(str)

Even if I ask for the type right after the command:

df.types()
LFA1.types()

the column 'Vendor' stays 'object', but never shows 'str'.

I believe that this is the root cause behind the message, but none of those methods succeeded apparently in switching the type to string.

I either misunderstood or missed a step...

JCF
  • 307
  • 2
  • 17
  • Possible duplicate of [Strings in a DataFrame, but dtype is object](https://stackoverflow.com/questions/21018654/strings-in-a-dataframe-but-dtype-is-object) – Peter Leimbigler Oct 18 '18 at 18:34

1 Answers1

1

pandas stores strings in object fields, so that's not the problem. The issue is that in the merge, you're specifying left_on='Vendor', right_index=True, which means that the merge is trying to match df['Vendor'] with LFA1.index (which is likely just the row autonumbers).

What you want is either left_on='Vendor', right_on='Vendor' or just the shorter on='Vendor' since it looks like the field is named the same on both frames.

Randy
  • 14,349
  • 2
  • 36
  • 42
  • This is puzzling. I set the key 'Vendor' as index for the LFA1 DataFrame with LFA1.set_index(['Vendor']). It's located in the code right before LFA1['Vendor']=LFA1['Vendor'].astype(str). Now, the question is whether you can set the type of the key you define as index. Or, does the key loses the type when it's defined as index. That would explain the error message, then. – JCF Oct 18 '18 at 20:29
  • `.set_index()` doesn't modify the original dataframe. You'd need to do `LFA = LFA1.set_index('Vendor')` – Randy Oct 18 '18 at 23:34
  • OK. If it's not inplace, then it makes totally sense. Thx for this. – JCF Oct 19 '18 at 07:45