
I am trying to merge two DataFrames. df1 is the result of merging other GTFS files (routes, trips, stop_times), and df2 is the stops file.

When I try the merge, I get a ValueError:

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

I want to merge the DataFrames, and I have confirmed that the two key columns I am merging on both hold strings (object dtype).

I use the following to try to merge the two DataFrames:

df3 = df1.join(df2, how='inner', on='stop_id')

I have tried writing the DataFrame to file and reading it back, as suggested by this question, as well as casting both columns to strings using df.stop_id = df.stop_id.astype('str')

I read in the files, explicitly indicating that all columns are strings:

df2 = pd.read_csv('stops.txt', dtype={'stop_id': 'str',
                                      'stop_code': 'str',
                                      'stop_name': 'str',
                                      'stop_lat': 'str',
                                      'stop_lon': 'str',
                                      'location_type': 'str',
                                      'parent_station': 'str',
                                      'wheelchair_boarding': 'str',
                                      'platform_code': 'str'})

and check the data types

df1.stop_id.dtype
df2.stop_id.dtype

both produce

dtype('O')

But the merge still fails with the above error. How can I resolve this?

theotheraussie

1 Answer


Try pd.merge to merge the two DataFrames.

First convert stop_id to str in both DataFrames, then merge on that column:

df1['stop_id'] = df1['stop_id'].astype(str)
df2['stop_id'] = df2['stop_id'].astype(str)

df3 = pd.merge(df1, df2, how='inner', on=['stop_id'])
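
A likely reason the original join call fails even though both stop_id columns are strings: DataFrame.join with on='stop_id' matches df1's stop_id column against df2's index, not against df2's stop_id column, and the default index after read_csv is an integer range. Below is a minimal sketch of that, using made-up toy data (the trip_id and stop_name values are hypothetical stand-ins, not taken from the actual feed). It shows pd.merge working column-to-column, and join working once stop_id is moved into df2's index.

```python
import pandas as pd

# Toy stand-ins for the GTFS tables; all values are invented for illustration.
df1 = pd.DataFrame({"trip_id": ["t1", "t2"], "stop_id": ["100", "200"]})
df2 = pd.DataFrame({"stop_id": ["100", "200"], "stop_name": ["Central", "Harbour"]})

# join(on='stop_id') compares df1['stop_id'] with df2's index, so these are
# the two dtypes that actually have to be compatible.
print(df1["stop_id"].dtype, df2.index.dtype)

# The call from the question: string keys against an integer index.
try:
    df1.join(df2, how="inner", on="stop_id")
except ValueError as err:
    print("join failed:", err)

# Option 1: merge column-to-column, as suggested in this answer.
merged = pd.merge(df1, df2, how="inner", on="stop_id")

# Option 2: keep join, but make stop_id the index of df2 first.
joined = df1.join(df2.set_index("stop_id"), how="inner", on="stop_id")

print(merged)
print(joined)
```

With either option the astype(str) casts are only needed if the two stop_id columns really have different dtypes.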
tawab_shakeel