I have two datasets. Dataset 1:
query category
apple {'24'}
orage {'31'}
Dataset 2:
query category
apple {'24','25'}
I'm merging them on query, and my output is:
query category
apple {'24','25'}
orage {'31'}
I'm looking for an alternative to the approach I have implemented.
My approach:
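import pandas as pd

df1 = pd.read_csv("data1.csv")
df2 = pd.read_csv("data2.csv")
# Outer merge keeps every query; the overlapping column becomes category_x / category_y
df = pd.merge(df1, df2, on='query', how='outer')
# Prefer dataset 2's category, falling back to dataset 1's where it is missing
df['category_x'] = df['category_y'].combine_first(df['category_x'])
del df["category_y"]
df.rename(columns={'category_x': 'category'}, inplace=True)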
This method works, but it is not efficient. I'm looking for a better approach.
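One possible alternative, as a minimal sketch rather than a benchmarked answer: index both frames on query and call DataFrame.combine_first, which skips the merge and the column cleanup. The in-memory frames below are hypothetical stand-ins for data1.csv and data2.csv, with category held as plain strings, and the sketch assumes each query appears at most once per file.

```python
import pandas as pd

# Hypothetical stand-ins for data1.csv and data2.csv (assumption: category is a string column)
df1 = pd.DataFrame({"query": ["apple", "orage"],
                    "category": ["{'24'}", "{'31'}"]})
df2 = pd.DataFrame({"query": ["apple"],
                    "category": ["{'24','25'}"]})

# Values from df2 take priority; df1 fills in queries (or nulls) that df2 lacks.
merged = (
    df2.set_index("query")
       .combine_first(df1.set_index("query"))
       .rename_axis("query")   # make sure the index name survives the union
       .reset_index()
)
print(merged)
```

This keeps the same precedence as the merge version (dataset 2 first, dataset 1 as fallback), but whether it is actually faster on the real data would need to be measured.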