
I have two datasets. Dataset 1:

query category
apple  {'24'}
orage  {'31'}

Dataset 2:

query category
apple  {'24','25'}

I'm merging them on query, and my output is:

query category
apple  {'24','25'}
orage  {'31'}

I'm looking for an alternative to what I have implemented.
My approach:

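import pandas as pd

df1 = pd.read_csv("data1.csv")
df2 = pd.read_csv("data2.csv")
# Outer join on 'query'; the shared column becomes category_x / category_y
df = pd.merge(df1, df2, on='query', how='outer')
# Prefer dataset 2's category, falling back to dataset 1's where it is missing
df['category_x'] = df['category_y'].combine_first(df['category_x'])
del df["category_y"]
df.rename(columns={'category_x': 'category'}, inplace=True)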

This works, but it does not seem efficient. I'm looking for better approaches.
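
One possible alternative, as a minimal sketch: stack the two frames and keep the last row per query, so that dataset 2 wins whenever a query appears in both. The inline frames below are hypothetical stand-ins for data1.csv and data2.csv, and the sketch assumes category is read as a plain string and that query is unique within each file. Whether this is actually faster than the merge would need to be measured on the real data.

```python
import pandas as pd

# Hypothetical stand-ins for data1.csv and data2.csv, mirroring the
# example tables; category is kept as a plain string.
df1 = pd.DataFrame({"query": ["apple", "orage"],
                    "category": ["{'24'}", "{'31'}"]})
df2 = pd.DataFrame({"query": ["apple"],
                    "category": ["{'24','25'}"]})

# Stack both frames, then keep only the last row for each query.
# Because df2 is concatenated second, its rows take precedence.
merged = (
    pd.concat([df1, df2], ignore_index=True)
      .drop_duplicates(subset="query", keep="last")
      .sort_values("query")
      .reset_index(drop=True)
)
print(merged)
```

Unlike combine_first, this does not fall back to dataset 1 when dataset 2 has a missing category for a query; dropping such rows from df2 first (for example with df2.dropna(subset=["category"])) would be needed to get the same behaviour.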
