I have two dataframes. The first has the column I need to search in. The second has a column with the values to look for in the first dataframe, plus the columns I want to add when a string in the first dataframe contains one of those values.
df1:
id   url
111  vk.com/audio
222  twitter.com/chats
df2:
url          Maincategory    Subcategory
vk.com       Social Network  entertainment
twitter.com  Social Network  entertainment
If the url columns matched exactly, I would use
df1['Main Category'] = df1.url.map(df2.set_index('url')['Maincategory'])
But that doesn't work for substring matches. For those I use
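# Lookup table: url from df2 -> Maincategory
mapping = dict(df2.set_index('url')['Maincategory'])

def map_to_substring(x):
    # Return the category of the first key that occurs inside x
    for key in mapping.keys():
        if key in x:
            return mapping[key]
    # No key matched (note: this is the string 'None', not the None object)
    return 'None'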
But when the dataframe is large, this takes too long. How can I make this approach faster?
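One direction that may help, as a rough sketch only: instead of looping over every key for every row in Python, build a single regular expression from the df2 urls, pull out the matching key with `str.extract`, and then join the category columns with a merge. The sample frames below repeat the data above. The third row (`example.org/page`), the temporary `key` column and the `result` name are hypothetical, added only to show the no-match case. Note that this is not an exact drop-in for the loop: it picks the leftmost match in the string and leaves NaN rather than the string 'None' when nothing matches.

```python
import re
import pandas as pd

# Sample data from the question; the third row is a hypothetical non-matching url
df1 = pd.DataFrame({"id": [111, 222, 333],
                    "url": ["vk.com/audio", "twitter.com/chats", "example.org/page"]})
df2 = pd.DataFrame({"url": ["vk.com", "twitter.com"],
                    "Maincategory": ["Social Network", "Social Network"],
                    "Subcategory": ["entertainment", "entertainment"]})

# One alternation pattern over all keys, escaped so '.' is matched literally.
# Longer keys go first so that, at a given position, the longer key is tried first.
keys = sorted(df2["url"], key=len, reverse=True)
pattern = "(" + "|".join(re.escape(k) for k in keys) + ")"

# Extract the matching key from each url (NaN when nothing matches)
df1["key"] = df1["url"].str.extract(pattern, expand=False)

# Attach both category columns with a left merge, then drop the helper column
result = (df1.merge(df2.rename(columns={"url": "key"}), on="key", how="left")
             .drop(columns="key"))
print(result)
```

Whether this is actually faster depends on the number of keys and rows, so it is worth timing on the real data before relying on it.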