I have a data frame like this:
**Domain** **URL**
Amazon amazon.com/xyz/butter
Amazon amazon.com/xyz/orange
Facebook facebook.com/male
Google google.com/airport
Google goolge.com/car
This is just imaginary data. I have clickstream data in which I want to use the "Domain" and "URL" columns. I have a list of many keywords saved in a dictionary, keyed by domain, and I need to search for them in each URL and extract the matching keyword into a new column.
I have a dictionary like this:
dict_keyword = {'Facebook': ['boy', 'girl', 'man'], 'Google': ['airport', 'car', 'konfigurator'], 'Amazon': ['apple', 'orange', 'butter']}
I want to obtain output like this:
**Domain** **URL** **Keyword**
Amazon amazon.com/xyz/butter butter
Amazon amazon.com/xyz/orange orange
Facebook facebook.com/male male
Google google.com/airport airport
Google goolge.com/car car
I would like to do this in a single line of code. So far I am trying:
df['Keyword'] = df.apply(lambda x: any(substring in x.URL for substring in dict_keyword[x.Domain]), axis=1)
I am only getting a Boolean value, but I want to return the matching keyword. Any help?
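For reference, here is a minimal sketch of one possible approach, not necessarily the best one. `any()` only reports whether some keyword matched, so the sketch swaps it for `next()` over a generator with a default of `None`, which returns the first keyword found in the URL. It assumes the sample data above, a dictionary named `dict_keyword`, and that the first match is the one wanted; domains missing from the dictionary are treated as having no keywords.

```python
import pandas as pd

# Sample data copied from the question (imaginary values).
df = pd.DataFrame({
    "Domain": ["Amazon", "Amazon", "Facebook", "Google", "Google"],
    "URL": [
        "amazon.com/xyz/butter",
        "amazon.com/xyz/orange",
        "facebook.com/male",
        "google.com/airport",
        "goolge.com/car",
    ],
})

dict_keyword = {
    "Facebook": ["boy", "girl", "man"],
    "Google": ["airport", "car", "konfigurator"],
    "Amazon": ["apple", "orange", "butter"],
}

# next() returns the first keyword contained in the URL, or None if
# no keyword matches. .get(..., []) is an assumption to avoid a
# KeyError for domains that are not in the dictionary.
df["Keyword"] = df.apply(
    lambda x: next(
        (kw for kw in dict_keyword.get(x.Domain, []) if kw in x.URL),
        None,
    ),
    axis=1,
)

print(df)
```

With the dictionary exactly as written above, the Facebook row gets no keyword, because 'male' is not in the Facebook list; adding it to the list would fill that row.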