I have the following two datasets - a dataset with text:
import pandas as pd

text = {'Text': [['Nike', 'invests', 'in', 'shoes'], ['Adidas', 'invests', 'in', 't-shirts']]}
text_df = pd.DataFrame(text)
text_df
and a dataset with words and their respective scores and topics:
points = {'Text': ['invests', 'shoes', 'Adidas'], 'Score': [1, 2, 1], 'Topic': ['not_name', 'not_name', 'name']}
points_df = pd.DataFrame(points)
points_df
For each row in the text dataset, I would like to check whether each word appears in points_df. For every topic in points_df, I want a new column named after that topic, holding a list with one entry per word in the row: the word's score if the word belongs to that topic, and zero otherwise.
This is the expected outcome:
text_results = {'Text':[['Nike', 'invests', 'in', 'shoes'], ['Adidas', 'invests', 'in', 't-shirts']], 'not_name': [[0, 1, 0, 2], [0, 1, 0, 0]], 'name': [[0, 0, 0, 0], [1, 0, 0, 0]]}
results_df = pd.DataFrame(text_results)
results_df
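For reference, the transformation described above could be sketched roughly as follows. This is only a minimal sketch, not necessarily the idiomatic way: it assumes each word appears at most once per topic in points_df, and the name computed_df is purely illustrative.

```python
import pandas as pd

text_df = pd.DataFrame({'Text': [['Nike', 'invests', 'in', 'shoes'],
                                 ['Adidas', 'invests', 'in', 't-shirts']]})
points_df = pd.DataFrame({'Text': ['invests', 'shoes', 'Adidas'],
                          'Score': [1, 2, 1],
                          'Topic': ['not_name', 'not_name', 'name']})

# computed_df is a hypothetical name, chosen to avoid clashing with results_df.
computed_df = text_df.copy()

# sort=False keeps topics in order of first appearance in points_df.
for topic, group in points_df.groupby('Topic', sort=False):
    # Word -> score lookup for this topic only (plain Python ints).
    scores = dict(zip(group['Text'], group['Score'].tolist()))
    # One score per word in the row; words not in this topic get 0.
    computed_df[topic] = text_df['Text'].apply(
        lambda words: [scores.get(word, 0) for word in words]
    )

print(computed_df)
```

The lookup is an exact, case-sensitive match on each token, so 'adidas' would not match 'Adidas' unless both sides are normalized first.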
Any suggestions? I am a bit lost at sea!