I am trying to replace certain strings within a column in a dataframe using a txt file.
I have a dataframe that looks like the following (this is a very small version of a massive dataframe that i have).
coffee_directions_df
Utterance Frequency
Directions to Starbucks 1045
Directions to Tullys 1034
Give me directions to Tullys 986
Directions to Seattles Best 875
Show me directions to Dunkin 812
Directions to Daily Dozen 789
Show me directions to Starbucks 754
Give me directions to Dunkin 612
Navigate me to Seattles Best 498
Display navigation to Starbucks 376
Direct me to Starbucks 201
The DF shows utterances made by people and the frequency of utterances.
I.e., "Directions to Starbucks" was uttered 1045 times.
I have another DataFrame in xlsx format coffee_donut.xlsx
that I want to use to import and replace certain strings (similar to what Replace words by checking from pandas dataframe asked).
coffee_donut
Token Synonyms
Starbucks Coffee
Tullys Coffee
Seattles Best Coffee
Dunkin Donut
Daily Dozen Donut
And ultimately, I want the dataframe to look like this:
coffee_donut_df
Utterance Frequency
Directions to Coffee 1045
Directions to Coffee 1034
Give me directions to Coffee 986
Directions to Coffee 875
Show me directions to Donut 812
Directions to Donut 789
.
.
.
I followed the previous question's steps, but i got stuck at the last part:
import re
import pandas as pd
sdf = pd.read_excel('C:\coffee_donut.xlsx')
rep = dict(zip(sdf.Token, sdf.Synonyms)) #convert into dictionary
rep = dict((re.escape(k), v) for k, v in rep.iteritems())
pattern = re.compile("|".join(rep.keys()))
rep = pattern.sub(lambda m: rep[re.escape(m.group(0))], **coffee_directions_df**)
print rep
How do I apply the rep to the dataframe?? I'm so sorry if this is such a noob question. I really appreciate your help.
Thanks!!