
I am trying to replace certain strings within a column in a dataframe using a txt file.

I have a dataframe that looks like the following.

coffee_directions_df

Utterance                         Frequency   

Directions to Starbucks           1045
Directions to Tullys              1034
Give me directions to Tullys      986
Directions to Seattles Best       875
Show me directions to Dunkin      812
Directions to Daily Dozen         789
Show me directions to Starbucks   754
Give me directions to Dunkin      612
Navigate me to Seattles Best      498
Display navigation to Starbucks   376
Direct me to Starbucks            201

The DF shows utterances made by people and the frequency of utterances.

I.e., "Directions to Starbucks" was uttered 1045 times.

I understand that I can create a dictionary to replace strings such as "Starbucks", "Tullys", and "Seattles Best", like the following:

# define dictionary of mappings
rep_dict = {'Starbucks': 'Coffee', 'Tullys': 'Coffee', 'Seattles Best': 'Coffee'}

# apply substring mapping (keys are treated as regex patterns), then lowercase
coffee_directions_df['Utterance'] = coffee_directions_df['Utterance'].replace(rep_dict, regex=True).str.lower()
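For reference, here is a self-contained sketch of the approach above, assuming a small frame built from three of the sample rows (the real dataframe is much larger):

```python
import pandas as pd

# A small stand-in for coffee_directions_df, using three rows from the sample.
coffee_directions_df = pd.DataFrame({
    "Utterance": [
        "Directions to Starbucks",
        "Directions to Tullys",
        "Navigate me to Seattles Best",
    ],
    "Frequency": [1045, 1034, 498],
})

# Dictionary of substring mappings.
rep_dict = {"Starbucks": "Coffee", "Tullys": "Coffee", "Seattles Best": "Coffee"}

# With regex=True the keys are matched as patterns anywhere in the string.
coffee_directions_df["Utterance"] = (
    coffee_directions_df["Utterance"].replace(rep_dict, regex=True).str.lower()
)

print(coffee_directions_df)
```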

However, my dataframe is pretty big, and I am wondering if there is a way to save rep_dict as a .txt file, import that file, and apply the mappings in it to coffee_directions_df.Utterance.

Ultimately, I don't want to create a bunch of dictionaries within the script; I would rather import a txt file that contains these mappings.

Thanks!!

user_seaweed
  • You could just use pandas to import the file. Or am I missing something? – Anton vBR Apr 04 '18 at 17:35
  • Possible duplicate of [Replace words by checking from pandas dataframe](https://stackoverflow.com/questions/41834274/replace-words-by-checking-from-pandas-dataframe) – cwallenpoole Apr 04 '18 at 17:41
  • @user_seaweed, I suggest you provide feedback on previous answers. Either upvote, downvote, accept, or comment. This helps you, the answerer and the community. – jpp Apr 04 '18 at 17:50

1 Answer

I mean something as simple as this:

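from io import StringIO

import pandas as pd

data = '''\
Starbucks,Coffee
Tullys,Coffee
Seattles Best,Coffee'''

# Create a map (a Series indexed by the string to replace) from file-like data.
# pd.compat.StringIO no longer exists in recent pandas, so use io.StringIO.
m = pd.read_csv(StringIO(data), header=None, index_col=[0])[1]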

And then:

df['Utterance'] = df['Utterance'].replace(m, regex=True).str.lower()
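The same idea should work with a real file on disk. This is a minimal sketch, assuming a text file with one `old,new` pair per line and no header. The file name `replacements.txt`, the column names `old` and `new`, and the temporary directory are made up for illustration:

```python
import os
import tempfile

import pandas as pd

# Write a hypothetical mapping file so the example is self-contained.
# In practice this file would already exist and be edited by hand.
path = os.path.join(tempfile.mkdtemp(), "replacements.txt")
with open(path, "w") as f:
    f.write("Starbucks,Coffee\nTullys,Coffee\nSeattles Best,Coffee\n")

# Read the pairs and turn them into a plain dict.
mapping = pd.read_csv(path, header=None, names=["old", "new"])
rep_dict = dict(zip(mapping["old"], mapping["new"]))

df = pd.DataFrame({
    "Utterance": ["Directions to Starbucks", "Navigate me to Seattles Best"],
    "Frequency": [1045, 498],
})

# Apply the substring mapping loaded from the file, then lowercase.
df["Utterance"] = df["Utterance"].replace(rep_dict, regex=True).str.lower()
print(df)
```

Because `regex=True` treats each key as a regular expression, any key containing characters such as `.` or `(` would need to be escaped first, for example with `re.escape`.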
Anton vBR
  • if i had a txt file that looked like data, would i be able to import that file and apply it to my dataframe? (basically looking at a massive csv files with thousands of rows, and there is going to be hundreds of replacements i'll be doing, which is why i want to import a txt file instead of building a dictionary in the shell/terminal). thanks for all of your help! – user_seaweed Apr 05 '18 at 01:10
  • @user_seaweed Try it with a small file! – Anton vBR Apr 05 '18 at 08:06