
The problem I am having: using reset_index() and renaming strings inside a column.

I have a dataframe and Python code that look like the following:

from collections import Counter
import pandas as pd

df = pd.DataFrame([['Directions to Starbucks', 1045],
                   ['Show me directions to Starbucks', 754],
                   ['Give me directions to Starbucks', 612],
                   ['Navigate me to Starbucks', 498],
                   ['Display navigation to Starbucks', 376],
                   ['Direct me to Starbucks', 201],
                   ['Navigate to Starbucks', 180]],
                  columns = ['Utterance', 'Frequency'])

c = Counter()

for row in df.itertuples():
    for i in row[1].split():
        c[i] += row[2]

res = pd.DataFrame.from_dict(c, orient='index')\
                  .rename(columns={0: 'Count'})\
                  .sort_values('Count', ascending=False)

def add_combinations(df, lst):
    for i in lst:
        words = '_'.join(i)
        df.loc[words] = df.loc[df.index.isin(i), 'Count'].sum()
    return df.sort_values('Count', ascending=False)

lst = [('Give', 'Show', 'Navigate', 'Direct')]

res = add_combinations(res, lst)

This gives me the following dataframe:

                           Count
to                          3666
Starbucks                   3666
Give_Show_Navigate_Direct   2245
me                          2065
directions                  1366
Directions                  1045
Show                         754
Navigate                     678
Give                         612
Display                      376
navigation                   376
Direct                       201

However, when I tried to reset the index using reset_index(), the column name became "index", and when I tried to rename that column, I got an error message.

index                       Count
to                          3666
Starbucks                   3666
Give_Show_Navigate_Direct   2245
me                          2065
directions                  1366

Further, I'm trying to rename Give_Show_Navigate_Direct using a simple dictionary, but it looks like I can't until I fix the index/column name problem.

df['index'].replace({'Give_Show_Navigate_Direct' : 'phrasal_verbs'})
KeyError: 'index'
user_seaweed
  • To make sure I'm understanding your question: are you using `reset_index` on `df` after the code you posted? – RCA Mar 27 '18 at 13:50

1 Answer

You're getting the KeyError because the dataframe you're trying to change is not df. Your code stores the word counts in res, and df has no 'index' column.

You need to reset the index of res instead. Then it works fine.

res.reset_index().replace({'Give_Show_Navigate_Direct' : 'phrasal_verbs'})
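
As a minimal sketch of the full round trip: reset_index() and replace() both return new objects by default, so the results need to be assigned back to take effect. The small frame below is only a stand-in for res from the question, and the column name 'Word' is a hypothetical choice, not something from the original post.

```python
import pandas as pd

# Stand-in for `res` from the question: words in the index, one 'Count' column.
res = pd.DataFrame(
    {'Count': [3666, 2245, 2065]},
    index=['to', 'Give_Show_Navigate_Direct', 'me'],
)

# reset_index() returns a new frame; the unnamed index becomes a column called 'index'.
out = res.reset_index()

# Rename that column ('Word' is a hypothetical name chosen for this example).
out = out.rename(columns={'index': 'Word'})

# replace() also returns a new object, so assign the result back.
out['Word'] = out['Word'].replace({'Give_Show_Navigate_Direct': 'phrasal_verbs'})

print(out)
```

If you prefer to name the column in one step, res.rename_axis('Word').reset_index() should produce the same two columns.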
RCA
  • @RCA thanks! I'm now working with bigger files and trying to import a dictionary from Excel, and I'm having trouble with encoding. Please let me know if you have any ideas: https://stackoverflow.com/questions/49662807/importing-txt-file-with-dictionary-script-and-applying-it-to-dataframe-to-replac/49663391?noredirect=1#comment86367969_49663391 – user_seaweed Apr 05 '18 at 20:33