0

I have a dataframe which has a lot of characters that need changing.

I can do this line by line but I couldn't figure out how to loop through these characters to replace with a new character.

This is my code so far:

df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "Direct Mail","DM"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "DR TV","DRTV"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "Affilliates","Affiliates"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "DRTV","TV"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "All Time TV","TV"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "Peak TV","TV"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "Regional Press","Press"))
df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, "National Press","Press"))

but I feel something like this should be possible:

dic= {Direct Mail:DM}


for i and j in dic:

df_media_input['MediaChannel']=df_media_input['MediaChannel'].map(lambda x: str.replace(x, i,j))

where Direct Mail is i and DM is j

fred.schwartz
  • 2,023
  • 4
  • 26
  • 53

3 Answers3

2

First create dictionary for replacing:

d = {"Direct Mail":"DM", 
     "DR TV":"DRTV",
     ...}

If want replace substrings use replace with regex=True:

df_media_input['MediaChannel'] = df_media_input['MediaChannel'].replace(d, regex=True)

If want replace values faster way use map with fillna:

df_media_input['MediaChannel'] = df_media_input['MediaChannel'].map(d)
                                     .fillna(df_media_input['MediaChannel'])

Check difference in sample:

df_media_input = pd.DataFrame({'MediaChannel':['Direct Mail','DR TV new','val']})
print (df_media_input)
  MediaChannel
0  Direct Mail
1    DR TV new
2          val

d = {"Direct Mail":"DM", "DR TV":"DRTV"}


df_media_input['MediaChannel1'] = df_media_input['MediaChannel'].replace(d, regex=True)

df_media_input['MediaChannel2'] = (df_media_input['MediaChannel'].map(d)
                                     .fillna(df_media_input['MediaChannel']))
print (df_media_input)
  MediaChannel MediaChannel1 MediaChannel2
0  Direct Mail            DM            DM
1    DR TV new      DRTV new     DR TV new
2          val           val           val
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
2

The Pandas DataFrame replace methods accepts a dictionary where the keys correspond to existing strings and the values correspond to the string to replace each with.

So in your example:

replacements = {
    "Direct Mail": "DM",
    "DR TV": "DRTV",
    # and so on...
}

df_media_input['MediaChannel'].replace(replacements, inplace=True)

Assuming that the values in the 'MediaChannel' column are just the strings to replace and do not contain those strings. For example, "Direct Mail" will be changed to "DM", however "I like Direct Mail" will not be changed to "I like DM". To handle this case with substrings, you'll need to set the regex keyword argument of replace to True.

Henry Woody
  • 14,024
  • 7
  • 39
  • 56
  • @fred.schwartz - If performance is important, rather use [map + fillna](https://stackoverflow.com/q/49259580/2901002) – jezrael Nov 07 '18 at 10:33
2

Since you need to iterate, you could do something like this.

for i in range(len(df)):

    d = {"Direct Mail":"DM",
         "DR TV":"DRTV",
         "DRTV":"TV",
         "All Time TV":"TV",
         "Peak TV":"TV",
         "Regional Press":"Press",
         "National Press":"Press"
     }
    for x,y in d.items(): 
        df['MediaChannel'].values[i] = df['MediaChannel'].values[i].replace(x, y)
Chandu
  • 2,053
  • 3
  • 25
  • 39