
Raw dataframe:

A                 B
hello world       a
say hello         a
I try to do it    a
We say            a
like saying hello a

Expectation

In column A, replace 'world' with 'a', 'do' with 'finish', and 'say' with 'guess'.

Attempt

df['A'].str.replace('world','a').str.replace('do','finish').str.replace('say','guess')

This works, but the code is long and inefficient, especially when there are many strings to replace (more than 100).

Hope

A cleaner, more concise way to replace multiple strings in pandas.

Jack

1 Answer

# Map each pattern to its replacement
rep_dict = {'world': 'a', 'do': 'finish', 'say': 'guess'}

# regex=True makes the keys match as substrings of each cell
df.replace(rep_dict, regex=True)

                     A  B
0              hello a  a
1          guess hello  a
2   I try to finish it  a
3             We guess  a
4  like guessing hello  a
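
For reference, here is a minimal self-contained sketch of the approach above, restricted to column A as the question asks. The variable names `whole_word_dict`, `pattern`, and `single_pass` are illustrative and not part of the original answer. The whole-word variant is an assumption about what may be wanted, since plain substring matching also turns 'saying' into 'guessing'. The last variant is an alternative technique: it joins all keys into one regular expression so each string is scanned once rather than once per key.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "A": ["hello world", "say hello", "I try to do it", "We say", "like saying hello"],
    "B": ["a"] * 5,
})

rep_dict = {"world": "a", "do": "finish", "say": "guess"}

# Substring replacement on column A only, as in the answer above.
substring = df["A"].replace(rep_dict, regex=True)

# Whole-word variant (assumption: only standalone words should change).
# re.escape guards against keys that contain regex metacharacters.
whole_word_dict = {rf"\b{re.escape(k)}\b": v for k, v in rep_dict.items()}
whole_word = df["A"].replace(whole_word_dict, regex=True)

# Alternative: one combined pattern plus a callable that looks up each match.
pattern = "|".join(re.escape(k) for k in rep_dict)
single_pass = df["A"].str.replace(pattern, lambda m: rep_dict[m.group(0)], regex=True)

print(substring.tolist())
print(whole_word.tolist())
print(single_pass.tolist())
```

With the sample data, the whole-word variant leaves 'like saying hello' unchanged, while the other two produce 'like guessing hello'.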
piRSquared
  • Great, but what if I have 100 strings to replace? – Jack May 05 '17 at 09:47
  • Then you create a dictionary with 100 words and their replacement values. Is there something that I'm missing? – piRSquared May 05 '17 at 09:49
  • Sorry. My point is how to build a dict of 100 string pairs efficiently, but after thinking for a while I know how to do it. For example, write the strings to a file and read it into a dictionary. – Jack May 05 '17 at 09:53
  • How are your pair strings stored? Does my latest edit help? – piRSquared May 05 '17 at 09:54
  • How about this method? http://stackoverflow.com/questions/4803999/python-file-to-dictionary – Jack May 05 '17 at 09:56
  • So you're saying that your string pairs are in a file? Each line of the file constitutes a pair separated by a space? You need to update your question with this information. This fundamentally changes the question. – piRSquared May 05 '17 at 09:58
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/143487/discussion-between-jack-and-pirsquared). – Jack May 05 '17 at 10:00
  • Not exactly. The method you supplied uses a dict and df.replace(dict, regex=True). But there are many ways to build a dict. The original question included the case where there are many strings. I think your answer works well for fewer than 10 strings but not for more than 10. I have updated the question to let others know what I want to do. – Jack May 05 '17 at 10:06
  • In order to perform this task, you need to supply the strings in some form. A file? Database? Another dataframe? It has to come from somewhere. You haven't stated where from. I can't magically tell you how to create a pretty replace methodology if you won't tell me where these strings are. As it is, I've answered your exact question and you are adding details after the fact. It is much better to include all details that you need addressed in the original question so we (I) don't end up wasting time. – piRSquared May 05 '17 at 10:09
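
The file-based idea raised in the comments could be sketched roughly as follows. This assumes each line holds one pair in the form "old new", separated by whitespace, which is a guess at the format since the thread never states it. The helper name `load_replacements` is hypothetical, and an in-memory `io.StringIO` stands in for a real file so the example runs on its own.

```python
import io

import pandas as pd


def load_replacements(lines):
    """Build a {pattern: replacement} dict from lines of the form 'old new'."""
    pairs = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first run of whitespace only, so the
        # replacement text may itself contain spaces.
        old, new = line.split(maxsplit=1)
        pairs[old] = new
    return pairs


# Stand-in for a real file; with an actual file, pass the open file object.
sample = io.StringIO("world a\ndo finish\nsay guess\n")
rep_dict = load_replacements(sample)

df = pd.DataFrame({
    "A": ["hello world", "say hello", "I try to do it", "We say", "like saying hello"],
    "B": ["a"] * 5,
})

result = df["A"].replace(rep_dict, regex=True)
print(result.tolist())
```

The same function scales to 100 or more pairs without changing the replacement call, since only the dict grows.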