I have a dataframe in the following form:

df
Text

Apple
Banana
Ananas
...

I want to replace several strings, and some of them map to the same output. Right now I am using:

df['Text'] = df['Text'].replace('Apple', 'Germany', regex=True)
df['Text'] = df['Text'].replace('Banana', 'South America', regex=True)
df['Text'] = df['Text'].replace('Ananas', 'South America', regex=True)

which leads to the desired outcome:

df
Text

Germany
South America
South America
...

But the commands are getting messy. Is there a smarter way to do it? Something like:

df['Text'] = df['Text'].replace('Ananas' or 'Banana', 'South America', regex=True)

If I try the logic from "Regex match one of two words":

df['Text'] = df['Text'].replace(/^(Ananas|Banana)$/', 'South America', regex=True)

nothing happens.

PV8

1 Answer

Try a one-liner with a dictionary:

df['Text'] = df['Text'].replace({'Apple': 'Germany', 'Banana': 'South America', 'Ananas': 'South America'}, regex=True)

And now:

print(df)

outputs:

            Text
0        Germany
1  South America
2  South America
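
For reference, here is a minimal, self-contained sketch of the dictionary approach, alongside a variant that uses regex alternation as the question attempted. The three-row sample frame and the variable names (mapping, by_dict, by_regex) are assumptions made for illustration. The attempt in the question most likely failed because the /.../ delimiters come from JavaScript-style regex literals; in Python they are treated as literal slash characters, so the pattern never matches.

```python
import pandas as pd

# Hypothetical sample data mirroring the question.
df = pd.DataFrame({'Text': ['Apple', 'Banana', 'Ananas']})

# Option 1: a single dictionary mapping each pattern to its replacement.
mapping = {
    'Apple': 'Germany',
    'Banana': 'South America',
    'Ananas': 'South America',
}
by_dict = df['Text'].replace(mapping, regex=True)

# Option 2: regex alternation, so keys that share a replacement are merged.
# Note there are no surrounding slashes; the pattern is a plain Python string.
by_regex = df['Text'].replace(
    {r'^Apple$': 'Germany', r'^(Ananas|Banana)$': 'South America'},
    regex=True,
)

print(by_dict.tolist())
print(by_regex.tolist())
```

Both options should give the same result on this sample. The anchors ^ and $ in the second option restrict the match to whole cell values, whereas the unanchored keys in the first option also replace substrings inside longer text.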
U13-Forward
  • Thanks for the solution. The other logic from the related thread does not work for me; this one works and skips some lines – PV8 Dec 16 '19 at 10:26