16

How can I replace the data 'Beer','Alcohol','Beverage','Drink' with only 'Drink'.

df.replace(['Beer','Alcohol','Beverage','Drink'],'Drink')

doesn't work

cs95
  • 379,657
  • 97
  • 704
  • 746
Jiayang Zhuo
  • 217
  • 2
  • 3
  • 6
  • 3
    If your question was answered, please reward the efforts of those who helped you. Accept the most helpful answer. You can also upvote all other useful answers. Please make it a habit, it helps the community. – cs95 Oct 26 '17 at 16:58
  • This code does work in pandas now, see example 3 on https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.replace.html – 00schneider Apr 23 '20 at 08:58

6 Answers6

26

You almost had it. You need to pass a dictionary to df.replace.

df

       Col1
0      Beer
1   Alcohol
2  Beverage
3     Drink

df.replace(dict.fromkeys(['Beer','Alcohol','Beverage','Drink'], 'Drink'))

    Col1
0  Drink
1  Drink
2  Drink
3  Drink

This works for exact matches and replacements. For partial matches and substring matching, use

df.replace(
    dict.fromkeys(['Beer','Alcohol','Beverage','Drink'], 'Drink'), 
    regex=True
)

This is not an in-place operation so don't forget to assign the result back.

cs95
  • 379,657
  • 97
  • 704
  • 746
  • 2
    Ha! I was going to `df.replace({k: 'Drink' for k in ['Beer','Alcohol','Beverage','Drink']})` But instead, I learned about `dict.fromkeys` – piRSquared Oct 24 '17 at 21:59
  • Won't passing `inplace=True` to df.replace make it an in-place operations, if desired? – 2Toad Jun 10 '20 at 05:02
  • 1
    @2Toad it would but I don't recommend it: https://stackoverflow.com/a/60020384/4909087 – cs95 Jun 10 '20 at 05:16
6

Try the following approach:

lst = ['Beer','Alcohol','Beverage','Drink']
pat = r"\b(?:{})\b".format('|'.join(lst))

df = df.replace(pat, 'Drink', regexp=True)
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
  • Nice one , just do not know why yours and mine are different but can get the same result ...Sorry I am not quit work with strings...can you refer some link for future studying – BENY Oct 24 '17 at 22:46
4

Looks like different from MaxU's solution :)

df.replace({'|'.join(['Beer','Alcohol','Beverage','Drink']):'Drink'},regex=True)
BENY
  • 317,841
  • 20
  • 164
  • 234
3

It seems that your initial method of doing it works in the the latest iteration of Python.

df.replace(['Beer','Alcohol','Beverage','Drink'],'Drink', inplace=True)

Should work

SuhailY
  • 69
  • 2
2

Slight change in earlier answers: Following code Replacing values of specific column/Columns

df[['Col1']] = df[['Col1']].replace(dict.fromkeys(['Beer','Alcohol','Beverage','Drink'], 'Drink'))
0

Not relevant to your exact case using strings, but relevant to the question as stated in the title:

If you have a range of numbers you would like to replace you can provide range() as a dict key.

# Dymmy dataframe with two columns
df = pd.DataFrame(dict(a=[1, 2, 3, 4, 5], b=[66, 44, 33, 22, 77]))

# Mapping that will replace 1 with 'x', 2, 3, 4 with 'y' and 5 with 'z' in column 'a'
mapping = {"a": {1: "x", range(2, 5): "y", 5: "z"}}
df.replace(mapping)
Saaru Lindestøkke
  • 2,067
  • 1
  • 25
  • 51