replace() method not working on Pandas DataFrame

Question

I have looked up this issue and most questions are for more complex replacements. However in my case I have a very simple dataframe as a test dummy.

The aim is to replace a string anywhere in the dataframe with NaN, but this does not seem to work (nothing is replaced, and no error is raised). Replacing with another string does not work either. For example:

import numpy as np
import pandas as pd

d = {'color' : pd.Series(['white', 'blue', 'orange']),
   'second_color': pd.Series(['white', 'black', 'blue']),
   'value' : pd.Series([1., 2., 3.])}
df = pd.DataFrame(d)
df.replace('white', np.nan)

The output is still:

      color second_color  value
  0   white        white      1
  1    blue        black      2
  2  orange         blue      3

This problem is often addressed using inplace=True, but there are caveats to that. Please also see "Understanding inplace=True in pandas".

score 161 · Answer 1 · answered May 30 '18 at 22:59

Given that this is the top Google result when searching for "Pandas replace is not working" I'd like to also mention that:

By default, replace only matches whole cell values; it does not touch substrings unless you turn on the regex switch. Pass regex=True and it will perform partial replacements as well.

This took me 30 minutes to find out, so hopefully I've saved the next person 30 minutes.
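A minimal sketch of the difference, using a made-up frame (the column name and values are illustrative assumptions, not taken from the question):

```python
import pandas as pd

# Hypothetical data: one cell is exactly "white", another only contains it.
df = pd.DataFrame({"color": ["white", "off-white", "blue"]})

# Default: only cells whose entire value equals "white" are replaced.
whole = df.replace("white", "red")

# regex=True: "white" is treated as a pattern, so substrings are replaced too.
partial = df.replace("white", "red", regex=True)

print(whole["color"].tolist())    # ['red', 'off-white', 'blue']
print(partial["color"].tolist())  # ['red', 'off-red', 'blue']
```

Because the search string becomes a regular expression, text containing characters such as "." or "(" should probably be passed through re.escape first.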

edited Sep 02 '18 at 19:20

answered May 30 '18 at 22:59

Reddspark



Exactly what I was bumping up against for a replace failing within a delimited string. – Will B Aug 28 '18 at 14:39
5 damn years later and your answer is still accurate! This answer should receive a green tick mark. – Soren V. Raben Jul 25 '23 at 23:26

score 41 · Accepted Answer · answered Jun 02 '16 at 13:40


You need to assign back

df = df.replace('white', np.nan)

or pass param inplace=True:

In [50]:
d = {'color' : pd.Series(['white', 'blue', 'orange']),
   'second_color': pd.Series(['white', 'black', 'blue']),
   'value' : pd.Series([1., 2., 3.])}
df = pd.DataFrame(d)
df.replace('white', np.nan, inplace=True)
df

Out[50]:
    color second_color  value
0     NaN          NaN    1.0
1    blue        black    2.0
2  orange         blue    3.0

Most pandas operations return a new object rather than modifying the original, and many accept an inplace parameter, which defaults to False.
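To see that replace returns a new object, here is a small sketch (the variable names original and result are illustrative assumptions):

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({"color": ["white", "blue"], "value": [1.0, 2.0]})

# replace() builds and returns a new DataFrame.
result = original.replace("white", np.nan)

print(original.loc[0, "color"])         # white  (the original is untouched)
print(pd.isna(result.loc[0, "color"]))  # True   (only the returned copy changed)

# To keep the change, bind the returned frame to a name.
original = result
```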

answered Jun 02 '16 at 13:40

EdChum



Note that there are real bugs that result in `replace` really not working in some cases (see issue [#29813](https://github.com/pandas-dev/pandas/issues/29813)). – bluenote10 Nov 23 '19 at 07:54

I just ran into an issue where `df.replace(x,y, inplace=True)` did *not* work but `df = df.replace(x,y)` *did* work. `x` was an `int64` and `y` was a `float64` – MattR Dec 30 '19 at 19:40
same with @MattR. inplace not working for some reason – greendino Jan 14 '21 at 00:54
Ah - I had similar issues where inplace=True had apparently no effect, while assigning to a new variable showed that the replacement actually worked. I was using a regex to remove whitespace and such. – logicOnAbstractions Nov 23 '21 at 21:27
I believe `df.foo(x, y, inplace=True)` is now deprecated, in favor of `df = df.foo(x, y)` – jazcap53 Jan 13 '22 at 13:38

score 17 · Answer 3 · answered Dec 14 '19 at 09:47

Neither inplace=True nor regex=True worked in my case, so I found a solution using Series.str.replace instead. It can be useful if you need to replace a substring.

In [4]: df['color'] = df.color.str.replace('e', 'E!')
In [5]: df  
Out[5]: 
     color second_color  value
0   whitE!        white    1.0
1    bluE!        black    2.0
2  orangE!         blue    3.0

or even restricted to some rows with a boolean mask:

In [10]: df.loc[df.color=='blue', 'color'] = df.color.str.replace('e', 'E!')
In [11]: df  
Out[11]: 
    color second_color  value
0   white        white    1.0
1   bluE!        black    2.0
2  orange         blue    3.0
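One detail worth spelling out for Series.str.replace is its regex argument. The default has differed between pandas releases, so passing it explicitly is the safer habit. A short sketch with made-up values:

```python
import pandas as pd

# Hypothetical values chosen so that "." matters.
s = pd.Series(["a.c", "abc"])

# regex=False: "a.c" is matched literally, dot included.
literal = s.str.replace("a.c", "X", regex=False)

# regex=True: "." matches any character, so "abc" matches as well.
pattern = s.str.replace("a.c", "X", regex=True)

print(literal.tolist())  # ['X', 'abc']
print(pattern.tolist())  # ['X', 'X']
```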

score 15 · Answer 4 · answered Mar 13 '21 at 16:31

You might need to check the data type of the column before calling replace directly. If the column has object dtype and holds strings, Series.replace only matches whole cell values; to replace part of a string, go through the .str accessor instead.

Wrong (only matches cells that are exactly 'abc'):

df["column-name"] = df["column-name"].replace('abc', 'def')

Correct (also replaces 'abc' inside longer strings):

df["column-name"] = df["column-name"].str.replace('abc', 'def')
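A rough sketch of the dtype check, assuming a hypothetical frame with one text column and one numeric column (label and count are invented names). The .str accessor is only available on columns that hold strings:

```python
import pandas as pd

df = pd.DataFrame({"label": ["abc-1", "abc-2"], "count": [1, 2]})

print(df.dtypes)  # inspect the dtypes before choosing a method

# Text column: .str.replace works on substrings.
df["label"] = df["label"].str.replace("abc", "def", regex=False)

# Numeric column: asking for .str raises AttributeError.
try:
    df["count"].str.replace("1", "9", regex=False)
    str_accessor_ok = True
except AttributeError:
    str_accessor_ok = False

# Converting to str first makes .str usable; the result is text, not numbers.
as_text = df["count"].astype(str).str.replace("1", "9", regex=False)
```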

score 7 · Answer 5 · answered Jun 02 '16 at 13:41


When you use df.replace() it creates a new temporary object, but doesn't modify yours. You can use one of the two following lines to modify df:

df = df.replace('white', np.nan)
df.replace('white', np.nan, inplace = True)

answered Jun 02 '16 at 13:41

ysearka


score 2 · Answer 6 · answered Jun 21 '21 at 06:29


What worked for me was using this dict notation.

{old_value:new_value}

df.replace({10:100},inplace=True)

Check the documentation for more info: https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.replace.html
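The dict form comes in two shapes, sketched below on made-up data (columns a and b are illustrative): a flat dict applies to every column, while a nested dict limits the replacement to the named column.

```python
import pandas as pd

df = pd.DataFrame({"a": [10, 20], "b": [10, 30]})

# Flat dict {old: new}: applied to every column.
everywhere = df.replace({10: 100, 30: 300})

# Nested dict {column: {old: new}}: applied only to column "a".
only_a = df.replace({"a": {10: 100}})

print(everywhere.values.tolist())  # [[100, 100], [20, 300]]
print(only_a.values.tolist())      # [[100, 10], [20, 30]]
```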

answered Jun 21 '21 at 06:29

sheetal_158


DrWhat · Answer 7 · 2022-07-14T07:25:01.373

With Python 3.10 and pandas 1.4.2, inplace=True did not work for the example below (column dtype int32), but reassigning the result did.

df["col"].replace([0, 130], [12555555, 12555555], inplace=True)  # did NOT work
df["col"] = df["col"].replace([0, 130], [12555555, 12555555])   # worked

In another situation involving NaNs in text columns, the column first had to be converted to str (using .str alone, as above, was not enough):

df["col"].replace(["man", "woman", np.nan], [1, 2, -1], inplace=True)  # did NOT work
df["col"] = df["col"].str.replace(["man", "woman", np.nan], [1, 2, -1])     # did NOT work

df["col"] = df["col"].astype(str)    # needed
df["col"] = df["col"].replace(["man", "woman", np.nan], [1, 2, -1])   # worked
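One possible explanation (an assumption; the answer does not confirm it) is that df["col"] can hand back an intermediate object, so an in-place change made on it need not reach df. Assigning the result back to the column avoids depending on that. A minimal sketch with invented data:

```python
import pandas as pd

# Hypothetical int32 column, mirroring the dtype mentioned above.
df = pd.DataFrame({"col": pd.Series([0, 130, 7], dtype="int32")})

# Compute the replaced column, then store it back on the frame explicitly.
df["col"] = df["col"].replace([0, 130], 12555555)

print(df["col"].tolist())  # [12555555, 12555555, 7]
```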

score 0 · Answer 8 · answered Feb 24 '22 at 04:45


df.replace({'white': np.nan}, inplace=True, regex=True)

answered Feb 24 '22 at 04:45

drorhun


score -1 · Answer 9 · answered Dec 30 '22 at 12:16

Another reason .replace may appear not to work, which I ran into and fixed:

If a column holds a string such as "word1 word2" that was read from Excel, the space between "word1" and "word2" may actually be a non-breaking space ("nbsp", \xa0). Once it is replaced with a normal space, everything works fine. My column name is "Name".

    nonBreakSpace = u'\xa0'
    df['Name'] = df['Name'].replace(nonBreakSpace,' ',regex=True)
    df['Name']=df["Name"].str.replace("replace with","replace to",regex=True)
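The effect can be reproduced without Excel. In this sketch the cell value and the replacement text "found" are invented for illustration:

```python
import pandas as pd

# Hypothetical cell containing a non-breaking space (\xa0) between the words.
df = pd.DataFrame({"Name": ["word1\xa0word2"]})

# Looks identical when printed, but does not equal "word1 word2".
no_match = df["Name"].replace("word1 word2", "found")

# Normalise the non-breaking space first, then the same call matches.
df["Name"] = df["Name"].str.replace("\xa0", " ", regex=False)
match = df["Name"].replace("word1 word2", "found")

print(no_match.tolist() == ["found"])  # False
print(match.tolist() == ["found"])     # True
```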
