Pandas - replacing column values

Question

I know there are a number of topics on this question, but none of the methods worked for me, so I'm posting about my specific situation.

I have a dataframe that looks like this:

data = pd.DataFrame([[1,0],[0,1],[1,0],[0,1]], columns=["sex", "split"])
data['sex'].replace(0, 'Female')
data['sex'].replace(1, 'Male')
data

What I want to do is replace all 0's in the sex column with 'Female' and all 1's with 'Male', but the values in the dataframe don't seem to change when I use the code above.

Am I using replace() incorrectly? Or is there a better way to do conditional replacement of values?

Anand S Kumar · Accepted Answer · 2015-08-08T02:10:29.360

Yes, you are using it incorrectly. Series.replace() is not an in-place operation by default: it returns a new Series with the values replaced, so you need to assign the result back to your DataFrame/Series for it to take effect. Alternatively, if you need to do it in place, set the inplace keyword argument to True. Example:

data['sex'].replace(0, 'Female',inplace=True)
data['sex'].replace(1, 'Male',inplace=True)
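
The other option described above, assigning the result back, can be sketched as follows. This is a minimal example using the question's data. It also tends to be the more portable form, since recent pandas versions with Copy-on-Write may not propagate an in-place replace on data['sex'] back to data.

```python
import pandas as pd

data = pd.DataFrame([[1, 0], [0, 1], [1, 0], [0, 1]], columns=["sex", "split"])

# replace() returns a new Series; assign it back to the column to keep the result
data["sex"] = data["sex"].replace(0, "Female")
data["sex"] = data["sex"].replace(1, "Male")

print(data)
```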

Also, you can combine the above into a single replace call by passing a list for both the to_replace argument and the value argument. Example:

data['sex'].replace([0,1],['Female','Male'],inplace=True)

Example/Demo -

In [10]: data = pd.DataFrame([[1,0],[0,1],[1,0],[0,1]], columns=["sex", "split"])

In [11]: data['sex'].replace([0,1],['Female','Male'],inplace=True)

In [12]: data
Out[12]:
      sex  split
0    Male      0
1  Female      1
2    Male      0
3  Female      1

You can also use a dictionary. Example:

In [15]: data = pd.DataFrame([[1,0],[0,1],[1,0],[0,1]], columns=["sex", "split"])

In [16]: data['sex'].replace({0:'Female',1:'Male'},inplace=True)

In [17]: data
Out[17]:
      sex  split
0    Male      0
1  Female      1
2    Male      0
3  Female      1
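
If you prefer to stay at the DataFrame level, the same dictionary can be nested under the column name. A minimal sketch, assuming the same data as above; the variable name result is only illustrative:

```python
import pandas as pd

data = pd.DataFrame([[1, 0], [0, 1], [1, 0], [0, 1]], columns=["sex", "split"])

# Outer key: column to act on; inner dict: old value -> new value.
# Only "sex" is touched, so the 0/1 values in "split" stay as they are.
result = data.replace({"sex": {0: "Female", 1: "Male"}})

print(result)
```

Because nothing is done in place here, data itself is left unchanged and result holds the replaced copy.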

Using a dictionary rather than two lists feels more natural, IMHO. — DSM, Aug 08 '15 at 02:08
If I do something like that I get a `SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame`. — Stefan Falk, Sep 17 '17 at 12:50
What is your code, maybe you are slicing the original dataframe as well (thereby creating a copy) and trying to set to that? — Anand S Kumar, Sep 17 '17 at 20:13
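
One way this warning can come up, sketched under the assumption that the column being changed belongs to a filtered subset of another DataFrame (the name subset and the filter are hypothetical, not taken from the comment above). Taking an explicit copy and assigning the result back avoids modifying a slice in place:

```python
import pandas as pd

data = pd.DataFrame([[1, 0], [0, 1], [1, 0], [0, 1]], columns=["sex", "split"])

# Hypothetical filter: an explicit .copy() makes subset an independent DataFrame
subset = data[data["split"] == 1].copy()

# Assign the replaced column back instead of using inplace=True on a slice
subset["sex"] = subset["sex"].replace({0: "Female", 1: "Male"})

print(subset)
```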

niraj · Answer 2 · 2018-03-13T22:23:09.887

You can also try using apply with the dictionary's get method, which seems to be a little faster than replace:

data['sex'] = data['sex'].apply({1:'Male', 0:'Female'}.get)

Testing with timeit:

%%timeit
data['sex'].replace([0,1],['Female','Male'],inplace=True)

Result:

The slowest run took 5.83 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 510 µs per loop

Using apply:

%%timeit
data['sex'] = data['sex'].apply({1:'Male', 0:'Female'}.get)

Result:

The slowest run took 5.92 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 331 µs per loop

Note: apply with a dictionary should be used only if all the possible values of the column are defined in the dictionary; otherwise, any value not defined in the dictionary is replaced with None.

What exactly does the `get` do here? Could you explain that part? The code works! — ababuji, Dec 06 '18 at 20:18
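
To address the question about get: it is the ordinary dict.get method, passed to apply as a function. apply calls it once per value in the column, so each value is looked up as a key in the dictionary. A small sketch (the name mapping is illustrative):

```python
import pandas as pd

mapping = {1: "Male", 0: "Female"}

# dict.get(key) returns the value stored under key, or None if the key is absent
print(mapping.get(1))  # Male
print(mapping.get(2))  # None

data = pd.DataFrame([[1, 0], [0, 1], [1, 0], [0, 1]], columns=["sex", "split"])

# Passing the bound method means apply calls mapping.get(value) for every value
data["sex"] = data["sex"].apply(mapping.get)

print(data)
```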

score 2 · Answer 3 · answered Sep 06 '20 at 16:24

You can try this too. First, create a dictionary of replacement values:

import pandas as pd
data = pd.DataFrame([[1,0],[0,1],[1,0],[0,1]], columns=["sex", "split"])

replace_dict= {0:'Female',1:'Male'}
print(replace_dict)

Then use the map function to replace the values:

data['sex']=data['sex'].map(replace_dict)

Output after replacing:

      sex  split
0    Male      0
1  Female      1
2    Male      0
3  Female      1
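
One difference worth keeping in mind when choosing between map and replace: values that are missing from the dictionary become NaN with map, while replace leaves them unchanged. A small sketch using a hypothetical column that contains an extra value 2:

```python
import pandas as pd

replace_dict = {0: "Female", 1: "Male"}

# Hypothetical data: 2 has no entry in replace_dict
sex = pd.Series([1, 0, 2])

mapped = sex.map(replace_dict)        # unmapped value becomes NaN
replaced = sex.replace(replace_dict)  # unmapped value is kept as 2

print(mapped.tolist())
print(replaced.tolist())
```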

score 0 · Answer 4 · answered Jul 17 '22 at 11:00

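You can also try using NumPy's select:

import numpy as np

# default=None keeps the result as an object array; combining string choices
# with default=np.nan can raise a TypeError on recent NumPy versions
data['sex'] = np.select(
    [data['sex'].eq(0), data['sex'].eq(1)], ['Female', 'Male'], default=None
)

Output:

      sex  split
0    Male      0
1  Female      1
2    Male      0
3  Female      1

If a value is neither 0 nor 1, None is returned for that row.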

score 0 · Answer 5 · edited Oct 30 '22 at 12:21

None of these answers worked for me, but this did:

# Use .loc with a boolean mask so the assignment acts on data itself
data.loc[data['gender'] == 'Male', 'gender'] = 1
data.loc[data['gender'] == 'Female', 'gender'] = 2

REnuka Perera · answered Oct 30 '22 at 05:33 · edited by Eric Aya, Oct 30 '22 at 12:21
