Pandas Series.replace replaces the entire series even in an iterative loop?

Question

So I have this dataframe and I wanna replace some of its rows with another value based on a condition.

import pandas as pd
df = pd.DataFrame({'col1':[1,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3]})
for rows in df['col1']:
    if rows == "1":
        df['col1'].replace({rows: "A"}, inplace=True)
    else:
        df['col1'].replace({rows: "BC"}, inplace=True)

However, the results are weird:

>>> print(df)
    col1
0    BC
1    BC
2    BC
3    BC
4    BC
5    BC
6    BC
7    BC
8    BC
9    BC
10   BC
11   BC
12   BC
13   BC
14   BC
15   BC
16   BC
17   BC
18   BC

Am I missing something here or am I misunderstanding how series.replace works? I'm thinking this has got to be some form of logic error.

Please use: `df['col1'] = df['col1'].mask(df['col1'] == '1', 'A')` in replacement of your entire loop. — David Erickson, Dec 28 '20 at 09:34
avoid for loops if you can. @DavidErickson's solution is one way; you can also use `numpy.where` : ``df.assign(col1=np.where(df.col1 == 1, "A", "BC"))`` — sammywemmy, Dec 28 '20 at 09:35

ansev · Accepted Answer · 2020-12-28T09:55:06.827

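Each pass of the loop calls replace on the whole column, so all matching values are changed again on every iteration, which is inefficient. Also, the values in col1 are integers, not strings, so the comparison rows == "1" is never true and every value ends up in the else branch. Try numpy.where instead: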

import numpy as np
df['col1'] = np.where(df['col1'].eq(1), 'A', 'BC')
print(df)
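
To see why the original loop ends up with "BC" everywhere, here is a minimal sketch of the two points above. The short Series `s` and the names `int_equals_str` and `after_first_pass` are made up for illustration, and replace is used without inplace so the result does not depend on how a given pandas version handles in-place changes on a selected column.

```python
import pandas as pd

# Small stand-in for df['col1'] (illustrative data, not the original frame).
s = pd.Series([1, 1, 2, 3])

# An integer never compares equal to a string, so the `if` branch
# of the loop is skipped for every row.
int_equals_str = (1 == "1")

# replace() checks every element of the Series, not only the current row,
# so one call already rewrites all rows holding the value 1.
after_first_pass = s.replace({1: "BC"})

print(int_equals_str)
print(after_first_pass.tolist())
```

The same thing then happens for 2 and 3, which is consistent with the all-"BC" output in the question.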

If you want to keep the other values of col1:

df['col1'] = df['col1'].replace({1: 'A', 2: 'BC'})
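
A quick side-by-side may help when choosing between the two. This is only a sketch on a shortened, made-up frame; `two_way` and `mapped` are illustrative names.

```python
import numpy as np
import pandas as pd

# Shortened version of the frame from the question (illustrative data).
df = pd.DataFrame({"col1": [1, 1, 2, 2, 3, 3]})

# np.where is a two-way choice: anything that is not 1 becomes "BC",
# including the 3s.
two_way = np.where(df["col1"].eq(1), "A", "BC")

# replace with a mapping only touches the listed keys; 3 is left as it is.
mapped = df["col1"].replace({1: "A", 2: "BC"})

print(two_way.tolist())
print(mapped.tolist())
```

So numpy.where fits when every row should get one of two labels, and the mapping form fits when unlisted values should stay untouched.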

edited Dec 28 '20 at 09:55

answered Dec 28 '20 at 09:37
