I have a DataFrame and I want to replace the values in one of its columns with another value based on a condition.

import pandas as pd

df = pd.DataFrame({'col1':[1,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3]})
for rows in df['col1']:
    if rows == "1":
        df['col1'].replace({rows: "A"}, inplace=True)
    else:
        df['col1'].replace({rows: "BC"}, inplace=True)

However, the results are weird:

>>> print(df)
    col1
0    BC
1    BC
2    BC
3    BC
4    BC
5    BC
6    BC
7    BC
8    BC
9    BC
10   BC
11   BC
12   BC
13   BC
14   BC
15   BC
16   BC
17   BC
18   BC

Am I missing something here or am I misunderstanding how series.replace works? I'm thinking this has got to be some form of logic error.

Sunderam Dubey
Beelz

1 Answer

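In each iteration of the loop you replace every matching value in the column again, which is inefficient. Also, the values in col1 are integers, not strings, so rows == "1" is never true and every value ends up replaced by "BC" in the else branch. Try numpy.where instead: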

import numpy as np
df['col1'] = np.where(df['col1'].eq(1), 'A', 'BC')
print(df)
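
To see why the comparison in the loop never matches, it can help to check the column's dtype before replacing anything. This is a minimal sketch that rebuilds the frame from the question; the name result is only for illustration:

```python
import numpy as np
import pandas as pd

# Same data as in the question: seven 1s, six 2s, six 3s.
df = pd.DataFrame({'col1': [1] * 7 + [2] * 6 + [3] * 6})

# The column holds integers, so comparing against the string "1" never matches.
print(df['col1'].dtype)
print((df['col1'] == "1").any())   # False

# Vectorized replacement: 'A' where the value equals 1, 'BC' everywhere else.
result = np.where(df['col1'].eq(1), 'A', 'BC')
print(result)
```

Because np.where evaluates the condition for the whole column at once, there is no loop and no repeated replace call.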

If you want to keep the other values of col1 unchanged:

df['col1'] = df['col1'].replace({1: 'A', 2: 'BC'})
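
As a rough illustration of the difference, the dictionary form only touches the keys it lists, so with the question's data the 3s should be left as they are. A small sketch, assuming the same frame as in the question (the name out is hypothetical):

```python
import pandas as pd

# Same data as in the question: seven 1s, six 2s, six 3s.
df = pd.DataFrame({'col1': [1] * 7 + [2] * 6 + [3] * 6})

# Only the keys 1 and 2 are mapped; any value not listed (here 3) is kept.
out = df['col1'].replace({1: 'A', 2: 'BC'})
print(out.tolist())
```

The result mixes strings and integers, so the column ends up with a non-numeric dtype; add a mapping for 3 as well if a uniform type is needed.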
ansev