0

I have next DataFrame in Pandas:

data1 = pd.DataFrame(data=[[1, 10, 100], [2,2,200],[3,3,300], [4,40,400]],
                     columns=['A', 'B', 'C'])

Here it is:

    A   B   C
0   1   10  100
1   2   2   200
2   3   3   300
3   4   40  400

What I want to do: find rows, where 'A' == 'B' and replace for this rows column 'C' value.

So what I want to get:

    A   B   C
0   1   10  100
1   2   2   -1
2   3   3   -1
3   4   40  400

What I already tried:

data1[data1['A']==data1['B']]

So I find necessary rows. Now I try to replace values in this rows:

data1[data1['A']==data1['B']]['C'] = -1

But data1 is the same! Looks like this difficult chain indexing goes wrong or all this operation returns copy of dataframe. But I can't save it to new dataframe, because I used = in last command, I just can't write newdf = data1[...] = -1. I found also replace function:

data1.replace(data1[data1['A']==data1['B']], "-1")

But it replace all values in row, when I need only last column:

    A   B   C
0   1   10  100
1   -1  -1  -1
2   -1  -1  -1
3   4   40  400

P.S. I know I can do it by using for loop. But I try to find better (more elegant) solution.

Mikhail_Sam
  • 10,602
  • 11
  • 66
  • 102

4 Answers4

2

use DataFrame.loc!

mask = data1['A'] == data1['B']
data1.loc[mask, 'C'] = -1
RandomBob
  • 106
  • 1
1

df['C'] = np.where(df.A == df.B, -1, df.C)

SherylHohman
  • 16,580
  • 17
  • 88
  • 94
Roma
  • 449
  • 6
  • 23
1

You can using pandas mask

data1.C=data1.C.mask(data1.A==data1.B,-1)
data1
Out[371]: 
   A   B    C
0  1  10  100
1  2   2   -1
2  3   3   -1
3  4  40  400
BENY
  • 317,841
  • 20
  • 164
  • 234
1
data1.loc[data1.A==data1.B,'C']='-1'
Rehan Azher
  • 1,340
  • 1
  • 9
  • 17