Python: how to make conditional operations in pandas?

Question

I have a dataframe df like the following

df   A   B    C
0    1   0.7 0.3
1    0   0.2 0.8
2    0   0.8 0.2
3    1   0.6 0.4
4    1   0.9 0.1

I want to create a column D that has the value (1-B) if A==1, or (1-C) if A==0. So the result should be:

df   A   B    C    D
0    1   0.7 0.3  0.3
1    0   0.2 0.8  0.2
2    0   0.8 0.2  0.8
3    1   0.6 0.4  0.4
4    1   0.9 0.1  0.1

jezrael · Answer 1 · 2018-07-22T12:24:28.420

If columns B and C always sum to 1, you can use numpy.where and skip the subtraction:

import numpy as np

# A == 0 -> take B (equal to 1 - C); otherwise take C (equal to 1 - B)
df['D'] = np.where(df['A'] == 0, df['B'], df['C'])
print (df)
   A    B    C    D
0  1  0.7  0.3  0.3
1  0  0.2  0.8  0.2
2  0  0.8  0.2  0.8
3  1  0.6  0.4  0.4
4  1  0.9  0.1  0.1
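
The shortcut above depends on B + C being 1 in every row. A minimal sketch of how that assumption could be checked before relying on it; the variable name rows_sum_to_one is only illustrative, and the fallback branch is simply the explicit formula from the question:

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "A": [1, 0, 0, 1, 1],
    "B": [0.7, 0.2, 0.8, 0.6, 0.9],
    "C": [0.3, 0.8, 0.2, 0.4, 0.1],
})

# The shortcut is only valid when B + C == 1 in every row.
# np.isclose tolerates floating-point rounding in the sum.
rows_sum_to_one = bool(np.isclose(df["B"] + df["C"], 1.0).all())

if rows_sum_to_one:
    # 1 - B equals C and 1 - C equals B, so just pick the other column
    df["D"] = np.where(df["A"] == 0, df["B"], df["C"])
else:
    # Fall back to the explicit formula
    df["D"] = np.where(df["A"] == 0, 1 - df["C"], 1 - df["B"])

print(df)
```

If the check fails on real data, the explicit formula is the safer choice.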

If you want to use the formula and column A contains only the values 1 and 0:

df['D'] = np.where(df['A'] == 0, 1 - df['C'], 1 - df['B'])
print (df)
   A    B    C    D
0  1  0.7  0.3  0.3
1  0  0.2  0.8  0.2
2  0  0.8  0.2  0.8
3  1  0.6  0.4  0.4
4  1  0.9  0.1  0.1
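
For reference, a self-contained version of the formula approach that can be pasted and run. One thing to keep in mind: the subtraction is done in floating point, so the stored values can differ from the printed ones in the last digits. Comparing with a tolerance (np.allclose here) is a reasonable way to verify the result; the names expected and matches are just for illustration:

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "A": [1, 0, 0, 1, 1],
    "B": [0.7, 0.2, 0.8, 0.6, 0.9],
    "C": [0.3, 0.8, 0.2, 0.4, 0.1],
})

# A == 0 -> 1 - C, otherwise (A == 1 here) -> 1 - B
df["D"] = np.where(df["A"] == 0, 1 - df["C"], 1 - df["B"])

# Compare against the expected column with a tolerance rather than ==
expected = [0.3, 0.2, 0.8, 0.4, 0.1]
matches = bool(np.allclose(df["D"], expected))

print(df)
print(matches)
```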

If column A can contain other values as well (the most general solution), use numpy.select:

print (df)
   A    B    C
0  1  0.7  0.3
1  0  0.2  0.8
2  0  0.8  0.2
3  1  0.6  0.4
4  3  0.9  0.1 <- added 3

m1 = df['A'] == 0
m2 = df['A'] == 1
df['D'] = np.select([m1, m2], [1 - df['C'], 1 - df['B']], default=np.nan)
print (df)
   A    B    C    D
0  1  0.7  0.3  0.3
1  0  0.2  0.8  0.2
2  0  0.8  0.2  0.8
3  1  0.6  0.4  0.4
4  3  0.9  0.1  NaN
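
A runnable sketch of the numpy.select variant, including a quick way to list the rows that matched none of the conditions. The names conditions, choices and unmatched are assumptions made for this example:

```python
import numpy as np
import pandas as pd

# Same data as above, with A == 3 in the last row
df = pd.DataFrame({
    "A": [1, 0, 0, 1, 3],
    "B": [0.7, 0.2, 0.8, 0.6, 0.9],
    "C": [0.3, 0.8, 0.2, 0.4, 0.1],
})

# Conditions are checked in order; the first one that is True picks the choice
conditions = [df["A"] == 0, df["A"] == 1]
choices = [1 - df["C"], 1 - df["B"]]

# Rows that satisfy no condition receive the default (NaN)
df["D"] = np.select(conditions, choices, default=np.nan)

# Rows where A was neither 0 nor 1
unmatched = df[df["D"].isna()]

print(df)
print(unmatched)
```

Adding another case later only means appending one more entry to both lists.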

score 0 · Answer 2 · answered Jul 22 '18 at 13:50

np.select() and np.where() are the way to go.

As one more option, you can also assign through boolean masks with .loc:

df.loc[df.A == 1, 'D'] = 1 - df.B
df.loc[df.A == 0, 'D'] = 1 - df.C
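
A self-contained sketch of the .loc approach. It creates D up front as NaN, which is an extra step not in the snippet above, only to make explicit what happens to rows matching neither mask. The mask names is_one and is_zero are illustrative:

```python
import numpy as np
import pandas as pd

# Data with one row (A == 3) that matches neither condition
df = pd.DataFrame({
    "A": [1, 0, 0, 1, 3],
    "B": [0.7, 0.2, 0.8, 0.6, 0.9],
    "C": [0.3, 0.8, 0.2, 0.4, 0.1],
})

is_one = df["A"] == 1
is_zero = df["A"] == 0

# Start with NaN everywhere so unmatched rows have a defined value
df["D"] = np.nan

# Assignment aligns on the index, so only the masked rows are written
df.loc[is_one, "D"] = 1 - df.loc[is_one, "B"]
df.loc[is_zero, "D"] = 1 - df.loc[is_zero, "C"]

print(df)
```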

rafaelc
