Let's say we want to, replace values in A_1
and 'A_2' according to a mask in B_1
and B_2
. For example, replace those values in A
(to 999) that corresponds to nulls in B
.
The original dataframe:
A_1 A_2 B_1 B_2
0 1 4 y n
1 2 5 n NaN
2 3 6 NaN NaN
The desired dataframe
A_1 A_2 B_1 B_2
0 1 4 y n
1 2 999 n NaN
2 999 999 NaN NaN
The code:
df = pd.DataFrame({
'A_1': [1, 2, 3],
'A_2': [4, 5, 6],
'B_1': ['y', 'n', np.nan],
'B_2': ['n', np.nan, np.nan]})
_mask = df[['B_1', 'B_2']].notnull()
df[['A_1', 'A_2']] = df[['A_1','A_2']].where(_mask, other=999)
But I get the following instead:
A_1 A_2 B_1 B_2
0 999 999 y n
1 999 999 n NaN
2 999 999 NaN NaN
So, all of the are replaced, not just the not null. How can I fixed this>
I was following this answer but it only works if I use the same columns:
df[['B_1', 'B_2']] = odf[['B_1','B_2']].where(_mask, other=999)
A_1 A_2 B_1 B_2
0 1 4 y n
1 2 5 n 999
2 3 6 999 999