Given the DataFrame
import pandas as pd
df = pd.DataFrame({
'transformed': ['left', 'right', 'left', 'right'],
'left_f': [1, 2, 3, 4],
'right_f': [10, 20, 30, 40],
'left_t': [-1, -2, -3, -4],
'right_t': [-10, -20, -30, -40],
})
I want to create two new columns, picking from either left_*
or right_*
depending on the content of transformed
:
df['transformed_f'] = df['right_f'].where(
df['transformed'] == 'right',
df['left_f']
)
df['transformed_t'] = df['right_t'].where(
df['transformed'] == 'right',
df['left_t']
)
And I get the expected result
df
# transformed left_f right_f left_t right_t transformed_f transformed_t
# 0 left 1 10 -1 -10 1 -1
# 1 right 2 20 -2 -20 20 -20
# 2 left 3 30 -3 -30 3 -3
# 3 right 4 40 -4 -40 40 -40
However when I try to do it in one operation I get an unexpected result containing NaN
values
df[['transformed_f', 'transformed_t']] = df[['right_f', 'right_t']].where(
df['transformed'] == 'right',
df[['left_f', 'left_t']]
)
df
# transformed left_f right_f left_t right_t transformed_f transformed_t
# 0 left 1 10 -1 -10 NaN NaN
# 1 right 2 20 -2 -20 20.0 -20.0
# 2 left 3 30 -3 -30 NaN NaN
# 3 right 4 40 -4 -40 40.0 -40.0
Is there a way to use df.where()
on multiple columns at once?