I've got a pandas DataFrame. In this DataFrame I want to modify several columns of some rows. These are the approaches I've attempted.
df[['finalA', 'finalB']] = df[['A', 'B']]
exceptions = df.loc[df.normal == False]
Which works like a charm, but now I want to set the exceptions:
df.loc[exceptions.index, ['finalA', 'finalB']] = \
df.loc[exceptions.index, ['A_except', 'B_except']]
Which doesn't work. So I tried using .ix
from this answer.
df.ix[exceptions.index, ['finalA', 'finalB']] = \
df.ix[exceptions.index, ['A_except', 'B_except']]
Which doesn't work either. Both methods give me NaN
in both finalA
and finalB
for the exceptional rows.
The only way that seems to work is doing it one column at a time:
df.ix[exceptions.index, 'finalA'] = \
df.ix[exceptions.index, 'A_except']
df.ix[exceptions.index, 'finalB'] = \
df.ix[exceptions.index, 'B_except']
What's going on here in pandas? How do I avoid setting the values to the copy that is apparently made by selecting multiple columns? Is there a way to avoid this kind of code repetition?
Some more musings: It doesn't actually set the values to a copy of the dataframe, it sets the values to NaN. It actually overwrites them to a new value.
Sample dataframe:
import pandas as pd
df = pd.DataFrame({'A': [1,2,3,4],
'B': [5,6,7,8],
'normal': [True, True, False, False],
'A_except': [0,0,9,9],
'B_except': [0,0,10,10]})
Result:
A A_except B B_except normal finalA finalB
0 1 0 5 0 True 1.0 5.0
1 2 0 6 0 True 2.0 6.0
2 3 9 7 10 False NaN NaN
3 4 9 8 10 False NaN NaN
Expected result:
A A_except B B_except normal finalA finalB
0 1 0 5 0 True 1 5
1 2 0 6 0 True 2 6
2 3 9 7 10 False 9 10
3 4 9 8 10 False 9 10