Building on the question "Combining columns and removing NaNs Pandas",
I have a dataframe that looks like this:
col x y z
a1 a NaN NaN
a2 NaN b NaN
a3 NaN c NaN
a4 NaN NaN d
a5 NaN e NaN
a6 f NaN NaN
a7 g NaN NaN
a8 NaN NaN NaN
The cell values are strings and the NaNs are arbitrary null values.
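For anyone who wants to reproduce this, here is a minimal sketch of how the example frame could be built. Using `np.nan` for the nulls is an assumption on my part; in the real data they may be `None` or another missing-value marker.

```python
import numpy as np
import pandas as pd

# Example frame from the question; np.nan stands in for the null values (assumption).
df = pd.DataFrame(
    {
        "col": ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"],
        "x": ["a", np.nan, np.nan, np.nan, np.nan, "f", "g", np.nan],
        "y": [np.nan, "b", "c", np.nan, "e", np.nan, np.nan, np.nan],
        "z": [np.nan, np.nan, np.nan, "d", np.nan, np.nan, np.nan, np.nan],
    }
)

# Number of non-null values per row across the subset of columns.
non_null_per_row = df[["x", "y", "z"]].notna().sum(axis=1)
```

Every row has exactly one non-null value among `x`, `y`, `z`, except the last, which has none.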
I would like to combine these columns into a single new column, like this:
col w
a1 a
a2 b
a3 c
a4 d
a5 e
a6 f
a7 g
a8 NaN
The elegant solution proposed in the question above uses
df['w']=df[['x','y','z']].sum(axis=1)
but sum does not work for non-numerical values.
How do I combine the columns into a single column when the values are strings?
You can assume:
- Each row has only one of x, y, z that is non-null.
- The individual columns must be referenced by name (since they are a subset of all of the available columns in the dataframe).
- In general there are N and not just 3 columns in the subset.
- Hopefully no use for iloc/for loops :\
Update: (apologies to those who have already given answers :\ )
- I have added a final row where every column contains NaN, and I would like the combined row to reflect that. Thanks + sorry!
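To make the target concrete, including the all-NaN row, here is a rough sketch of the result I am after. It chains `Series.combine_first` over the named columns with `functools.reduce`. This is only one possible way to express it, not necessarily the best one. The list name `cols` and the use of `np.nan` for the nulls are assumptions made for the illustration.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Example frame; np.nan stands in for the null values (assumption).
df = pd.DataFrame(
    {
        "col": ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"],
        "x": ["a", np.nan, np.nan, np.nan, np.nan, "f", "g", np.nan],
        "y": [np.nan, "b", "c", np.nan, "e", np.nan, np.nan, np.nan],
        "z": [np.nan, np.nan, np.nan, "d", np.nan, np.nan, np.nan, np.nan],
    }
)

# Columns to combine, referenced by name; works for N columns, not just 3.
cols = ["x", "y", "z"]

# Start from the first column and fill its nulls from each later column in turn.
df["w"] = reduce(
    lambda acc, name: acc.combine_first(df[name]),
    cols[1:],
    df[cols[0]],
)
```

Rows where every column in `cols` is null stay null in `w`, which is the behaviour I want for the last row.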
Thanks as ever for all help