
I have a dataframe where I've removed unwanted values and replaced them with NaN. I now want to consolidate my columns into one column that contains only the valid values:

Name    Set_1   Set_2   Set_3
Task1   A       NaN     NaN 
Task2   NaN     B       NaN 
Task3   NaN     NaN     C

I don't want to drop entire rows containing NaN, as they also contain valid values; instead I want to drop the NaN values and consolidate the data so it looks like this:

Name    Set
Task1   A
Task2   B
Task3   C

I can do this manually for each column by selecting the rows where that column has a valid value, creating a new dataframe, and then dropping the other columns:

df2 = df.loc[df['Set_1'].notna()]
df2 = df2.drop(columns=["Set_2", "Set_3"])

and then concatenating the resulting dataframes, but this is not really practical at scale. I'm looking for a more efficient way to do this, as my real dataset has a lot more columns.
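For reference, the manual per-column approach described above can be expressed as a loop over the Set_* columns followed by a concatenation (a sketch assuming the exact frame shown; the column list would need to be maintained by hand, which is what makes it impractical):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Task1', 'Task2', 'Task3'],
    'Set_1': ['A', np.nan, np.nan],
    'Set_2': [np.nan, 'B', np.nan],
    'Set_3': [np.nan, np.nan, 'C'],
})

# One sub-frame per Set_* column, keeping only the rows where
# that column holds a valid (non-NaN) value
parts = [
    df.loc[df[col].notna(), ['Name', col]].rename(columns={col: 'Set'})
    for col in ['Set_1', 'Set_2', 'Set_3']
]
result = pd.concat(parts).sort_values('Name').reset_index(drop=True)
print(result)
```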

ka4c

2 Answers


You can use melt + groupby/first:

(df.melt(id_vars='Name', value_name='Set')
   .groupby('Name')
   ['Set'].first()
   .reset_index()
)

output:

    Name Set
0  Task1   A
1  Task2   B
2  Task3   C
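The reason this works: melt stacks all the Set_* columns into long form, and GroupBy.first returns the first non-null value in each group, so the NaN cells drop out automatically. A minimal reproduction, assuming the frame from the question:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Task1', 'Task2', 'Task3'],
    'Set_1': ['A', np.nan, np.nan],
    'Set_2': [np.nan, 'B', np.nan],
    'Set_3': [np.nan, np.nan, 'C'],
})

# Long form: one row per (Name, Set_* column) pair
long = df.melt(id_vars='Name', value_name='Set')
# first() skips NaN within each group, keeping the single valid value
out = long.groupby('Name')['Set'].first().reset_index()
print(out)
```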
mozway

Try with bfill:

>>> df.bfill(axis=1).dropna(axis=1)
    Name Set_1
0  Task1     A
1  Task2     B
2  Task3     C
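Note that this leaves the surviving column named Set_1; to match the desired output exactly, a rename can be chained on (a small follow-up sketch, assuming the same frame as in the question):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Task1', 'Task2', 'Task3'],
    'Set_1': ['A', np.nan, np.nan],
    'Set_2': [np.nan, 'B', np.nan],
    'Set_3': [np.nan, np.nan, 'C'],
})

# bfill(axis=1) pulls each row's first valid value leftwards into Set_1;
# dropna(axis=1) then drops the columns that still contain NaN
out = df.bfill(axis=1).dropna(axis=1).rename(columns={'Set_1': 'Set'})
print(out)
```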
not_speshal