I have this data, a list of [feature, value] pairs that I load into a DataFrame:

df_stack = ([['age', 0.0417084448844436],
 ['age', 0.0417084448844436],
 ['sex', 0.0506801187398187],
 ['sex', 0.0506801187398187],
 ['bp', 0.1506801187],
 ['bp', 0.1506801187]])

I need to get this result

     age         bp       sex
0   0.041708    0.15068   0.05068   
1   0.041708    0.15068   0.05068

When I pivot the table with:

pd.DataFrame(df_stack).pivot(columns=[0], values=[1])

I get this result

    1
    age         bp      sex
0   0.041708    NaN     NaN
1   0.041708    NaN     NaN
2   NaN         NaN     0.05068
3   NaN         NaN     0.05068
4   NaN        0.15068    NaN
5   NaN        0.15068    NaN
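
The NaN values presumably appear because, without an `index` argument, `pivot` keeps the original row labels 0 to 5, so every value lands on its own row. Below is a minimal sketch of one possible way around this, assuming each feature occurs the same number of times: number the occurrences of each feature with `groupby(...).cumcount()` and use that counter as the index. The column names `feature`, `value` and `occurrence` are hypothetical, introduced only for this illustration.

```python
import pandas as pd

df_stack = [['age', 0.0417084448844436],
            ['age', 0.0417084448844436],
            ['sex', 0.0506801187398187],
            ['sex', 0.0506801187398187],
            ['bp', 0.1506801187],
            ['bp', 0.1506801187]]

# Hypothetical column names, chosen only for readability.
long_df = pd.DataFrame(df_stack, columns=['feature', 'value'])

# Number each occurrence within its feature: 0, 1, 0, 1, 0, 1.
long_df['occurrence'] = long_df.groupby('feature').cumcount()

# With the occurrence counter as the index, values line up row by row.
wide = long_df.pivot(index='occurrence', columns='feature', values='value')
print(wide)
```

If the features had different numbers of occurrences, the shorter columns would still contain NaN in the trailing rows.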

I was thinking about iterating over the values of each column, keeping the non-NaN ones, and concatenating the columns afterwards:

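import math

# test is assumed to be the pivoted frame from above;
# scalar arguments give flat column names such as 'age'
test = pd.DataFrame(df_stack).pivot(columns=0, values=1)

age = []
for _, row in test.iterrows():
    # keep only the non-NaN entries of the 'age' column
    if not math.isnan(row['age']):
        age.append(row['age'])
age
[0.0417084448844436, 0.0417084448844436]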
 

But I would have to repeat this procedure for each column, which is impractical when we have 40 features. The problem is that I can't use a column name as a variable; otherwise I could append to a separate list for each column name.
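
The per-column loop described above could probably be generalized without naming any column by hand. This is a rough sketch, not a definitive solution: it assumes `test` is the pivoted frame built with scalar `columns=0, values=1` arguments, and that every feature has the same number of non-NaN values. The name `compact` is hypothetical.

```python
import pandas as pd

df_stack = [['age', 0.0417084448844436],
            ['age', 0.0417084448844436],
            ['sex', 0.0506801187398187],
            ['sex', 0.0506801187398187],
            ['bp', 0.1506801187],
            ['bp', 0.1506801187]]

# Assumed definition of `test`: the pivoted frame with flat column names.
test = pd.DataFrame(df_stack).pivot(columns=0, values=1)

# For every column, drop the NaN entries and renumber the rows from 0,
# then assemble the shortened columns into one frame.
compact = pd.DataFrame({
    name: test[name].dropna().reset_index(drop=True)
    for name in test.columns
})
print(compact)
```

Iterating over `test.columns` stands in for the variable column name, so the same code would apply to 40 features as well as to 3.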

If someone has a better idea, that would help me a lot.

Carlos Carvalho