How to reshape boolean columns from wide to long table

Question

I have a wide dataframe, with several columns of that describe the type of each row using booleans.

df = pd.DataFrame({'ID': [0, 1, 2, 3], 'Name': ['x', 'y', 'z', 'e'],
                   'Type One': [True, False, False, False], 
                   'Type Two': [False, True, True, True]})

which looks like:

   ID Name  Type One  Type Two
0   0    x      True     False
1   1    y     False      True
2   2    z     False      True
3   3    e     False      True

I would prefer that this dataframe be in a long format, such that each row is matched to a Type:

   ID Name  Type  
0   0    x     Type One
1   1    y     Type Two
2   2    z     Type Two
3   3    e     Type Two

Note: I believe that all ID's have only 1 type, (so where Type One is True, Type N must be False).

@ALollz the original post's column names were chosen for simplicity's sake... so I supposed you'd have to list each column name and pass those as an argument to filter()? — Eli, Nov 08 '19 at 18:19
@Eli then in that case just pass a list to select columns by labels instead of using filter: `df[['Type One', 'Type Two']].idxmax(1)` — ALollz, Nov 08 '19 at 18:20

score 2 · Accepted Answer · answered Nov 08 '19 at 18:01

2

This is dot:

s = df[['Type One', 'Type Two']]
s.dot(s.columns)

Output:

0    Type One
1    Type Two
2    Type Two
3    Type Two
dtype: object

answered Nov 08 '19 at 18:01

Quang Hoang

146,074
10
56
74

Thank you! I see that this is essentially dot product. If my original assumption (`all ID's have only 1 type, (so where Type One is True, Type N must be False)`) isn't true, will there be any way of knowing? Will this dot() still work? – Eli Nov 08 '19 at 18:18
`s.sum(axis=1).eq(1).all()` would check that condition for you. Dot still works, but it would return concatenation of all positive column names in that row. – Quang Hoang Nov 08 '19 at 18:21

score 0 · Answer 2 · answered Nov 08 '19 at 18:01

df['Type'] = None
df.loc[df['Type One'], 'Type'] = 'Type One'
df.loc[df['Type Two'], 'Type'] = 'Type Two'
df.drop(columns=['Type One', 'Type Two'], inplace=True)

Output:

   ID Name      Type
0   0    x  Type One
1   1    y  Type Two
2   2    z  Type Two
3   3    e  Type Two

How to reshape boolean columns from wide to long table

2 Answers2