1

I have several category dummies that are mutually exclusive

id  cat1 cat2 cat3
A    0    0    1
B    1    0    0
C    1    0    0
D    0    0    1
E    0    1    0
F    0    0    1
..

I want to create a new column that contains all categories

id  cat1 cat2 cat3 type
A    0    0    1   cat3
B    1    0    0   cat1
C    1    0    0   cat1
D    0    0    1   cat3
E    0    1    0   cat2
F    0    0    1   cat3
..
Olive
  • 644
  • 4
  • 12

2 Answers2

2

You can use pandas.from_dummies and filter to select the columns starting with "cat":

df['type'] = pd.from_dummies(df.filter(like='cat'))

Output:

  id  cat1  cat2  cat3  type
0  A     0     0     1  cat3
1  B     1     0     0  cat1
2  C     1     0     0  cat1
3  D     0     0     1  cat3
4  E     0     1     0  cat2
5  F     0     0     1  cat3
mozway
  • 194,879
  • 13
  • 39
  • 75
  • 1
    This is great - I never used this before, had been using the sklearn [OneHotEncoder](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html) and variations of [its inverse](https://stackoverflow.com/questions/22548731/how-to-reverse-sklearn-onehotencoder-transform-to-recover-original-data) thanks! – Thomas Kimber Dec 05 '22 at 10:00
0

Use DataFrame.dot with DataFrame.filter for column with cat substring, if multiple 1 per rows are separated by ,:

m = df.filter(like='cat').eq(1)
#all columns without first
#m = df.iloc[:, 1:].eq(1)
df['type'] = m.dot(m.columns + ',').str[:-1]
print (df)
  id  cat1  cat2  cat3  type
0  A     0     0     1  cat3
1  B     1     0     0  cat1
2  C     1     0     0  cat1
3  D     0     0     1  cat3
4  E     0     1     0  cat2
5  F     0     0     1  cat3
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252