Say I have a pandas column like the one below:

Type
type1
type2
type3

Now I create dummies for it as follows:
type_dummies = pd.get_dummies(df["Type"], prefix="type")

Then, after joining it with the main DataFrame, the resulting df would look something like this:

df.drop(['Type'], axis=1, inplace=True)
df = df.join(type_dummies)
df.head()

type_type1    type_type2    type_type3
   1              0             0
   0              1             0
   0              0             1

But what if my training set contains another category, type4, in the Type column? How can I use get_dummies() to generate as many dummy variables as I want? That is, in this case I want to generate 4 dummy variables even though the column contains only 3 categories.

Ashan Priyadarshana
Hmm, yes, I saw that earlier, but the answers there were not as clear to me as the one given below by @Wen. So I asked anyway and got a good, simple answer. Thanks for pointing it out anyway. – Ashan Priyadarshana Jan 30 '18 at 19:26

1 Answer

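You can use the category data type:

# Newer pandas versions no longer accept categories= in astype; pass a CategoricalDtype instead
df.Type = df.Type.astype(pd.CategoricalDtype(categories=['type1','type2','type3','type4']))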
df
Out[200]: 
    Type
0  type1
1  type2
2  type3
pd.get_dummies(df["Type"], prefix="type")
Out[201]: 
   type_type1  type_type2  type_type3  type_type4
0           1           0           0           0
1           0           1           0           0
2           0           0           1           0
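
For reference, here is a self-contained sketch of the same idea that should run on recent pandas versions. The example frame and the `all_types` list are assumptions made up for illustration, and `dtype=int` is passed only so the result shows 0/1 values. It also shows one possible alternative, reindexing the dummy columns against the expected column names, which is not part of the answer above.

```python
import pandas as pd

# Hypothetical data: only three of the four known categories are present.
df = pd.DataFrame({"Type": ["type1", "type2", "type3"]})

# Assumed full list of categories, including the one missing from this data.
all_types = ["type1", "type2", "type3", "type4"]

# Approach 1: declare every category up front, so get_dummies emits
# a column for each one, including the unobserved type4.
typed = df["Type"].astype(pd.CategoricalDtype(categories=all_types))
dummies = pd.get_dummies(typed, prefix="type", dtype=int)

# Approach 2 (alternative): build the dummies from the raw column, then
# reindex against the expected column names, filling missing ones with 0.
expected_cols = [f"type_{t}" for t in all_types]
dummies_reindexed = (
    pd.get_dummies(df["Type"], prefix="type", dtype=int)
    .reindex(columns=expected_cols, fill_value=0)
)

print(dummies)
print(dummies_reindexed)
```

Either way, the same category list can be reused on the training and test sets so that both end up with identical dummy columns.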
BENY