I have the data frame below and I intend to use it for a ML regression model.

I want to turn the features into separate columns in the data frame, with a 1 if the feature exists for a row or a 0 if it doesn't, so that I can train my model.

For example, if the feature is [cubierta], I want to add a new column named feature_1 that holds a 1 or a 0 for each row, and so on for the other features.

The items in the sequence column are ordered, i.e. sequential.

Are there existing pandas methods that can help?

Sure, I can run list(df.features) on the features column, but I don't know how to proceed from there.

data frame

desertnaut
mremane
    Check out this post. I think this will do what you are looking for: https://stackoverflow.com/questions/38088652/pandas-convert-categories-to-numbers – erratic_strategist Nov 29 '18 at 14:47

1 Answer

pd.get_dummies does exactly what you want:

import pandas as pd

df = pd.DataFrame({'district':['Eixample', 'Sants-Muntuïc'], 'features':['Cubierta', 'Plaza de coche']})
print(df)

        district        features
0       Eixample        Cubierta
1  Sants-Muntuïc  Plaza de coche

pd.get_dummies(df, columns=['features'])

        district  features_Cubierta  features_Plaza de coche
0       Eixample                  1                        0
1  Sants-Muntuïc                  0                        1
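
If each cell of features actually holds a list of labels (the [cubierta] notation in the question suggests it might, but the original frame is not shown here), calling get_dummies on the raw column will not split the lists. Below is a minimal sketch under that assumption: it explodes the lists, one-hot encodes the labels, and collapses back to one row per original index. The list-valued cells and the third row ('Gràcia', with no features) are hypothetical and only there for illustration.

```python
import pandas as pd

# Hypothetical data: each cell of 'features' holds a list of labels.
# The third row is made up to show what happens with an empty list.
df = pd.DataFrame({
    'district': ['Eixample', 'Sants-Muntuïc', 'Gràcia'],
    'features': [['Cubierta'], ['Plaza de coche', 'Cubierta'], []],
})

# One row per (original row, label); an empty list becomes NaN.
exploded = df['features'].explode()

# One-hot encode the labels, then take the max per original index so that
# a row with several labels gets a 1 in each matching column.
dummies = (
    pd.get_dummies(exploded, prefix='features', dtype=int)
    .groupby(level=0)
    .max()
)

# Replace the list column with the indicator columns.
result = df.drop(columns='features').join(dummies)
print(result)
```

Passing dtype=int keeps the indicators as 0/1 integers; recent pandas versions return booleans from get_dummies by default.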

Salut :)

yatu