Convert string to columns - Data Frame

Question

I have the data frame below and I intend to use it for a ML regression model.

I want to transform the features into separate columns on the frame, with a 1 if the feature exists for a row or a 0 if it doesn't. This is to train my model.

For example, if the feature is [cubierta], I want to add a new column named feature_1 holding the 1 or 0 for each specific row, and so on for the other features.

The items in the sequence column are ordered sequentially.

Are there existing pandas methods that can help?

Sure, I can run list(df.features) on the features column, but I don't know how to proceed from there.

Check out this post. I think this will do what you are looking for: https://stackoverflow.com/questions/38088652/pandas-convert-categories-to-numbers — erratic_strategist, Nov 29 '18 at 14:47

score 0 · Accepted Answer · answered Nov 29 '18 at 15:10

pd.get_dummies does exactly what you want:

import pandas as pd

df = pd.DataFrame({'district': ['Eixample', 'Sants-Muntuïc'], 'features': ['Cubierta', 'Plaza de coche']})
print(df)

        district        features
0       Eixample        Cubierta
1  Sants-Muntuïc  Plaza de coche

# dtype=int keeps the 1/0 output; recent pandas versions default to True/False
pd.get_dummies(df, columns=['features'], dtype=int)

        district  features_Cubierta  features_Plaza de coche
0       Eixample                  1                        0
1  Sants-Muntuïc                  0                        1
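
If each row actually holds a list of features (the question's [cubierta] and list(df.features) hint at that, but the original frame isn't shown, so this is an assumption), pd.get_dummies on the column won't split the lists. A minimal sketch for that case, using made-up data, is to join each list into a delimited string and call Series.str.get_dummies:

```python
import pandas as pd

# Hypothetical data: 'features' holds a list per row (assumed, not taken from the question).
df = pd.DataFrame({
    'district': ['Eixample', 'Sants-Muntuïc'],
    'features': [['Cubierta', 'Plaza de coche'], ['Cubierta']],
})

# Join each list into one '|'-separated string, then split it back into 0/1 indicator columns.
dummies = df['features'].str.join('|').str.get_dummies(sep='|').add_prefix('features_')

# Replace the original list column with the indicator columns.
result = pd.concat([df.drop(columns='features'), dummies], axis=1)
print(result)
```

This assumes no feature name contains the '|' character; pick another separator if one does.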

Salut :)
