categorical variables to binary variables

Question

I have a DataFrame that looks like this: (image: initial dataframe)

I have different tags in the 'Concepts_clean' column and I want to automatically fill the other columns like so: (image: resulting dataframe)

For example, in the fourth row the 'Concepts_clean' column contains ['Accueil Amabilité', 'Tarifs'], so I want to fill the 'Accueil Amabilité' and 'Tarifs' columns with ones and all the others with zeros.

What is the most effective way to do it?

Thank you

Well, not exactly: I want ones wherever a tag listed in "Concepts_clean" is present. I don't know if that's clear. — Toto Tata, May 31 '18 at 09:01
Hi. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. — jezrael, May 31 '18 at 09:02
[Please don't post images of code (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) — jezrael, May 31 '18 at 09:02

score 0 · Accepted Answer · answered May 31 '18 at 09:15

This is more of an n-hot (multi-hot) encoding problem:

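>>> def change_df(x):
...     # Assumes 'Concepts_clean' holds strings such as "[Tarifs, Informations]";
...     # if it holds real lists, iterate over x['Concepts_clean'] directly.
...     for i in x['Concepts_clean'].strip('[]').split(','):
...         tag = i.strip().strip("'\"")
...         if tag:  # skip the empty string produced by "[]"
...             x[tag] = 1
...     return x
...
>>> # Tags absent from a row come back as NaN, so fill them with 0
>>> df.apply(change_df, axis=1).fillna(0)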

Example Output

Concepts_clean          Ecoute  Informations  Tarifs
[Tarifs]                 0.0           0.0     1.0
[]                       0.0           0.0     0.0
[Ecoute]                 1.0           0.0     0.0
[Tarifs, Informations]   0.0           1.0     1.0
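As an alternative to the row-wise `apply` above, here is a minimal vectorized sketch using `Series.str.join` and `Series.str.get_dummies`. It is not part of the original answer. It assumes 'Concepts_clean' holds real Python lists of strings and that no tag contains the `|` character; the sample data is made up, since the original DataFrame was only posted as an image.

```python
import pandas as pd

# Hypothetical sample data standing in for the DataFrame from the question.
df = pd.DataFrame({
    "Concepts_clean": [["Tarifs"], [], ["Ecoute"], ["Tarifs", "Informations"]]
})

# Join each list into one "|"-delimited string, then build one indicator
# column per distinct tag in a single vectorized call.
dummies = (
    df["Concepts_clean"]
    .str.join("|")
    .str.get_dummies(sep="|")
    .astype(int)  # cast so the indicators are plain 0/1 integers
)

# Attach the indicator columns next to the original column.
result = pd.concat([df, dummies], axis=1)
print(result)
```

With this sample data the indicator columns match the example output above, as integers rather than floats. If the column holds strings instead of lists, they would first need to be parsed into lists (for example with `ast.literal_eval`, when they are valid Python literals).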
