I have a DataFrame that looks like this: [image: initial dataframe]

The 'Concepts_clean' column holds a list of tags for each row, and I want to fill the other columns automatically from it, like so: [image: resulting dataframe]

For example, in the fourth row the 'Concepts_clean' column contains ['Accueil Amabilité', 'Tarifs'], so I want to fill the 'Accueil Amabilité' and 'Tarifs' columns with ones and all the other columns with zeros.

What is the most effective way to do it?

Thank you

Toto Tata
  • Do you need `df = df.fillna(0)`? – jezrael May 31 '18 at 08:57
  • Not exactly: I want a one in each column whose tag appears in 'Concepts_clean' for that row, and a zero everywhere else. I don't know if that's clear. – Toto Tata May 31 '18 at 09:01
  • Hi. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael May 31 '18 at 09:02
  • [Please don't post images of code (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) – jezrael May 31 '18 at 09:02

1 Answer

This is more of an n-hot (multi-hot) encoding problem:

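>>> def change_df(x):
...     # Assumes each 'Concepts_clean' value is a string such as "[Tarifs, Informations]"
...     for i in x['Concepts_clean'].replace('[', '').replace(']', '').split(','):
...         tag = i.strip(" '\"")  # drop surrounding whitespace and quotes
...         if tag:  # "[]" yields an empty tag; skip it
...             x[tag] = 1
...     return x
...
>>> df.apply(change_df, axis=1).fillna(0)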

Example Output

Concepts_clean          Ecoute  Informations  Tarifs
[Tarifs]                 0.0           0.0     1.0
[]                       0.0           0.0     0.0
[Ecoute]                 1.0           0.0     0.0
[Tarifs, Informations]   0.0           1.0     1.0
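
If 'Concepts_clean' holds real Python lists rather than strings, a vectorized variant may be simpler than a row-wise `apply`. The following is only a sketch under that assumption: the sample data is made up to mirror the example output above, and the names `indicators` and `result` are illustrative. It also assumes that no tag contains the `|` character, which is used as a temporary separator.

```python
import pandas as pd

# Hypothetical sample data mirroring the example output above;
# each cell of 'Concepts_clean' is assumed to be a list of tag strings.
df = pd.DataFrame({
    "Concepts_clean": [["Tarifs"], [], ["Ecoute"], ["Tarifs", "Informations"]],
})

# Join each list into one "|"-separated string, then let pandas
# build one 0/1 indicator column per distinct tag.
indicators = df["Concepts_clean"].str.join("|").str.get_dummies(sep="|")

# Attach the indicator columns to the original frame (aligned on the index).
result = df.join(indicators)
print(result)
```

If the column actually contains string representations of lists, such as "['Accueil Amabilité', 'Tarifs']", they would first need converting, for instance with `df['Concepts_clean'].apply(ast.literal_eval)`.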
Vivek Kalyanarangan