Hello everyone, rookie coder here.

I have a pandas DataFrame with a date column, an id column, and a column of comma-separated strings, some of which repeat across rows, like this:

id | Date        | interest
-------------------------
 1 | 2016-01-01  | Economic and Financial Affairs, Competition
 2 | 2017-05-17  | Energy, Environment
 3 | 2017-04-26  | Economic and Financial Affairs, Taxation
 4 | 2017-04-21  | Energy, Taxation
 5 | 2017-05-10  | Competition, Environment

I am trying to find a way to use .pivot_table() to set the dates as the index and the different comma-separated strings as columns, counting their frequency so that I can graph them.

Desired output:

Date       | Econ. and Fin. Affairs | Competition | Energy
----------------------------------------------------------
2016-01-01 | 1                      | 1           | 0
2017-05-17 | 0                      | 0           | 1
2017-04-26 | 1                      | 0           | 0
2017-04-21 | 0                      | 0           | 1

And so on, and so on.

Thank you for your time

piRSquared
Yian

1 Answer

df.set_index('Date').interest.str.get_dummies(sep=', ')
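
For anyone who wants to try the one-liner above, here is a minimal sketch that rebuilds the sample frame from the question and applies it. The variable names `dummies` and `counts` are only illustrative, and the `groupby` step is an extra assumption on my part: it sums the indicators per date in case several rows share a date, which the sample data does not actually need.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "Date": ["2016-01-01", "2017-05-17", "2017-04-26", "2017-04-21", "2017-05-10"],
    "interest": [
        "Economic and Financial Affairs, Competition",
        "Energy, Environment",
        "Economic and Financial Affairs, Taxation",
        "Energy, Taxation",
        "Competition, Environment",
    ],
})
df["Date"] = pd.to_datetime(df["Date"])

# Split each string on ", " and build one 0/1 indicator column per distinct value.
dummies = df.set_index("Date")["interest"].str.get_dummies(sep=", ")

# Optional (assumption): add up the indicators per date and sort chronologically,
# so that repeated dates become frequency counts.
counts = dummies.groupby(level=0).sum().sort_index()

print(counts)
```

Note that the separator must match the data exactly (a comma followed by one space here); otherwise values such as " Taxation" and "Taxation" would end up in separate columns. If matplotlib is installed, `counts.plot()` should be enough to graph the result.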
piRSquared