I have a Series with values as lists of varying elements. Value count shows like this.
category count
[Radiometric] 76
[Ozone] 59
[Aerosol] 53
[Cryosphere] 31
[Atmospheric State, Cloud Properties] 29
[Atmospheric State, Radiometric, Surface Properties] 8
[POPs] 8
[Atmospheric State, Cloud Properties, Radiometric] 7
I want to create columns for each of this category and mark True/False for each row.
e.g.
index Aerosol Cloud Properities Radiometric ......
1 TRUE FALSE TRUE
2 FALSE TRUE TRUE
3
4
I managed to get the unique list of these categories from all items. I also could make it into separate columns by using method given in solution here.
But in my case data is incomplete/varied and thus give me a DF like below
1 2 3 4 5
25 Reactive Gas Surface Properties None None None
28 Aerosol Ozone Atmospheric State Cloud Properties None
59 Surface Properties Cryosphere None None None
68 Atmospheric State Cloud Properties None None None
73 Atmospheric State Radiometric None None None
is there a way using pandas or other python tools to covert it into the required output. I am now working with pandas.pivot_table with hint from this solution. I am using column 1 as the columns (assuming it has all the categories), but gives me a multi-index DF for each column.
pvt = tmp.pivot_table(index=tmp.index,columns="1",aggfunc="count")
Need help on how can I get to the boolean matrix/df as mentioned above.