I have a df that has one column with multiple comma-separated values in each row. I want to count how many times a unique value occurs in that column.
The df looks like this:
category country
0 widget1, widget2, widget3, widget4 USA
1 widget1, widget3 USA
2 widget1, widget2 China
3 widget2 Canada
4 widget1, widget2, widget3 China
5 widget2 Vietnam
6 widget3 Canada
7 widget1, widget3 USA
8 widget1, widget3 Japan
9 widget2 Germany
I want know how many times each widget appears in the column "category". The results in this example would be:
widget1 = 6, widget2 = 6, widget3 = 6, widget4 = 1
I can use .value_counts
df["category"].value_counts()
but that's only going to return rows that are exactly the same.
I could use value_counts and enter each value for it to count, but in the actual DataFrame there are too many rows and unique values in that column to make it practical.
Also, is there a way to not double count if a single row contains two values that are the same? For example is there was a "widget1, black widget1, yellow widget1" in the same row, I'd just want to count that as one widget1.