I'd like to count multiple values (contained in a list per cell) on a groupBy object.
I have the following dataframe:
| | Record the respondent’s sex | 7. What do you use the phone for? |
|---|-----------------------------|---------------------------------------------|
| 0 | Male | sending texts;calls;receiving sending texts |
| 1 | Female | sending texts;calls;WhatsApp;Facebook |
| 2 | Male | sending texts;calls;receiving texts |
| 3 | Female | sending texts;calls |
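
For reference, here is a minimal sketch of how this dataframe could be reproduced. It assumes the answers are stored as semicolon-separated strings, as shown in the table; the names `SEX` and `USES` are only shorthand introduced here for the two long column headers.

```python
import pandas as pd

# Shorthand for the long survey column names (introduced for this sketch only).
SEX = "Record the respondent’s sex"
USES = "7. What do you use the phone for?"

# The four rows from the table above, copied verbatim.
df = pd.DataFrame({
    SEX: ["Male", "Female", "Male", "Female"],
    USES: [
        "sending texts;calls;receiving sending texts",
        "sending texts;calls;WhatsApp;Facebook",
        "sending texts;calls;receiving texts",
        "sending texts;calls",
    ],
})

# Turn each semicolon-separated string into a list of answers (one list per cell).
df[USES] = df[USES].str.split(";")
```

After the split, every cell in the question column holds a Python list, which is the situation described above.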
I would like to count every value in the column `7. What do you use the phone for?` after grouping on `Record the respondent’s sex`.
I have no problem doing this when there is only one value per cell:

    import pandas as pd

    # Group by sex, then count each answer within each group.
    grouped = df.groupby(['Record the respondent’s sex'], sort=True)
    question_counts = grouped['2. Are you a teacher, caregiver, or young adult ?'].value_counts(normalize=False, sort=True)

    # Flatten the (group, answer) -> count Series into one record per row.
    question_data = [
        {'2. Are you a teacher, caregiver, or young adult ?': question, 'Record the respondent’s sex': group, 'count': count}
        for (group, question), count in question_counts.items()
    ]
    df_question = pd.DataFrame(question_data)

This gives me a table in exactly the shape I want. For question 7, the result I am after would look like this:
| 7. What do you use the phone for? | Record the respondent's sex | count |
|-----------------------------------|-----------------------------|-------|
| sending texts | Male | 2 |
| calls | Male | 2 |
| receiving texts | Male | 2 |
| sending texts | Female | 2 |
| calls | Female | 2 |
| WhatsApp | Female | 1 |
| Facebook | Female | 1 |
If only I could get this working with multiple values!
`value_counts()` doesn't work on cells that contain lists; it throws a `TypeError: unhashable type: 'list'` error. The question "Counting occurrence of values in a Panda series?" shows how to deal with this in various ways, but I can't seem to get any of them to work on a GroupBy object.
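
One direction that might work is sketched below; it is not taken from the linked question. The idea is to flatten the lists with `DataFrame.explode` (available in pandas 0.25 and later) so that each answer gets its own row, and only then group and count. The short column names `sex` and `uses` are hypothetical stand-ins for the real survey headers.

```python
import pandas as pd

# Hypothetical short column names standing in for the long survey headers;
# the list values are the rows from the table at the top of the question.
df = pd.DataFrame({
    "sex": ["Male", "Female", "Male", "Female"],
    "uses": [
        ["sending texts", "calls", "receiving sending texts"],
        ["sending texts", "calls", "WhatsApp", "Facebook"],
        ["sending texts", "calls", "receiving texts"],
        ["sending texts", "calls"],
    ],
})

# One row per (respondent, answer): the list cells become plain strings,
# which are hashable, so value_counts() no longer raises TypeError.
exploded = df.explode("uses")

# Count each answer within each group, then flatten to a regular table.
# rename("count") gives the counts a column name that does not clash with
# the "uses" index level when reset_index() is called.
counts = (
    exploded.groupby("sex", sort=True)["uses"]
    .value_counts()
    .rename("count")
    .reset_index()
)

print(counts)
```

Because the grouping happens after the lists are flattened, the usual GroupBy `value_counts()` call applies unchanged; the trade-off is that `exploded` repeats the other columns once per answer.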