I've got a pandas DataFrame with a column that's a comma-separated string, e.g. 1,2,3,10:
import pandas as pd

# Each row carries a score and a comma-separated string of topic ids.
data = [
    {'id': 1, 'score': 9, 'topics': '11,22,30'},
    {'id': 2, 'score': 7, 'topics': '11,18,30'},
    {'id': 3, 'score': 6, 'topics': '1,12,30'},
    {'id': 4, 'score': 4, 'topics': '1,18,30'},
]
df = pd.DataFrame(data)
I'd like to get a count and a mean score for each value in topics. So:
topic_id,count,mean
1,2,5
11,2,8
12,1,6
et cetera. How can I do this?
I've got as far as:
df['topic_ids'] = df.topics.str.split(',')
But now I guess I want to explode topic_ids out, so there's a column for each unique value in the entire set of values...?
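For what it's worth, here is a rough sketch of the direction I think this is heading. It assumes pandas 0.25 or later, where DataFrame.explode is available, and it splits on ',' explicitly because str.split() with no argument splits on whitespace. It produces one row per (id, topic) pair rather than one column per topic, which seems to be enough for a groupby. The names long_df and result are just placeholders of mine.

```python
import pandas as pd

data = [
    {'id': 1, 'score': 9, 'topics': '11,22,30'},
    {'id': 2, 'score': 7, 'topics': '11,18,30'},
    {'id': 3, 'score': 6, 'topics': '1,12,30'},
    {'id': 4, 'score': 4, 'topics': '1,18,30'},
]
df = pd.DataFrame(data)

# Split on commas so each cell holds a list of topic id strings.
df['topic_ids'] = df['topics'].str.split(',')

# explode() repeats each row once per list element; cast to int so
# the topics sort numerically (1, 11, 12, ...) instead of as text.
long_df = df.explode('topic_ids').astype({'topic_ids': int})

# Count rows and average the score within each topic.
result = (
    long_df.groupby('topic_ids')['score']
    .agg(['count', 'mean'])
    .reset_index()
    .rename(columns={'topic_ids': 'topic_id'})
)
print(result)
```

On the sample data this gives count 2 and mean 5 for topic 1, count 2 and mean 8 for topic 11, and count 1 and mean 6 for topic 12, which matches the output I described above.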