I have the following df:
movie_id rating_all
0 tt7653254 [{'age': 'all', 'avg_rating': 8.1, 'count': 109326}, {'age': '<18', 'avg_rating': 8.8, 'count': 318}, {'age': '18-29', 'avg_rating': 8.3, 'count': 29740}, {'age': '30-44', 'avg_rating': 8.0, 'count': 33012}, {'age': '45+', 'avg_rating': 7.7, 'count': 7875}]
1 tt8579674 [{'age': 'all', 'avg_rating': 8.6, 'count': 9420}, {'age': '<18', 'avg_rating': 9.1, 'count': 35}, {'age': '18-29', 'avg_rating': 8.7, 'count': 2437}, {'age': '30-44', 'avg_rating': 8.5, 'count': 2529}, {'age': '45+', 'avg_rating': 8.3, 'count': 960}]
2 tt7286456 [{'age': 'all', 'avg_rating': 8.6, 'count': 592441}, {'age': '<18', 'avg_rating': 9.1, 'count': 2244}, {'age': '18-29', 'avg_rating': 8.7, 'count': 160506}, {'age': '30-44', 'avg_rating': 8.5, 'count': 160158}, {'age': '45+', 'avg_rating': 8.3, 'count': 30451}]
3 tt1302006 [{'age': 'all', 'avg_rating': 8.1, 'count': 187675}, {'age': '<18', 'avg_rating': 8.7, 'count': 461}, {'age': '18-29', 'avg_rating': 8.3, 'count': 41951}, {'age': '30-44', 'avg_rating': 7.9, 'count': 59729}, {'age': '45+', 'avg_rating': 7.8, 'count': 18550}]
4 tt7131622 [{'age': 'all', 'avg_rating': 7.8, 'count': 323152}, {'age': '<18', 'avg_rating': 8.4, 'count': 955}, {'age': '18-29', 'avg_rating': 7.9, 'count': 82133}, {'age': '30-44', 'avg_rating': 7.6, 'count': 95878}, {'age': '45+', 'avg_rating': 7.5, 'count': 26383}]
5 tt8637428 [{'age': 'all', 'avg_rating': 7.7, 'count': 21362}, {'age': '<18', 'avg_rating': 8.0, 'count': 45}, {'age': '18-29', 'avg_rating': 7.9, 'count': 5901}, {'age': '30-44', 'avg_rating': 7.6, 'count': 6492}, {'age': '45+', 'avg_rating': 7.3, 'count': 2133}]
And I want to transform it to something like:
movie_id all_avg all_count <18_avg <18_count 18-29_avg
0 tt7653254 8.1 109326 8.8 318 8.3
1 tt8579674 8.6 9420 9.1 35 8.7
2 tt7286456 8.6 592441 9.1 2244 8.7
3 tt1302006 8.1 187675 8.7 461 8.3
4 tt7131622 7.8 323152 8.4 955 7.9
5 tt8637428 7.7 21362 8 45 7.9
and so on...
I've tried
ratings.set_index('movie_id')['rating_all'].apply(pd.Series).reset_index()
and using
json_normalize(data,
record_path=['rating_all'],
meta=['movie_id']).set_index('movie_id')
but none give something similiar. Is there an easy way to explode the dictionary into columns?
Data to produce the DataFrame:
data = {'movie_id': ['tt7653254', 'tt8579674', 'tt7286456', 'tt1302006', 'tt7131622', 'tt8637428'],
'rating_all': [[{'age': 'all', 'avg_rating': 8.1, 'count': 109326},
{'age': '<18', 'avg_rating': 8.8, 'count': 318},
{'age': '18-29', 'avg_rating': 8.3, 'count': 29740},
{'age': '30-44', 'avg_rating': 8.0, 'count': 33012},
{'age': '45+', 'avg_rating': 7.7, 'count': 7875}],
[{'age': 'all', 'avg_rating': 8.6, 'count': 9420},
{'age': '<18', 'avg_rating': 9.1, 'count': 35},
{'age': '18-29', 'avg_rating': 8.7, 'count': 2437},
{'age': '30-44', 'avg_rating': 8.5, 'count': 2529},
{'age': '45+', 'avg_rating': 8.3, 'count': 960}],
[{'age': 'all', 'avg_rating': 8.6, 'count': 592441},
{'age': '<18', 'avg_rating': 9.1, 'count': 2244},
{'age': '18-29', 'avg_rating': 8.7, 'count': 160506},
{'age': '30-44', 'avg_rating': 8.5, 'count': 160158},
{'age': '45+', 'avg_rating': 8.3, 'count': 30451}],
[{'age': 'all', 'avg_rating': 8.1, 'count': 187675},
{'age': '<18', 'avg_rating': 8.7, 'count': 461},
{'age': '18-29', 'avg_rating': 8.3, 'count': 41951},
{'age': '30-44', 'avg_rating': 7.9, 'count': 59729},
{'age': '45+', 'avg_rating': 7.8, 'count': 18550}],
[{'age': 'all', 'avg_rating': 7.8, 'count': 323152},
{'age': '<18', 'avg_rating': 8.4, 'count': 955},
{'age': '18-29', 'avg_rating': 7.9, 'count': 82133},
{'age': '30-44', 'avg_rating': 7.6, 'count': 95878},
{'age': '45+', 'avg_rating': 7.5, 'count': 26383}],
[{'age': 'all', 'avg_rating': 7.7, 'count': 21362},
{'age': '<18', 'avg_rating': 8.0, 'count': 45},
{'age': '18-29', 'avg_rating': 7.9, 'count': 5901},
{'age': '30-44', 'avg_rating': 7.6, 'count': 6492},
{'age': '45+', 'avg_rating': 7.3, 'count': 2133}]]}