I want to extract the genre with the highest value count for each year and then plot it as a bar chart to answer the question: which genre is most popular from year to year?
Post a minimum reproducible example that can be copied. Follow the guide [here](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – Toukenize Mar 09 '20 at 06:10
1 Answer
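The first idea is to create a three-column DataFrame with `Series.reset_index`, remove duplicates with `DataFrame.drop_duplicates`, and reshape with `DataFrame.pivot`. Dropping duplicates keeps the first row per year, which is the most frequent genre provided `temp_1` is sorted by count in descending order within each year (as `value_counts` output is):

    # DataFrame.pivot takes keyword-only arguments in pandas 2.0+
    df = (temp_1.reset_index(name='count')
                .drop_duplicates('release_year')
                .pivot(index='release_year', columns='genres', values='count'))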
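
Since the question does not show how `temp_1` was built, here is a minimal sketch on made-up data. The `movies` frame and the `groupby(...).value_counts()` construction of `temp_1` are assumptions for illustration, not taken from the question.

```python
import pandas as pd

# Hypothetical sample data; the real frame is not shown in the question.
movies = pd.DataFrame({
    "release_year": [2014, 2014, 2014, 2015, 2015, 2015, 2016],
    "genres": ["Drama", "Drama", "Comedy", "Action", "Action", "Drama", "Comedy"],
})

# Assumed construction of temp_1: counts per (release_year, genres),
# sorted in descending order within each year by value_counts.
temp_1 = movies.groupby("release_year")["genres"].value_counts()

top = (temp_1.reset_index(name="count")        # columns: release_year, genres, count
             .drop_duplicates("release_year")  # keep the first (largest) row per year
             .pivot(index="release_year", columns="genres", values="count"))
print(top)
```

The result has one row per year and one column per genre, with `NaN` wherever a genre was not that year's most frequent one. If two genres tie within a year, only the one listed first survives `drop_duplicates`.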
Alternatively, remove the duplicated years directly in the MultiIndex with `Index.get_level_values`, `Index.duplicated` and boolean indexing, then reshape with `Series.unstack`, which already returns a DataFrame with one row per year and one column per genre:

    df = (temp_1[~temp_1.index.get_level_values('release_year').duplicated()]
          .unstack())
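
To make the boolean mask in this variant easier to follow, here is a small sketch that stops at the `unstack` step. The hand-built `temp_1` below is hypothetical and only mimics the assumed shape: a Series indexed by (`release_year`, `genres`), sorted by count within each year.

```python
import pandas as pd

# Hypothetical stand-in for temp_1 (values invented for illustration).
index = pd.MultiIndex.from_tuples(
    [(2014, "Drama"), (2014, "Comedy"), (2015, "Action"),
     (2015, "Drama"), (2016, "Comedy")],
    names=["release_year", "genres"],
)
temp_1 = pd.Series([2, 1, 2, 1, 1], index=index, name="count")

# True for every row whose year has already appeared earlier in the index,
# i.e. every row after the first one of each year.
is_repeat = temp_1.index.get_level_values("release_year").duplicated()
print(is_repeat)

# Keep only the first row per year, then move genres into columns.
wide = temp_1[~is_repeat].unstack()
print(wide)
```

Negating the mask keeps exactly one row per year, so `unstack` produces at most one non-missing value in each row.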
Finally, plot with `DataFrame.plot.bar`:

    df.plot.bar()
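
For completeness, a hedged sketch of the plotting step on an invented wide frame. The values, the `Agg` backend choice, and the output file name `top_genre_per_year.png` are all assumptions made so the snippet runs as a standalone script.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import pandas as pd

nan = float("nan")

# Hypothetical wide result: one row per year, one column per genre,
# NaN wherever the genre was not that year's most frequent.
df = pd.DataFrame(
    {"Action": [nan, 2.0, nan], "Comedy": [nan, nan, 1.0], "Drama": [2.0, nan, nan]},
    index=pd.Index([2014, 2015, 2016], name="release_year"),
)

ax = df.plot.bar(rot=0)   # one bar group per year, one color per genre
ax.set_ylabel("count")
ax.figure.savefig("top_genre_per_year.png")  # hypothetical output path
```

Because each year has only one non-missing genre, every group shows a single visible bar, and the legend color tells you which genre it is.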

jezrael