I have a dataset (originally shown as an image) and I want to find the most popular genre for each year. My plan is to first group the rows by year and genre, then select the most frequent genre for each year.
I was able to group by year and genre and find the counts using the following code:
x = df.groupby(['release_year','genres']).count()['id']
where id is just an arbitrary column I used to count the rows for each genre.
This gives the following result:
release_year  genres
1960          Action               5
              Adventure            5
              Comedy               7
              Crime                2
              Drama               10
                                 ...
2015          Science Fiction     54
              TV Movie             6
              Thriller           103
              War                  6
              Western              4
Name: id, Length: 1000, dtype: int64
My problem is that I am unable to select the max for each year. Can somebody help me do that?
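To make the goal concrete, here is a minimal sketch of one way the per-year maximum could be selected, assuming the counts are held in a Series with a (release_year, genres) MultiIndex as in the output above. The small DataFrame is hypothetical toy data standing in for the real dataset, and the names top_labels and top_per_year are only illustrative.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset (values are made up).
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "release_year": [1960, 1960, 1960, 2015, 2015, 2015, 2015],
    "genres": ["Drama", "Drama", "Comedy",
               "Thriller", "Thriller", "Thriller", "War"],
})

# Same counting step as in the question: one count per (release_year, genres) pair.
x = df.groupby(["release_year", "genres"]).count()["id"]

# Within each year, idxmax gives the full (release_year, genres) index label
# of the largest count. On ties it keeps the first label it encounters.
top_labels = x.groupby(level="release_year").idxmax()

# Selecting those labels leaves one row per year: the top genre and its count.
top_per_year = x.loc[top_labels.tolist()]
print(top_per_year)
```

Because idxmax keeps only the first maximum, a year with two equally frequent genres would show just one of them; if ties matter, comparing each count against its year's maximum (for example with groupby(level="release_year").transform("max")) might be a better fit.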