I have a movies dataset in a dataframe with the following columns:
movieid
title
audience_score
rating
runtime
genre
language
The problem is that there are multiple entries with the same movieid, and entries with the same movieid may have different values for rating, runtime, genre, language, etc.
What I want to do is group together the entries that share a movieid and replace each group with a single entry for that movieid. The feature values of this single entry should be derived from the feature values of the group it replaces.
For example, the single entry's rating should be the median of all the ratings in its group, and its genre should be the most frequent value in the genre column of its group.
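To make the desired result concrete, here is a rough sketch in plain Python of what I would compute by hand. The rows are made up for illustration (only the column names come from my dataset), `collapse` is just a hypothetical helper name, and taking the first title in each group is an assumption on my part. I am looking for the dataframe equivalent of this.

```python
from collections import Counter, defaultdict
from statistics import median

# Made-up sample rows; only the column names match my real dataset.
rows = [
    {"movieid": 1, "title": "A", "rating": 7.0, "genre": "Drama"},
    {"movieid": 1, "title": "A", "rating": 8.0, "genre": "Drama"},
    {"movieid": 1, "title": "A", "rating": 9.5, "genre": "Comedy"},
    {"movieid": 2, "title": "B", "rating": 6.0, "genre": "Action"},
    {"movieid": 2, "title": "B", "rating": 5.0, "genre": "Action"},
]


def collapse(rows):
    """Replace each group of rows sharing a movieid with one summary row."""
    # Collect the rows belonging to each movieid.
    groups = defaultdict(list)
    for row in rows:
        groups[row["movieid"]].append(row)

    result = []
    for movieid, group in groups.items():
        result.append({
            "movieid": movieid,
            # Assumption: the title is the same within a group, so take the first.
            "title": group[0]["title"],
            # Numeric column: median of the group's values.
            "rating": median(r["rating"] for r in group),
            # Categorical column: most frequent value in the group.
            "genre": Counter(r["genre"] for r in group).most_common(1)[0][0],
        })
    return result


collapsed = collapse(rows)
for entry in collapsed:
    print(entry)
```

The point is that each column gets its own summary rule: median for numeric columns such as rating, most frequent value for categorical columns such as genre.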
Is there a way to do this? I looked into the group_by function but could not work out how to get this result with it.