I have a Pandas dataframe that looks as follows:
ID Cat
1 SF
1 W
1 F
2 R64
2 SF
2 F
The first column is an identifier, and the second column contains ordered categorical data with the order R64 < SF < F < W.
I want a new dataframe that contains, for each ID, the maximum categorical value according to that order. The resulting dataframe should look as follows:
ID Cat
1 W
2 F
I tried the solution from this thread, but it does not seem to work for categorical data: df.groupby("ID", as_index=False).Cat.max()
The result with this approach looks like this:
ID Cat
1 SF
2 SF
I declare the categorical column like this:
df['Cat'] = pd.Categorical(df['Cat'], categories=["R64", "SF", "F", "W"], ordered=True)
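
For reference, here is a minimal sketch that rebuilds the example data and produces the output I am after. The workaround it uses (sorting by the ordered categorical and keeping the last row per ID) is only one possible approach and is not taken from the linked thread; the variable name `expected` is just for illustration.

```python
import pandas as pd

# Rebuild the example dataframe from the question
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    "Cat": ["SF", "W", "F", "R64", "SF", "F"],
})

# Declare the column as an ordered categorical: R64 < SF < F < W
df["Cat"] = pd.Categorical(
    df["Cat"], categories=["R64", "SF", "F", "W"], ordered=True
)

# Possible workaround (an assumption, not the thread's solution):
# sort_values respects the category order, so after sorting,
# the last row for each ID holds that ID's maximum category.
expected = (
    df.sort_values("Cat")
      .drop_duplicates("ID", keep="last")
      .sort_values("ID")
      .reset_index(drop=True)
)

print(expected)
```

This avoids calling max() on the grouped categorical column at all, so it should not depend on how a given pandas version aggregates categoricals within groupby.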