I need your help. A dataframe stores the probabilities of three categories, as follows:
import pandas as pd

dict_test = {'series': [1, 2, 3, 4, 5, 6, 7],
             'cat_1': [.02, .02, .81, .72, .01, .3, .45],
             'cat_2': [.02, .02, .14, .2, .99, .45, .4],
             'cat_3': [.96, .96, .05, .08, .00, .25, .15]}
df = pd.DataFrame(dict_test)
I need to create a new column to store which category has the highest probability. What I've been able to do so far is select the highest probability using the agg function:
df['choice'] = df.drop('series', axis=1).agg('max', axis=1)
The result I need is exemplified with this dataframe:
dict_test = {'series': [1, 2, 3, 4, 5, 6, 7],
'cat_1': [.02, .02, .81, .72, .01, .3, .45],
'cat_2': [.02, .02, .14, .2, .99, .45, .4],
'cat_3': [.96, .96, .05, .08, .00, .25, .15],
'result': ['cat_3', 'cat_3', 'cat_1', 'cat_1', 'cat_2', 'cat_2', 'cat_1']}
df = pd.DataFrame(dict_test)
Any suggestions?
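One direction that might work here, as a minimal sketch: `DataFrame.idxmax(axis=1)` returns, for each row, the label of the column that holds the largest value, which seems to be what the `result` column is meant to contain. The `prob_cols` list is just a name picked for this illustration, and the sketch assumes that a tie should go to the first column in that list (that is how `idxmax` breaks ties).

```python
import pandas as pd

df = pd.DataFrame({
    'series': [1, 2, 3, 4, 5, 6, 7],
    'cat_1': [.02, .02, .81, .72, .01, .3, .45],
    'cat_2': [.02, .02, .14, .2, .99, .45, .4],
    'cat_3': [.96, .96, .05, .08, .00, .25, .15],
})

# Hypothetical helper name: the columns that hold probabilities.
prob_cols = ['cat_1', 'cat_2', 'cat_3']

# Label of the column with the highest probability in each row
# (on a tie, the first matching column in prob_cols is returned).
df['result'] = df[prob_cols].idxmax(axis=1)

# The highest probability itself, equivalent to the agg-based attempt.
df['choice'] = df[prob_cols].max(axis=1)

print(df)
```

Selecting the probability columns explicitly, rather than dropping `series`, keeps the lookup from picking up other columns such as `choice` if they are added to the dataframe later.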