
I need to get a pd.Series of single values from the Series.mode function. Example code:

import pandas as pd

df = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5'],
                   'key': [0, 1, 2, 3, 3, 3]})

modes = df.groupby('key')['A'].agg(pd.Series.mode)
key A
0   A0
1   A1
2   A2
3   ['A3' 'A4' 'A5']

The problem is the row for key 3: it returns a numpy.ndarray. How should I modify my script to get a single value in every row? Any one of the mode values A3, A4, A5 is fine for me.

Smirnoff
  • If you don't really care about which value in the case of ties, `mode` is essentially a groupby + size + sort + drop_duplicates: https://stackoverflow.com/questions/68533960/multiple-modes-for-multiple-accounts-in-python/68534232#68534232 – ALollz Feb 15 '22 at 14:58
  • Yes, this code helps me absolutely! test.groupby('account')['category'].agg( lambda x: np.random.choice(x.mode(dropna=False))) – Smirnoff Feb 21 '22 at 09:16
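The two ideas in the comments above can be sketched as follows, assuming any one of the tied modes is acceptable. The names `first_mode`, `counts`, and `by_count` are illustrative only. The first variant takes the first entry of `Series.mode()` (which returns its values sorted) instead of a random one; the second follows the groupby + size + sort + drop_duplicates outline.

```python
import pandas as pd

df = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5'],
                   'key': [0, 1, 2, 3, 3, 3]})

# Variant 1: Series.mode() returns the modes sorted, so .iloc[0]
# deterministically picks the smallest of the tied values.
# Assumption: every group has at least one non-null value; an all-NaN
# group would give an empty mode() and .iloc[0] would raise IndexError.
first_mode = df.groupby('key')['A'].agg(lambda s: s.mode().iloc[0])

# Variant 2: count each (key, A) pair, sort by count descending,
# then keep only the first (most frequent) row per key.
counts = df.groupby(['key', 'A']).size().reset_index(name='n')
by_count = (counts.sort_values('n', ascending=False)
                  .drop_duplicates('key')
                  .set_index('key')['A']
                  .sort_index())

print(first_mode)
print(by_count)
```

Both results have exactly one value per key. With the second variant, which of the tied values survives depends on the sort order, so it should not be relied on when the choice matters.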

1 Answer


You could explode the output, which gives one row per mode value with the group key repeated in the index:

modes = df.groupby('key')['A'].agg(pd.Series.mode).explode()

output:

key
0    A0
1    A1
2    A2
3    A3
3    A4
3    A5
Name: A, dtype: object
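
If a single value per key is wanted, as the question asks, the exploded result can be reduced by dropping the repeated index labels. This is a minimal sketch building on the code above, assuming the first of the tied modes is acceptable; `exploded` and `single` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5'],
                   'key': [0, 1, 2, 3, 3, 3]})

# One row per mode value; key 3 appears three times in the index.
exploded = df.groupby('key')['A'].agg(pd.Series.mode).explode()

# Keep only the first row for each repeated index label.
single = exploded[~exploded.index.duplicated(keep='first')]

print(single)
```

`Index.duplicated(keep='first')` marks every repeat after the first occurrence, so the boolean mask leaves one row per key.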
mozway