pandas get minimum of one column in group when groupby another

Question

I have a pandas dataframe that looks like this:

      c     y
0     9     0
1     8     0
2     3     1
3     6     2
4     1     3
5     2     3
6     5     3
7     4     4
8     0     4
9     7     4

I'd like to groupby y and get the min and max of c so that my new dataframe would look like this:

      c     y     min   max
0     9     0     8     9
1     8     0     8     9
2     3     1     3     3   
3     6     2     6     6 
4     1     3     1     5
5     2     3     1     5
6     5     3     1     5
7     4     4     0     7
8     0     4     0     7
9     7     4     0     7

I tried using df['min'] = df.groupby(['y'])['c'].min() but that gave me some weird results. The first 175 rows were populated in the min column but then it went to NaN for all the rest. is that not how you're supposed to use the groupby method?

I have a related question - say instead of min and max, you want to find out which index is minimum for each group of `y`, or which value of another column is minimum for each group of `y`. — Gaurav Singhal, Apr 11 '23 at 20:32

score 36 · Accepted Answer · answered Jun 28 '18 at 04:57

Option 1 Use transform

In [13]: dfc = df.groupby('y')['c']

In [14]: df.assign(min=dfc.transform(min), max=dfc.transform(max))
Out[14]:
   c  y  max  min
0  9  0    9    8
1  8  0    9    8
2  3  1    3    3
3  6  2    6    6
4  1  3    5    1
5  2  3    5    1
6  5  3    5    1
7  4  4    7    0
8  0  4    7    0
9  7  4    7    0

Or

In [15]: df['min' ] = dfc.transform('min')

In [16]: df['max' ] = dfc.transform('max')

Option 2 Use join and agg

In [30]: df.join(df.groupby('y')['c'].agg(['min', 'max']), on='y')
Out[30]:
   c  y  min  max
0  9  0    8    9
1  8  0    8    9
2  3  1    3    3
3  6  2    6    6
4  1  3    1    5
5  2  3    1    5
6  5  3    1    5
7  4  4    0    7
8  0  4    0    7
9  7  4    0    7

Option 3 Use merge and agg

In [28]: df.merge(df.groupby('y')['c'].agg(['min', 'max']), right_index=True, left_on='y')
Out[28]:
   c  y  min  max
0  9  0    8    9
1  8  0    8    9
2  3  1    3    3
3  6  2    6    6
4  1  3    1    5
5  2  3    1    5
6  5  3    1    5
7  4  4    0    7
8  0  4    0    7
9  7  4    0    7

Using the group by pipe `df.groupby('y')['c'].pipe(lambda g: df.assign(min=g.transform('min'), max=g.transform('max')))` — piRSquared, Jun 28 '18 at 05:24

score 5 · Answer 2 · answered Jun 28 '18 at 05:17

With Numpy shenanigans

n = df.y.max() + 1
omax = np.ones(n, df.c.values.dtype) * df.c.values.min()
omin = np.ones(n, df.c.values.dtype) * df.c.values.max()
np.maximum.at(omax, df.y.values, df.c.values)
np.minimum.at(omin, df.y.values, df.c.values)

df.assign(min=omin[df.y], max=omax[df.y])

   c  y  min  max
0  9  0    8    9
1  8  0    8    9
2  3  1    3    3
3  6  2    6    6
4  1  3    1    5
5  2  3    1    5
6  5  3    1    5
7  4  4    0    7
8  0  4    0    7
9  7  4    0    7

pandas get minimum of one column in group when groupby another

2 Answers2

Linked

Related