1

I have the data frame shown below. It is sorted so that 'POP' is in descending order with respect to each 'STATE'. Now I want to sum the largest three values of 'POP' with respect to each 'STATE', how should I do this?

import pandas as pd
d = [['X','q',123383],['X','w',43857349],['X','e',236657],['X','r',23574594],
        ['Y','t',547853],['Y','y',46282134],['Y','u',43857439],['Y','i',32654893],['Y','i',95678312]]
df = pd.DataFrame(d, columns = ['STATE','COUNTY','POP'])
df.sort_values(['STATE','POP'], ascending=[True, False]).set_index(['STATE','COUNTY'])

print(sorted_df)

# sorted_df:
                   POP
STATE COUNTY          
X     w       43857349
      r       23574594
      e         236657
      q         123383
Y     i       95678312
      y       46282134
      u       43857439
      i       32654893
      t         547853
  • Possible duplicate of [Pandas groupby nlargest sum](https://stackoverflow.com/questions/40390634/pandas-groupby-nlargest-sum) – Quang Hoang Jul 08 '19 at 22:51

2 Answers2

3

There's nlargest that does not require pre-sort:

df.groupby(['STATE']).POP.nlargest(3)

gives you

STATE   
X      1    43857349
       3    23574594
       2      236657
Y      8    95678312
       5    46282134
       6    43857439
Name: POP, dtype: int64

and if you only care for the sum:

df.groupby(['STATE']).POP.nlargest(3).sum(level=0)

gives:

STATE
X     67668600
Y    185817885
Name: POP, dtype: int64
Quang Hoang
  • 146,074
  • 10
  • 56
  • 74
1

Make sure to reassign after you sort the dataframe (perhaps you meant to call the result sorted_df).

Group by the state level (or level=0 given that it is a multi-index of State and County), then apply a lambda taking the head for the top three (given that the data was sorted in descending order by population for each state) and sum the result.

top_n = 3
df = df.sort_values(['STATE','POP'], ascending=[True, False]).set_index(['STATE','COUNTY'])
>>> df.groupby(level='STATE').apply(lambda x: x.head(top_n).sum())
             POP
STATE           
X       67668600  # w: 43857349 + r: 23574594 + e: 236657
Y      185817885  # i: 95678312 + y: 46282134 + u: 43857439
Alexander
  • 105,104
  • 32
  • 201
  • 196