
Given a pandas dataframe with company purchases across various months in a year, how do I find the "N" highest each month?

I currently have:

df.groupby(df['Transaction Date'].dt.strftime('%B'))['Amount'].max()

This returns the highest value for each month, but I would like to see the four highest values.

Am I getting close, or is there a more efficient approach? Thanks in advance.

mkelly
  • Welcome to Stack Overflow! There's a built-in function [dataframe.nlargest()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.nlargest.html) that seems pretty appropriate, but it's hard to answer without sample input and output to make a [mcve] – G. Anderson Feb 10 '20 at 22:49
  • Does this answer your question? [Pandas get topmost n records within each group](https://stackoverflow.com/questions/20069009/pandas-get-topmost-n-records-within-each-group) – AMC Feb 10 '20 at 23:02
  • Have you done any research? See: [ask], https://meta.stackoverflow.com/questions/261592/how-much-research-effort-is-expected-of-stack-overflow-users – AMC Feb 10 '20 at 23:02

1 Answer


Sort by Amount with sort_values, then take the last four rows of each month's group with tail:

yourdf=df.sort_values('Amount').groupby(df['Transaction Date'].dt.strftime('%B'))['Amount'].tail(4)
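
To make this easier to try out, here is a minimal sketch on made-up data. The sample values and the names `month`, `top4_tail` and `top4_nlargest` are assumptions for illustration only; the column names come from the question. It also shows the groupby `nlargest` route mentioned in the comments, which keeps the month as an index level.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Transaction Date": pd.to_datetime([
        "2020-01-03", "2020-01-10", "2020-01-15", "2020-01-20", "2020-01-28",
        "2020-02-02", "2020-02-09", "2020-02-14", "2020-02-21", "2020-02-27",
    ]),
    "Amount": [50, 200, 10, 120, 75, 300, 40, 90, 15, 60],
})

# Month name for each row, used as the grouping key (aligned by index).
month = df["Transaction Date"].dt.strftime("%B").rename("Month")

# Sort ascending, then keep the last 4 rows of each month.
# The result keeps the original row index but carries no month label.
top4_tail = df.sort_values("Amount").groupby(month)["Amount"].tail(4)

# Alternative: nlargest per group; the result is indexed by (Month, original index).
top4_nlargest = df.groupby(month)["Amount"].nlargest(4)

print(top4_nlargest)
```

Both variants select the same rows here. Note that grouping by month name alone would merge the same month from different years, so for data spanning more than one year it may be safer to group by year and month together.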
BENY