3

For the given dataframe df as:

   Election Yr.  Party   States Votes
0     2000           A       a    50  
1     2000           A       b    30
2     2000           B       a    40
3     2000           B       b    50  
4     2000           C       a    30
5     2000           C       b    40
6     2005           A       a    50  
7     2005           A       b    30
8     2005           B       a    40
9     2005           B       b    50  
10    2005           C       a    30
11    2005           C       b    40

I want to get the Party that got the maximum Votes for a corresponding year. I have used the following code to groupby "Election Year" and "Party" and then .sum() to get the total votes for each party in every year.

df = df.groupby(['Election Yr.', 'Party']).sum()

Now how to get the party with maximum Votes each year? I am unable to get this.

Any support is highly appreciated.

Deepak
  • 59
  • 1
  • 8
  • I think you are looking for the idmax solution in this answer: https://stackoverflow.com/questions/10202570/find-row-where-values-for-column-is-maximal-in-a-pandas-dataframe – user14518362 Jun 09 '21 at 13:09
  • @user14518362 - That is not what OP asked for. "Maximum votes each year". – not_speshal Jun 09 '21 at 13:13
  • I tried that, but it is giving a row for the overall maximum value. But I need rows for maximum value for each year. – Deepak Jun 09 '21 at 13:13

3 Answers3

1

Try using a combination of groupby and idxmax:

gb = df.groupby(["Election Yr.", "Party"]).sum()
gb.loc[gb.groupby("Election Yr.")["Votes"].idxmax()].reset_index()
>>> gb
   Election Yr. Party  Votes
0          2000     B     90
1          2005     B     90
not_speshal
  • 22,093
  • 2
  • 15
  • 30
1

1. Using inner joins

You can start off with df before doing your first groupby. Then you get the maximum number of votes each year and merge on the year-votes combination to get the party that got the most votes per year.

# Original data
df = pd.DataFrame({'Election Yr.':[2000,2000,2000,2000,2000,2000,2005,2005,2005,2005,2005,2005],
                   'Party':['A','A','B','B','C','C','A','A','B','B','C','C',],
                   'Votes':[50,30,40,50,30,40,50,30,40,50,30,40]})

# Get number of votes per year-party
df = df.groupby(['Election Yr.','Party'])['Votes'].sum().reset_index()

# Get max number of votes per year
max_ = df.groupby('Election Yr.')['Votes'].max().reset_index()

# Merge on key
max_ = max_.merge(df, on=['Election Yr.','Votes'])

# Results
print(max_)

>    Election Yr.  Votes Party
> 0          2000     90     B
> 1          2005     90     B

2. Sorting and keeping first observation

Alternatively, you can sort by votes per year:

df = df.groupby(['Election Yr.','Party'])['Votes'].sum().reset_index()
df = df.sort_values(['Election Yr.','Votes'], ascending=False)
print(df.groupby('Election Yr.').first().reset_index())

print(df)

>    Election Yr. Party  Votes
> 0          2000     B     90
> 1          2005     B     90
Arturo Sbr
  • 5,567
  • 4
  • 38
  • 76
  • Thanks, Arturo. Consider a different dataset, How can I get the top 10 parties (in terms of getting maximum votes) for each year? – Deepak Jun 09 '21 at 14:12
  • You can use the second method and try `head(10)` instead of `first()`. Check out [this post](https://stackoverflow.com/questions/20069009/pandas-get-topmost-n-records-within-each-group). – Arturo Sbr Jun 09 '21 at 14:21
  • While using head(10) to print the top 10 parties every year, why is it printing the latest year at the top. I want the older year at the top. How can that be done? – Deepak Jun 09 '21 at 14:37
  • Sort the new data by year: `df.sort_values(Election Yr., ascending=True)`. Make sure to `reset_index()` after `head(10)`. That is: `df.groupby(....).head(10).reset_index()` and then sort again. – Arturo Sbr Jun 09 '21 at 14:42
-1

Here you can see the total number of votes given to each Party (A,B,C) according to Election Yr.

  • Thanks. Consider a different dataset, How can I get the top 10 parties (in terms of getting maximum votes) for each year? – Deepak Jun 09 '21 at 14:20
  • Don't post the images and also do explain what you're doing – Ashok Jun 09 '21 at 20:34