I'm working with the UK parliamentary election dataset from researchbriefings.parliament.uk, which for some reason has thought it worth including the vote share as a proportion as well as a number, but not worth including which party actually won in each constituency.
With pandas, that should be simple to calculate using pandas.DataFrame.idxmax (as recommended in this question). However, my code refuses to work and I'm not sure why.
Here's the code:
import os
import sys
import pandas as pd
# read in the file
elections = pd.read_csv(os.path.join(sys.path[0], '1918-2017election_results.csv'), encoding='cp1252')
# remove whitespace from column names
elections.rename(columns=lambda x: x.strip(), inplace=True)
# find winning number of votes for each constituency
parties = ['con','lib','lab','natSW','oth']
# build the column labels as a list (a generator is not a reliable column indexer)
vote_cols = [f'{party}_votes' for party in parties]
elections['winning_votes'] = elections[vote_cols].max(axis=1)
# find winning party for each constituency
elections['winning_party'] = elections[vote_cols].idxmax(axis=1)
And this is the error I get:
TypeError: reduction operation 'argmax' not allowed for this dtype
Please tell me what I'm doing wrong or how I can use an alternative pythonic method of finding the winning party for each constituency. Thanks!
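One possible explanation, sketched on invented data: this error usually appears when the selected columns have object dtype, which can happen if some vote cells hold text or blanks rather than numbers. Checking `elections[vote_cols].dtypes` would confirm or rule this out. The toy DataFrame below is hypothetical and not taken from the real file; it only shows the idea of coercing the vote columns to numeric before calling idxmax.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV; all values are invented.
# The vote counts are strings (one cell is non-numeric), so the columns are object dtype.
elections = pd.DataFrame({
    'constituency': ['A', 'B', 'C'],
    'con_votes': ['1200', '300', '450'],
    'lib_votes': ['800', '900', 'n/a'],
    'lab_votes': ['950', '1000', '700'],
})

parties = ['con', 'lib', 'lab']
vote_cols = [f'{party}_votes' for party in parties]

# Coerce each vote column to a numeric dtype; unparseable cells become NaN.
votes = elections[vote_cols].apply(pd.to_numeric, errors='coerce')

# Row-wise maximum gives the winning vote count (NaN cells are skipped).
elections['winning_votes'] = votes.max(axis=1)

# idxmax returns the column label of the maximum; drop the '_votes' suffix
# to leave just the party name.
elections['winning_party'] = (
    votes.idxmax(axis=1).str.replace('_votes', '', regex=False)
)

print(elections[['constituency', 'winning_party', 'winning_votes']])
```

If the real columns turn out to be numeric already, the cause lies elsewhere and this sketch would not apply.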