I am trying to sort a pandas dataframe. The data looks like-
year | state | district | Party | rank | share in votes |
---|---|---|---|---|---|
2010 | haryana | kaithal | Winner | 1 | 40.12 |
2010 | haryana | kaithal | bjp | 2 | 30.52 |
2010 | haryana | kaithal | NOTA | 3 | 29 |
2010 | goa | panji | Winner | 3 | 10 |
2010 | goa | panji | INC | 2 | 40 |
2010 | goa | panji | BJP | 1 | 50 |
2013 | up | meerut | Winner | 2 | 40 |
2013 | up | meerut | SP | 1 | 60 |
2015 | haryana | kaithal | Winner | 2 | 15 |
2015 | haryana | kaithal | BJP | 3 | 35 |
2015 | haryana | kaithal | INC | 1 | 50 |
This data is for multiple states for multiple years. In this dataset, there are multiple values for each district. I want to calculate the margin of share for each district in this manner. I have tried this, but not able to write fully. I am not able to write code for defining the margin of share and get a dataframe with only one (margin of share) value corresponding to each district instead of party wise shares.
for year in df['YEAR']:
for state in df['STATE']:
for district in df['DISTRICT']:
for rank in df['RANK']:
for party in df['PARTY']:
if rank==1 and party=='WINNER':
then margin of share =Share of Winner-Share of party at rank 2. If share WINNER does not have rank 1 then Margin of Share= share of winner - share of party at rank 1.
I am basically trying to get this output-
| year | state |district| margin of share|
|---------------|-------------|--------|----------------|
| 2010 | haryana |kaithal| 9.6|
| 2010 | goa |panji| -40|
| 2010 | up |kaithal| -20|
| 2015 | haryana |kaithal| -35|
I wish to have create a different data frame with columns Year, State, District and margin of SHARE.