I'd like to visualize election donation data from the FEC website. Specifically, I want to create a stacked bar chart with the state on the X-axis, the donated amount on the Y-axis, and one stack segment per candidate, showing how much each candidate received from each state.
Code:
import matplotlib.pyplot as plt
import pandas as pd
from pathlib import Path
pathName = r"R:\Downloads\indiv20\by_date"
dataDir = Path(pathName)
filename = "itcont_2020_20010425_20190425.txt"
fullName = dataDir / filename
# The FEC bulk files have no header row; header=None keeps the first record as data.
data = pd.read_csv(fullName, header=None, low_memory=False, sep="|", usecols=[0, 9, 12, 14])
data.columns = ['Filer ID', 'State', 'Occupation', 'Donation Amount ($)']
data = data.dropna(subset=['Donation Amount ($)'])
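# Sum only the numeric amount column rather than every column in the frame.
donations_by_state = data.groupby('State')[['Donation Amount ($)']].sum()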
plt.bar(donations_by_state.index, donations_by_state['Donation Amount ($)'])
plt.ylabel('Donation Amount ($)')
plt.xlabel('State')
plt.title('Donations per State')
plt.show()
This plots the total contributions per state and works great. However, when I group by both state and filer ID, I get the Series shown below, and I'm not sure how to plot a stacked bar chart from it:
donations_per_candidate_per_state = data.groupby(['State', 'Filer ID'])['Donation Amount ($)'].sum()
State  Filer ID
AA     C00005561      350
       C00010603      600
       C00042366      115
       C00309567     1675
       C00331694     2500
       C00365536      270
       C00401224     4495
       C00411330      100
       C00492991      300
       C00540500      300
       C00641381      250
       C00696948     2800
       C00697441      250
       C00699090       67
       C00703108     1400
AB     C00401224     1386
AE     C00000935      295
       C00003418      276
       C00010603     1750
       C00027466      320
       C00193433      105
       C00211037      251
       C00216614      226
       C00341396       20
       C00369033      150
       C00394957       50
       C00401224    26538
       C00438713       50
       C00457325      310
       C00492785      300
                      ...
ZZ     C00580100     1490
       C00603084       95
       C00607861      750
       C00608380      125
       C00618371     2199
       C00630665     1000
       C00632133      600
       C00632398      400
       C00639500      208
       C00639591     1450
       C00640623     6402
       C00653816     1000
       C00666149     1000
       C00666453     2800
       C00683102     1000
       C00689430     3524
       C00693234    13283
       C00693713     1000
       C00694018     2750
       C00694455    12761
       C00695510     1045
       C00696245      250
       C00696419     3000
       C00696526      500
       C00696948    31296
       C00697441    34396
       C00698050      350
       C00698258     2800
       C00699090     5757
       C00700732      475
Name: Donation Amount ($), Length: 32662, dtype: int64
The data seems to be tabulated the way I need; I'm just not sure how to plot it.
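For reference, here is a minimal sketch of one direction that might work: pivot the inner index level into columns with `unstack`, then let pandas draw the stacked bars. The toy frame, its filer IDs (`C001` etc.), its amounts, and the `top_n` cutoff are all invented for illustration and are not taken from the FEC file. The cutoff is only an assumption that a few thousand filers would make the legend unreadable.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for the real data; IDs and amounts are made up.
toy = pd.DataFrame({
    "State": ["CA", "CA", "CA", "NY", "NY", "TX"],
    "Filer ID": ["C001", "C002", "C001", "C001", "C003", "C002"],
    "Donation Amount ($)": [100, 250, 50, 300, 75, 500],
})

# Same two-level groupby as in the question: a Series with a (State, Filer ID) index.
per_state_filer = toy.groupby(["State", "Filer ID"])["Donation Amount ($)"].sum()

# Move "Filer ID" from the index into columns: one row per state,
# one column per filer, with 0 where a filer has no donations from a state.
wide = per_state_filer.unstack("Filer ID", fill_value=0)

# Hypothetical cutoff: keep only the filers with the largest overall totals.
top_n = 2
top_filers = wide.sum().nlargest(top_n).index
wide_top = wide[top_filers]

# Each column becomes one segment of the stack for every state.
ax = wide_top.plot(kind="bar", stacked=True, figsize=(10, 5))
ax.set_xlabel("State")
ax.set_ylabel("Donation Amount ($)")
ax.set_title("Donations per State by Filer")
```

With the real data, `per_state_filer` would be replaced by `donations_per_candidate_per_state`, and `plt.show()` called afterwards to display the figure.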