1

I'm working on putting together a budgeting program and the end result will be a bar chart with multiple bars that shows the total spent along 6 categories by date. So far, I've been able to clean up the data to show the date, category, and spend (pulling from a Google sheets workbook):

def process_data(worksheet):
   data = worksheet.get_all_records()
   all_cost = []
   for i in range(len(data)):
       date = data[i].get('Timestamp').split()
       cost = data[i].get("Cost")
       category = data[i].get("Category")
       newdict = {"Date": date[0], "Spend": cost, "Category":category}
       all_cost.append(newdict)
   df = pd.DataFrame(all_cost)

The output dataframe looks like this:

         Date   Spend               Category
0   11/25/2020   54.32                Grocery
1   11/25/2020   49.77               Projects
2   11/25/2020   34.22                    Gas
3   11/25/2020   46.00                Grocery
4   11/27/2020   37.00  Restaurant/Eating Out
5   11/27/2020   72.00                Grocery
6   11/27/2020  129.31               Projects
7   11/27/2020   32.00                  Bills
8   11/27/2020   69.42                Grocery
9   11/28/2020   12.69                    Gas
10  11/28/2020   69.69  Restaurant/Eating Out
11  11/28/2020    6.66                  Other
12  11/28/2020  444.21               Projects
13  11/29/2020   73.00                Grocery

Then I added:

new_df = df.groupby(["Date", "Category"])["Spend"].sum().reset_index()

in order to sum the total categorical spend per day:

     Date               Category   Spend
0   11/25/2020                    Gas   34.22
1   11/25/2020                Grocery  100.32
2   11/25/2020               Projects   49.77
3   11/27/2020                  Bills   32.00
4   11/27/2020                Grocery  141.42
5   11/27/2020               Projects  129.31
6   11/27/2020  Restaurant/Eating Out   37.00
7   11/28/2020                    Gas   12.69
8   11/28/2020                  Other    6.66
9   11/28/2020               Projects  444.21
10  11/28/2020  Restaurant/Eating Out   69.69

I want to add a bar chart with matplotlib to show the breakdown of dollars spent per category per day, however I'm not sure how to do this since the quantity of recorded categories is different depending on the day.

Would the best way to go about doing this be to append a row with "0" for each category with no spend? Or is there an easier way to plot this? Thanks!

Edit:

I realized that the way I had structured my dataframe was not ideal. I think it will be easier to achieve what I want to do by structuring the data as follows in the sample:

df_test = pd.DataFrame({"11/25/2020": {"Gas":34.22, "grocery": 69.68, "Bills":15, "Gas": 10}, "11/27/2020": {"Gas": 16, "Projects": 144, "Bills":20}})
print(df_test)

to achieve the outcome:

              11/25/2020  11/27/2020
Gas            10.00        16.0
grocery        69.68         NaN
Bills          15.00        20.0
Projects         NaN       144.0

However, I'm not quite sure how to iterate lines of all of my data to achieve the desired output. Additionally, I know that I can't use a dictionary for this because it will only be using the last set of key/value, while I want to sum the amounts for each category.

Should I use tuples? How would I iterate through each line of data and sum the tuple values for each category? Thanks!

jspitz
  • 11
  • 2
  • How do you want to plot this information? You could do subplots, one for each day, or you could group a bunch of bars for one day, then a bit of space, then the bars of the next day, etc. Or maybe some other way? And why not have all 6 categories for each day? That way you can visually compare more easily how much you spend on each across different days, even if one day it's zero. – Reti43 Nov 30 '20 at 00:33
  • I want to group all categories of bars for each day on the same plot. The plot in the first answer [here](https://stackoverflow.com/questions/59066811/how-can-a-plot-a-5-grouped-bars-bar-chart-in-matplotlib) is similar to how I want mine to look. This is where I'm stuck though, since for some days, some categories don't have entries since the value would be zero. Should I append rows to show the zeroes in the actual data, or is there a better way to plot only the categories present for that day? – jspitz Dec 01 '20 at 04:56

1 Answers1

0

In order to plot multiple matplotlib bar charts on the same plot, the ideal way is to make use of the concept of subplots. Shown below is a short example that I have worked on using random data but I am sure you can feed in your own dataset and plot likewise:

Code:

fig2, ax = plt.subplots(2,2,sharex='col',sharey='row')
x = np.arange(1,11)
y = np.arange(1,11)
z = np.random.randn(10)
 
#Subplot1
ax[0][0].plot(x,y,color='g',marker='p',label='axes1')
ax[0][0].set_title('Axes1')
ax[0][0].legend()
 
#Subplot2
ax[0][1].scatter(x,y,color='k',marker='^',label='axes2')
ax[0][1].set_title('Axes2')
ax[0][1].legend()
 
#Subplot3
ax[1][0].plot(x,z,marker='v',label='axes3')
ax[1][0].set_title('Axes3')
ax[1][0].legend()
 
#Subplot4
ax[1][1].scatter(x,z,marker='o',label='axes4')
ax[1][1].set_title('Axes4')
ax[1][1].legend()
 
plt.tight_layout()
plt.show()

Output;

Doing so you will realize that the subplots having varying lengths of data(see the x and y axes) in the output image are being plotted as follows:

Output Image

aryashah2k
  • 638
  • 1
  • 4
  • 16
  • Sorry - maybe I wasn't super clear. I want all categories of bars to be plotted on the same graph for each day, similar to the top answer [here](https://stackoverflow.com/questions/59066811/how-can-a-plot-a-5-grouped-bars-bar-chart-in-matplotlib). My problem is the data size differs each day based on which categories get logged, and I'm not sure how to go about plotting these all together. – jspitz Dec 01 '20 at 04:58