
I am trying to create a stacked bar plot. The code below does almost what I want: 12 x-values, one per month, each with a group of 4 bars, one per year. The one change I want is for each bar to be stacked, with one segment per PC value, colored from a color dictionary. The final plot should have the same number of bars in the same positions, but each bar should be a stacked version of the current one. Is this possible? If not, I would be glad to hear other ways to show how the PC counts change by month and year.

import pandas as pd
import random
import matplotlib.pyplot as plt

N=100
pc_col = [random.randint(1,7) for i in range(N)]
year_col = [random.randint(2020,2023) for i in range(N)]
month_col = [random.randint(1,12) for i in range(N)]

data = {'PC': pc_col, 'Year': year_col, 'Month': month_col}
data = pd.DataFrame(data)

counts = data.groupby(['Month', 'Year']).size().unstack()
counts.plot(kind='bar')

plt.xlabel('Month')
plt.ylabel('Count')
plt.title('Counts by Month and Year')

plt.show()

I have tried the stacked parameter with different groupby methods, but it does not do what I want. I also found "Pandas: Stacked bar plot with adjacent bars", but with that approach the x-tick labels are misplaced. I also found "Multiple stacked bar plot with pandas", but in my case I do not know in advance how many years the dataset will have.

TheEngineerProgrammer
  • What do you mean by "different colors coming from a color dictionary, one for each PC value"? The x-axis will contain 12 values for 12 months. The stacked bars will be a different color for each year (4 in above example for 2020-23). These 4 (or 5 or up to 7) can be assigned to each YEAR, not each PC value, as there is no correlation between the bars and the values. Is that what you are trying to do? – Redox Apr 06 '23 at 11:40
  • @Redox If, for example, January 2023 has 1 row with PC_val=1, 3 rows with PC_val=2 and 7 rows with PC_val=3, then the bar for January 2023 would have a colored box of height 1 for the first value, another of height 3 for the second, and another of height 7 for the third. These are stacked on top of each other to form the full bar, with a total height of 11. If you run the code I provided it makes more sense. It is the same plot, except each bar is made of colored boxes, one per PC value (stacked) – Warehouse_Worker Apr 06 '23 at 12:09
  • So you are expecting 12 x-axis entries, one per month, with 4 bars for each year (as currently shown), AND each of those 4 bars shown as a stacked bar, split and colored by PC count. Is that what you are expecting? – Redox Apr 06 '23 at 12:53
  • @Redox yes pretty much – Warehouse_Worker Apr 06 '23 at 13:21

1 Answer

I am not totally sure, but see if this is what you are looking for...

  1. Put the whole bar chart in a for loop, with one iteration per year
  2. While grouping, use PC along with Month and Year
  3. Select one year with xs, which drops Year from the index (level 1)
  4. Plot this bar chart (one plot call per year, all on the same axes)
  5. Adjust the bar width and position based on 1/number of years
  6. Read the legend handles and labels on the first iteration and rebuild the legend from just those (no repeats)

I think I am assuming that the first year has data for all PC values, but other than that, it should work. Hope this helps...
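To see what steps 2 and 3 produce, here is a minimal sketch on hand-made rows that match the January 2023 example from the comments (1, 3 and 7 rows for PC values 1, 2 and 3). The extra 2022 rows and the name table_2023 are only there for illustration.

```python
import pandas as pd

# Hand-made rows as (Month, Year, PC): the January 2023 example from the
# comments, i.e. 1 row with PC=1, 3 rows with PC=2, 7 rows with PC=3.
rows = [(1, 2023, 1)] * 1 + [(1, 2023, 2)] * 3 + [(1, 2023, 3)] * 7
# Two rows for another year, so that selecting 2023 actually filters something.
rows += [(1, 2022, 1)] * 2
data = pd.DataFrame(rows, columns=['Month', 'Year', 'PC'])

# Step 2: count rows per (Month, Year, PC).
counts = data.groupby(['Month', 'Year', 'PC']).size()

# Step 3: keep one year (xs drops the Year level), then pivot PC into columns.
# Each column becomes one stacked segment; each row is one month.
table_2023 = counts.xs(2023, level='Year').unstack(fill_value=0)
print(table_2023)
```

Each row of this table holds the segment heights of one bar, so the January 2023 bar stacks up to a total height of 11.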

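import random
import pandas as pd
import matplotlib.pyplot as plt

N = 100
pc_col = [random.randint(1, 7) for i in range(N)]
year_col = [random.randint(2020, 2023) for i in range(N)]
month_col = [random.randint(1, 12) for i in range(N)]
# One color per PC value 1..7 (assigned by column position)
color = ['red', 'green', 'blue', 'yellow', 'black', 'orange', 'magenta']
data = {'PC': pc_col, 'Year': year_col, 'Month': month_col}
data = pd.DataFrame(data)
fig, ax = plt.subplots()
# Split the width reserved for each month among the years
barwidth = 0.5 / data.Year.nunique()
for i, yr in enumerate(sorted(data.Year.unique())):
    # Count rows per (Month, Year, PC), keep one year, pivot PC into columns
    yearly = data.groupby(['Month', 'Year', 'PC']).size().xs(yr, level=1, drop_level=True).unstack(fill_value=0)
    yearly.plot(ax=ax, kind='bar', stacked=True, color=color, width=barwidth, position=i)
    if i == 0:
        # Keep the legend entries from the first year only
        h, l = ax.get_legend_handles_labels()

ax.get_legend().remove()  # drop the legend with repeated entries
ax.legend(h, l)
plt.show()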
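If some year has no rows for a given month or PC value, the per-year tables can end up with different shapes. Below is a rough sketch of one way to guard against that, not a definitive implementation: it reindexes every year's table to all 12 months and all PC values, and draws the bars with matplotlib's ax.bar so positions, ticks and colors are set explicitly. The function name plot_grouped_stacked, the group_width parameter and the pc_colors dictionary are my own assumptions for illustration; any PC value missing from pc_colors is left out of the plot.

```python
import random

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_grouped_stacked(data, pc_colors, group_width=0.8):
    """Draw one stacked bar per (month, year); return the Axes.

    pc_colors is a hypothetical {PC value: color} dictionary.
    """
    years = sorted(data['Year'].unique())
    months = list(range(1, 13))
    pcs = sorted(pc_colors)
    counts = data.groupby(['Month', 'Year', 'PC']).size()
    bar_width = group_width / len(years)

    fig, ax = plt.subplots()
    for i, yr in enumerate(years):
        # One table per year: rows are months, columns are PC values.
        # Reindexing fills absent months / PC values with zero counts.
        table = (counts.xs(yr, level='Year')
                       .unstack(fill_value=0)
                       .reindex(index=months, columns=pcs, fill_value=0))
        # Bar centers: month groups sit on ticks 0..11, years go left to right.
        x = np.arange(12) - group_width / 2 + (i + 0.5) * bar_width
        bottom = np.zeros(12)
        for pc in pcs:
            heights = table[pc].to_numpy()
            # Label only the first year's segments so the legend has no repeats.
            label = f'PC {pc}' if i == 0 else '_nolegend_'
            ax.bar(x, heights, width=bar_width, bottom=bottom,
                   color=pc_colors[pc], edgecolor='white', label=label)
            bottom = bottom + heights

    ax.set_xticks(np.arange(12))
    ax.set_xticklabels(months)
    ax.set_xlabel('Month')
    ax.set_ylabel('Count')
    ax.set_title('Counts by Month and Year, stacked by PC')
    ax.legend(title='PC')
    return ax


# Usage with random data shaped like the question's.
N = 100
data = pd.DataFrame({
    'PC': [random.randint(1, 7) for _ in range(N)],
    'Year': [random.randint(2020, 2023) for _ in range(N)],
    'Month': [random.randint(1, 12) for _ in range(N)],
})
pc_colors = {1: 'red', 2: 'green', 3: 'blue', 4: 'yellow',
             5: 'black', 6: 'orange', 7: 'magenta'}
ax = plot_grouped_stacked(data, pc_colors)
plt.show()
```

The number of years is read from the data, so the sketch does not need to know it in advance. The legend shows which color is which PC value, but not which bar in a group is which year; within each group the years run left to right in ascending order.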


Redox