I am trying to plot the percentage distribution across the columns of a couple of dataframes as 100% stacked bar charts. Doing it manually, I get the result I am looking for:
import pandas as pd
import matplotlib.pyplot as plt

# Create a dataframe
r = [0,1,2,3,4]
raw_data = {'greenBars': [20, 1.5, 7, 10, 5], 'orangeBars': [5, 15, 5, 10, 15],'blueBars': [2, 15, 18, 5, 10]}
df = pd.DataFrame(raw_data)
# From raw value to percentage
totals = list(df.sum(axis=1))
greenBars = [i / j * 100 for i,j in zip(df['greenBars'], totals)]
orangeBars = [i / j * 100 for i,j in zip(df['orangeBars'], totals)]
blueBars = [i / j * 100 for i,j in zip(df['blueBars'], totals)]
# plot
barWidth = 0.85
names = ('A','B','C','D','E')
# Create green Bars
plt.bar(df.index, greenBars, color='#b5ffb9', edgecolor='white', width=barWidth, label="group A")
# Create orange Bars
plt.bar(r, orangeBars, bottom=greenBars, color='#f9bc86', edgecolor='white', width=barWidth, label="group B")
# Create blue Bars
plt.bar(r, blueBars, bottom=[i+j for i,j in zip(greenBars, orangeBars)], color='#a3acff', edgecolor='white', width=barWidth, label="group C")
# Custom x axis
plt.xticks(r, names)
plt.xlabel("group")
# Add a legend
plt.legend(loc='upper left', bbox_to_anchor=(1,1), ncol=1)
# Show graphic
plt.show()
However, I have to do this for multiple dataframes with more than just a few columns, so I would like to turn it into a loop. I have been able to draw the first bar completely, but the other bars are incomplete with this code:
# Same df as above
for column in df:
    placeholder = [i / j * 100 for i, j in zip(df[column], totals)]
    print(f'placeholder of {column}')
    print(placeholder)
    barWidth = 0.85
    names = ('A','B','C','D','E')
    # Create the bars for this column
    plt.bar(df.index, placeholder, edgecolor='Black', width=barWidth, label=f"{column}")
Does anyone know how to fix this?
I tried writing the loop myself, but the bars kept coming out incomplete.
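One likely cause is that the loop never passes `bottom` to `plt.bar`, so every column is drawn from zero and covers part of the previous one. Below is a minimal sketch of a loop that keeps a running `bottom`, assuming every row has a non-zero total. The helper names `percent_layers` and `plot_percent_stacked` are hypothetical and only for illustration; they are not part of pandas or matplotlib.

```python
def percent_layers(data):
    """Yield (column, heights, bottoms) for a 100% stacked bar chart.

    `data` can be a dict of equal-length lists or a pandas DataFrame;
    both support `for column in data` and `data[column]`.
    Assumes no row sums to zero.
    """
    columns = list(data)
    # Row totals: one value per bar
    totals = [sum(row) for row in zip(*(data[c] for c in columns))]
    # Running height of each bar so far
    bottoms = [0.0] * len(totals)
    for column in columns:
        heights = [v / t * 100 for v, t in zip(data[column], totals)]
        yield column, heights, list(bottoms)
        # The next column starts where this one ended
        bottoms = [b + h for b, h in zip(bottoms, heights)]


def plot_percent_stacked(df, bar_width=0.85):
    """Draw every column of `df` as one layer of a 100% stacked bar chart."""
    # Imported here so percent_layers itself has no dependencies
    import matplotlib.pyplot as plt

    for column, heights, bottoms in percent_layers(df):
        plt.bar(df.index, heights, bottom=bottoms, edgecolor='black',
                width=bar_width, label=column)
    plt.legend(loc='upper left', bbox_to_anchor=(1, 1), ncol=1)
    plt.show()
```

With the dataframe from above, `plot_percent_stacked(df)` should draw one layer per column regardless of how many columns there are. pandas may also be able to do this directly with `df.div(df.sum(axis=1), axis=0).mul(100).plot.bar(stacked=True)`, which avoids the manual loop.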