I have a small DataFrame with only two columns: "Gender", whose values are "Male" and "Female", and "MaritalStatus", whose values are "Single", "Married" and "Divorced". The data distribution is summarized below:

    Gender  MaritalStatus   Tot.
    Male    Single          225
    Male    Married         296
    Male    Divorced        143
    Female  Single          137
    Female  Married         222
    Female  Divorced        94
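
For a reproducible starting point, a DataFrame with this distribution can be rebuilt from the summary counts. This is only a sketch: the name `counts` and the row-repetition step are assumptions made for illustration, since the original `df` is not shown.

```python
import pandas as pd

# Summary counts copied from the table above
counts = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Female", "Female", "Female"],
    "MaritalStatus": ["Single", "Married", "Divorced"] * 2,
    "Tot.": [225, 296, 143, 137, 222, 94],
})

# Repeat each summary row "Tot." times to get one row per person
# (an illustrative reconstruction, not the asker's actual data)
df = (
    counts.loc[counts.index.repeat(counts["Tot."]), ["Gender", "MaritalStatus"]]
    .reset_index(drop=True)
)

# Cross-tabulating the rebuilt rows gives back the counts in the table
print(pd.crosstab(df["Gender"], df["MaritalStatus"]))
```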

With the following code I am able to draw a stacked bar plot:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

pclass_xt = pd.crosstab(df["Gender"], df["MaritalStatus"]) 
pclass_xt.plot(kind='bar', stacked=True)
plt.xlabel("Gender")
plt.ylabel("count")
plt.xticks(rotation=0)
plt.show()

Here is my output:

[image: plot output]

I'd like to add the total above each stacked bar and the percentage for each segment of the stacked bar chart. Thanks for the help.

  • There is a very detailed, involved approach here: https://www.w3resource.com/graphics/matplotlib/barchart/matplotlib-barchart-exercise-16.php – sotmot Nov 07 '20 at 10:30
  • The output does not correspond to the code sample you have given. – Patrick FitzGerald Jan 05 '21 at 20:47
  • Does this answer your question? [Display totals and percentage in stacked bar chart using DataFrame.plot](https://stackoverflow.com/questions/51495982/display-totals-and-percentage-in-stacked-bar-chart-using-dataframe-plot) – Patrick FitzGerald Jan 05 '21 at 21:20

1 Answer

The graph can be obtained by drawing the bars with matplotlib's bar function and then adding the labels as text. The code for the plot is as follows (I have assumed that the data is stored in a data.csv file):

import pandas as pd
import matplotlib.pyplot as plt

# Assumes data.csv holds one row per person with "Gender" and "MaritalStatus" columns
df = pd.read_csv('data.csv')

# Count each Gender/MaritalStatus combination (rows: Gender, columns: MaritalStatus).
# groupby(...).count() returns no columns when the only two columns are the group keys,
# so crosstab is used instead.
pclass_xt = pd.crosstab(df["Gender"], df["MaritalStatus"])
print(pclass_xt)

# Get values from the groups and categories
groups = ['Female', 'Male']
divorced = pclass_xt.loc[groups, 'Divorced'].to_numpy()
married = pclass_xt.loc[groups, 'Married'].to_numpy()
single = pclass_xt.loc[groups, 'Single'].to_numpy()

# Add colors
colors = ['#FF9999', '#00BFFF', '#C1FFC1']

# The position of the bars on the x-axis
r = range(len(groups))
barWidth = 1

# Plot bars, stacking each category on top of the previous ones
plt.figure(figsize=(10, 7))
ax1 = plt.bar(r, divorced, color=colors[0], edgecolor='white', width=barWidth, label="divorced")
ax2 = plt.bar(r, married, bottom=divorced, color=colors[1], edgecolor='white', width=barWidth, label='married')
ax3 = plt.bar(r, single, bottom=divorced + married, color=colors[2], edgecolor='white', width=barWidth, label='single')
plt.legend()

# Custom X axis
plt.xticks(r, groups, fontweight='bold')
plt.ylabel("Count")

# Label the middle of each segment with its share of the whole bar (as a fraction)
for r1, r2, r3 in zip(ax1, ax2, ax3):
    h1 = r1.get_height()
    h2 = r2.get_height()
    h3 = r3.get_height()
    total = h1 + h2 + h3
    x = r1.get_x() + r1.get_width() / 2.
    plt.text(x, h1 / 2., "%.2f" % (h1 / total), ha="center", va="center", color="white", fontsize=16, fontweight="bold")
    plt.text(x, h1 + h2 / 2., "%.2f" % (h2 / total), ha="center", va="center", color="white", fontsize=16, fontweight="bold")
    plt.text(x, h1 + h2 + h3 / 2., "%.2f" % (h3 / total), ha="center", va="center", color="white", fontsize=16, fontweight="bold")
plt.show()

The plot obtained is as follows: [image: resulting stacked bar chart]

The code was inspired by https://medium.com/@priteshbgohil/stacked-bar-chart-in-python-ddc0781f7d5f
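
The question also asks for the total above each bar and for percentages, while the code above labels each segment with a fraction and draws no totals. Below is a minimal sketch of one way to add both on top of `DataFrame.plot`, assuming a crosstab laid out as in the question (rows: Gender, columns: MaritalStatus). The helper name `plot_stacked_with_labels` and the hard-coded counts are illustrative only and not part of the original answer.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_stacked_with_labels(xt):
    """Stacked bar plot of a crosstab with per-segment percentages and bar totals."""
    ax = xt.plot(kind="bar", stacked=True, rot=0, figsize=(10, 7))
    totals = xt.sum(axis=1)                # one total per bar (row of the crosstab)
    shares = xt.div(totals, axis=0) * 100  # each segment as a percentage of its bar
    bottoms = xt.cumsum(axis=1) - xt       # height at which each segment starts

    # pandas draws the bars of a categorical index at x = 0, 1, 2, ...
    for x, group in enumerate(xt.index):
        for status in xt.columns:
            count = xt.loc[group, status]
            if count == 0:
                continue  # nothing to label for an empty segment
            y = bottoms.loc[group, status] + count / 2
            ax.text(x, y, f"{shares.loc[group, status]:.1f}%",
                    ha="center", va="center")
        # Total just above the top of the bar
        ax.text(x, totals.loc[group], str(totals.loc[group]),
                ha="center", va="bottom", fontweight="bold")
    ax.set_ylabel("count")
    return ax


# Counts taken from the table in the question (hard-coded for illustration)
xt = pd.DataFrame(
    {"Divorced": [94, 143], "Married": [222, 296], "Single": [137, 225]},
    index=pd.Index(["Female", "Male"], name="Gender"),
)
ax = plot_stacked_with_labels(xt)
```

Calling `plt.show()` afterwards displays the figure. With real data, `xt` would presumably come from `pd.crosstab(df["Gender"], df["MaritalStatus"])` rather than from hard-coded numbers.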

sotmot