
My data frame looks like this:

Airport  ATA Cost  Destination Handling  Custom  Total Cost
PRG        599222                 11095   20174      630491
LXU        364715                 11598   11595      387908
AMS        401382                 23562   16680      441623
PRG        599222                 11095   20174      630491

The code below produces a stacked bar chart:

import pandas as pd

# read the sample markdown table on this page with
df = pd.read_html('https://stackoverflow.com/q/51495982/7758804')[0]

# plot columns without Total Cost
df.iloc[:, :-1].plot(x='Airport', kind='barh', stacked=True, title='Breakdown of Costs', mark_right=True)    


How can I add the totals (with thousands separators, e.g. 1,000) at the end of each stacked bar? And how can I add the % for each segment of the stacked bar chart?

Trenton McKinney
Jack

2 Answers


You can use plt.text to place each label at a position computed from your data.

However, if some bars are very small, the labels might need some tweaking to look right.

import matplotlib.pyplot as plt
import numpy as np

df_total = df['Total Cost']
df = df.iloc[:, 0:4]
df.plot(x='Airport', kind='barh', stacked=True, title='Breakdown of Costs', mark_right=True)

# share of each cost column in the row total, in percent
df_rel = df[df.columns[1:]].div(df_total, 0) * 100

for n in df_rel:
    # cs: cumulative end of the segment, ab: segment width, pc: percent, tot: row total
    for i, (cs, ab, pc, tot) in enumerate(zip(df.iloc[:, 1:].cumsum(1)[n], df[n], df_rel[n], df_total)):
        plt.text(tot, i, str(tot), va='center')
        plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center')
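The question also asks for totals separated by thousands, which str(tot) does not provide. A minimal sketch of one way to get that, using Python's built-in format specification; the helper name format_total is hypothetical and not part of the answer's code:

```python
def format_total(value):
    """Return a total with comma thousands separators (hypothetical helper)."""
    return f'{value:,.0f}'

# totals taken from the question's table
totals = [630491, 387908, 441623]
labels = [format_total(t) for t in totals]
print(labels)  # ['630,491', '387,908', '441,623']
```

In the loop above, passing format_total(tot) to plt.text in place of str(tot) should then draw the separated value.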


EDIT: Some arbitrary ideas for better readability:

shift the total values to the right, use 45° rotated text:

    plt.text(tot+10000, i, str(tot), va='center')
    plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center', rotation=45)


switch between top- and bottom-aligned text:

va = ['top', 'bottom']
va_idx = 0
for n in df_rel:
    va_idx = 1 - va_idx
    for i, (cs, ab, pc, tot) in enumerate(zip(df.iloc[:, 1:].cumsum(1)[n], df[n], df_rel[n], df_total)):
        plt.text(tot+10000, i, str(tot), va='center')
        plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va=va[va_idx], ha='center')


label only bars with 10% or more:

if pc >= 10:
    plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center')


...or still print them, but vertical:

if pc >= 10:
    plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center')
else:
    plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center', rotation=90)
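The threshold check and the integer rounding suggested in the comments can be folded into one small function. This is only a sketch: the name percent_label and its threshold and decimals parameters are assumptions made for illustration, not something from the answer.

```python
def percent_label(pc, threshold=10, decimals=0):
    """Return the label text for a segment, or '' when its share is below threshold.

    pc is the segment's share in percent; threshold and decimals are
    illustrative parameters (hypothetical, not from the original answer).
    """
    if pc < threshold:
        return ''  # an empty string draws nothing, so small segments stay unlabeled
    return f'{pc:.{decimals}f}%'

print(percent_label(95.04))              # 95%
print(percent_label(1.76))               # (empty line)
print(percent_label(12.34, decimals=1))  # 12.3%
```

Inside the inner loop, plt.text(cs - ab/2, i, percent_label(pc), va='center', ha='center') would then replace the if block.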


SpghttCd
  • Thank you, it works very well. Is there a way to keep the text labels from overlapping each other? – Jack Jul 25 '18 at 11:02
  • I don't know of an automatic way, but perhaps rotating the text by 45 degrees might help as a first approach? – SpghttCd Jul 25 '18 at 14:22
  • If I don't want to plot numbers < 0.1, how should I adjust the code? – Jack Jul 26 '18 at 15:39
  • What do you mean? Do you want to drop the digits to the right of the decimal point and print only integer values? – SpghttCd Jul 26 '18 at 15:43
  • I mean I want to hide the text labels that overlap each other. It would also be good to show the % with 0 decimal places. – Jack Jul 26 '18 at 15:47
  • You could change `str(np.round(pc, 1))` to `str(int(np.round(pc, 0)))` – SpghttCd Jul 26 '18 at 15:49
  • The more important thing for me here is to hide the small % labels that overlap each other. – Jack Jul 26 '18 at 15:51
  • Honestly, the easiest and most effective way would be simply to plot a (vertical) bar chart instead of an hbar chart. – SpghttCd Jul 26 '18 at 15:54
  • I have to plot horizontally because I have a big chart. Can we plot the text conditionally (i.e. only if the % is greater than 10%)? – Jack Jul 26 '18 at 15:56
  • You could even still label the smaller values too, but rotated vertically so that they fit better... – SpghttCd Jul 26 '18 at 16:04

Data and Imports

import pandas as pd

# load the data and set the x-axis column as the index
df = pd.read_html('https://stackoverflow.com/q/51495982/7758804')[0].set_index('Airport')

# calculate the percent for each row
per = df.iloc[:, :-1].div(df['Total Cost'], axis=0).mul(100).round(2)
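Because pd.read_html depends on the page being reachable and unchanged, it can help to build the same frame inline. The sketch below assumes the values shown in the question's table are the ones on the linked page, and repeats the percent calculation on them:

```python
import pandas as pd

# values copied from the question's table (assumed to match the linked page)
df = pd.DataFrame({
    'Airport': ['PRG', 'LXU', 'AMS', 'PRG'],
    'ATA Cost': [599222, 364715, 401382, 599222],
    'Destination Handling': [11095, 11598, 23562, 11095],
    'Custom': [20174, 11595, 16680, 20174],
    'Total Cost': [630491, 387908, 441623, 630491],
}).set_index('Airport')

# divide each cost column by its row total, then scale to percent
per = df.iloc[:, :-1].div(df['Total Cost'], axis=0).mul(100).round(2)
print(per)
```

Each row of per holds the share of every cost column in that row's Total Cost, so the plotting code that follows can look up labels by column name.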

Horizontal Bars

# plot
ax = df.iloc[:, :-1].plot(kind='barh', stacked=True, figsize=(10, 6))

ax.legend(bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)

# iterate through the containers
for c in ax.containers:
    
    # get the current segment label (a string); corresponds to column / legend
    label = c.get_label()
    
    # create custom labels with percent
    labels = per[label].astype(str) + '%'
    
    # add the annotation
    ax.bar_label(c, labels=labels, label_type='center', rotation=-90, fontsize=7)

# annotate the end of each bar with the total (the end-point of the last segment)
_ = ax.bar_label(ax.containers[-1], label_type='edge', rotation=-90)
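To show the totals with thousands separators, one option is to build the label strings yourself and pass them through the labels parameter of ax.bar_label (available in matplotlib 3.4 and later). A minimal sketch, assuming an inline copy of the question's table and the non-interactive Agg backend so that it runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: no display available; drop this line for interactive use
import pandas as pd

# values copied from the question's table (assumed to match the linked page)
df = pd.DataFrame({
    'Airport': ['PRG', 'LXU', 'AMS', 'PRG'],
    'ATA Cost': [599222, 364715, 401382, 599222],
    'Destination Handling': [11095, 11598, 23562, 11095],
    'Custom': [20174, 11595, 16680, 20174],
    'Total Cost': [630491, 387908, 441623, 630491],
}).set_index('Airport')

ax = df.iloc[:, :-1].plot(kind='barh', stacked=True, figsize=(10, 6))

# format the totals ourselves so they carry comma separators
total_labels = [f'{int(v):,}' for v in df['Total Cost']]

# the last container holds the outermost segments, so its edge is the end of each bar
texts = ax.bar_label(ax.containers[-1], labels=total_labels, label_type='edge', padding=3)
```

Passing explicit labels avoids relying on the fmt parameter, whose supported format styles differ between matplotlib versions.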


Vertical Bars

ax = df.iloc[:, :-1].plot(kind='bar', stacked=True, figsize=(10, 6), rot=0)

ax.legend(bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)

# iterate through the containers
for c in ax.containers:
    
    # get the current segment label (a string); corresponds to column / legend
    label = c.get_label()
    
    # create custom labels with percent
    labels = per[label].astype(str) + '%'
    
    # add the annotation
    ax.bar_label(c, labels=labels, label_type='center', fontsize=7)

# annotate the top of each bar with the total (the end-point of the last segment)
_ = ax.bar_label(ax.containers[-1], label_type='edge')


Vertical Bars - Not Stacked

  • It is typically better not to stack, because the relative sizes of the segments are easier to compare.

# plot
ax = df.iloc[:, :-1].plot(kind='bar', figsize=(10, 6), rot=0, width=0.85)

ax.legend(bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)

# iterate through the containers
for c in ax.containers:
    
    # get the current segment label (a string); corresponds to column / legend
    label = c.get_label()
    
    # create custom labels with percent
    labels = per[label].astype(str) + '%'
    
    # add the percent
    ax.bar_label(c, labels=labels, label_type='center', fontsize=7)

    # add the count to the top of the bar
    _ = ax.bar_label(c, label_type='edge')


Trenton McKinney