After a long and fruitless search I solved this myself, so I'm posting the question here and the answer immediately below.

The goal: plot percentages but annotate the bars with raw counts. The problem: annotating bars with the plotted data (in my case, percentages) is easy, because you can iterate over the bar container returned by ax.bar() and call get_height() on each bar to get the text. However, if you want to annotate with something else, you need to iterate through separate annotation data at the same time and supply that as the annotation text instead. My first attempt failed because the separate annotation data, despite being sorted, was assigned to the bars completely out of order (I'd love it if anyone can tell me why):

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

label_freqs = {'Red': 8, 'Orange': 2, 'Yellow': 4, 'Green': 7, 'Blue': 1, 'Indigo': 6, 'Violet': 5}

df = pd.DataFrame(columns=('Colour', 'Frequency', 'Percentage'))
total = sum(label_freqs.values())
df['Colour'] = label_freqs.keys()
df['Frequency'] = [int(val) for val in label_freqs.values()]
df['Percentage'] = [round((int(val)/total)*100, ndigits=2) for val in label_freqs.values()]
df = df.sort_values(by='Frequency', ascending=False)

   Colour  Frequency  Percentage
0     Red          8       24.24
3   Green          7       21.21
5  Indigo          6       18.18
6  Violet          5       15.15
2  Yellow          4       12.12
1  Orange          2        6.06
4    Blue          1        3.03

def autolabel(my_bar, raw_freqs):
    """Attach a text label above each bar in *my_bar*, showing the matching entry of *raw_freqs*."""
    i = 0
    for point in my_bar:
        height = point.get_height()
        ax.annotate('{}'.format(raw_freqs[i]),
                    xy=(point.get_x() + point.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom', rotation=90)
        i += 1

The solution I found is to zip the bar container and the annotation data together and iterate over the pairs. See the answer below.
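A likely explanation for the scrambled labels, assuming the first version was called as autolabel(ax_bar, df['Frequency']) (the call itself isn't shown above): after sort_values the DataFrame keeps its original index (0, 3, 5, 6, ...), and raw_freqs[i] on a Series with an integer index looks up the *label* i, not the i-th row. The bars are drawn in sorted order, but the labels come back in the original dictionary order. The sketch below reproduces this without any plotting; the names by_label and by_position are just for illustration.

```python
import pandas as pd

label_freqs = {'Red': 8, 'Orange': 2, 'Yellow': 4, 'Green': 7,
               'Blue': 1, 'Indigo': 6, 'Violet': 5}

df = pd.DataFrame({'Colour': list(label_freqs),
                   'Frequency': list(label_freqs.values())})
df = df.sort_values(by='Frequency', ascending=False)  # index is now 0, 3, 5, 6, 2, 1, 4

freqs = df['Frequency']

# What the counter-based loop does: freqs[i] is a label lookup on the index,
# so it returns the values in the original (unsorted) order.
by_label = [int(freqs[i]) for i in range(len(freqs))]

# Positional access follows the sorted row order, as does plain iteration
# (which is what zip relies on).
by_position = [int(freqs.iloc[i]) for i in range(len(freqs))]
by_iteration = [int(f) for f in freqs]

print(by_label)     # original dictionary order
print(by_position)  # sorted order, matching the bars
```

If this is indeed the cause, then besides zip, using raw_freqs.iloc[i], or calling df.reset_index(drop=True) after sorting, should also line the labels up with the bars.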

Trenton McKinney
KMunro

2 Answers

Here is the solution:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Set the style before creating the figure, otherwise it does not affect these axes.
# On matplotlib < 3.6 this style is named 'seaborn-darkgrid'.
plt.style.use('seaborn-v0_8-darkgrid')
fig, ax = plt.subplots()

x_pos = np.arange(len(df))  # df is the sorted DataFrame from the question
ax_bar = ax.bar(x_pos, df['Percentage'], alpha=0.2)

ax.set_title('Colour Frequencies', fontsize=12, fontweight=0)
ax.set_xticks(x_pos)
ax.set_xticklabels(df['Colour'])
for tick in ax.get_xticklabels():
    tick.set_rotation(90)
ax.set_ylabel("Frequency in Percent")

def autolabel(my_bar, raw_freqs):
    """Attach a text label above each bar in *my_bar*, showing the matching entry of *raw_freqs*."""
    # zip pairs bars and frequencies by position, regardless of the DataFrame index
    for point, freq in zip(my_bar, raw_freqs):
        height = point.get_height()
        ax.annotate('{}'.format(freq),
                    xy=(point.get_x() + point.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom', rotation=90)


autolabel(ax_bar, df['Frequency'])
plt.tight_layout()
plt.show()
plt.close()

Trenton McKinney
KMunro
  • From matplotlib 3.4, use Axes.bar_label (also exposed as matplotlib.pyplot.bar_label) and pass the 'Frequency' column to the labels= parameter.
    • See this answer for a thorough explanation and additional examples.
ax = df.plot(kind='bar', x='Colour', y='Percentage', rot=0, legend=False, xlabel='', alpha=0.2)
ax.set_title('Colour Frequencies', fontsize=12, fontweight=0)
ax.set_ylabel("Frequency in Percent")

ax.bar_label(ax.containers[0], labels=df.Frequency, rotation=90, padding=3)
plt.show()
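
The snippet above relies on df and the imports from the question. For anyone who wants to run it in isolation, here is a self-contained sketch under a few assumptions of mine: it rebuilds df from the question's data, uses ax.bar directly instead of df.plot, converts the counts to strings before passing them as labels, and selects the headless 'Agg' backend so it runs without a display. bar_label pairs the labels with the bars by position and returns the Text objects it creates, which makes the pairing easy to check.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs anywhere; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

label_freqs = {'Red': 8, 'Orange': 2, 'Yellow': 4, 'Green': 7,
               'Blue': 1, 'Indigo': 6, 'Violet': 5}

df = pd.DataFrame({'Colour': list(label_freqs),
                   'Frequency': list(label_freqs.values())})
df['Percentage'] = (df['Frequency'] / df['Frequency'].sum() * 100).round(2)
df = df.sort_values(by='Frequency', ascending=False)

fig, ax = plt.subplots()
bars = ax.bar(df['Colour'], df['Percentage'], alpha=0.2)  # bar heights are percentages
ax.set_title('Colour Frequencies')
ax.set_ylabel('Frequency in Percent')

# Annotate each bar with its raw count; labels are matched to bars by position.
texts = ax.bar_label(bars, labels=[str(f) for f in df['Frequency']],
                     rotation=90, padding=3)

label_texts = [t.get_text() for t in texts]
print(label_texts)

plt.close(fig)
```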

Trenton McKinney