I'm trying to create a stacked bar chart in Matplotlib with two distinct x-labels, one on top and one on the bottom. The upper one is supposed to have a bounding box with the same width as the bars themselves.

[Image: plot that's not quite right]

This is how I create the labels:

plt.tick_params(axis="both", left=False, bottom=False, labelleft=False)
plt.xticks(ind, diagram.keys())
ax.set_frame_on(False)

for label, x in zip([q[1] for q in diagram.values()], ind):
    ax.text(
        x, 1.05, '{:4.0%}'.format(label), 
        ha="center", va="center", 
        bbox={"facecolor": "blue", "pad": 3}
    )

`diagram` is a dictionary of the form `{bottom-label: [[contents], top-label]}`.

So I guess my question boils down to: How do I manipulate the bounding boxes of the text objects?

Thanks a lot!

As per request, a runnable example:

import matplotlib.pyplot as plt
import numpy as np


def stacked_bar_chart(
        diagram, title="example question", img_name="test_image", width=0.7, clusters=None, show_axes=True,
        show_legend=True, show_score=True):
    """
    Builds one or multiple scaled stacked bar charts for grade
    distributions. saves image as png.
    :param show_score: whether the score should be shown on top
    :param show_legend: whether the legend should be shown
    :param show_axes: whether question name should be shown on bottom
    :param clusters: indices of clusters to be displayed.
    :param width: the width of the bars as fraction of available space
    :param title: diagram title
    :param img_name: output path
    :param diagram: dictionary: {x-label: [[grade distribution], score]}
    :return: nothing.
    """

    grades = {
        "sehr gut":     "#357100",
        "gut":          "#7fb96a",
        "befriedigend": "#fdd902",
        "ausreichend":  "#f18d04",
        "mangelhaft":   "#e3540e",
        "ungenügend":   "#882d23"
    }

    # select clusters
    if clusters is not None:
        diagram = {i: diagram[i] for i in clusters}

    # normalize score distribution => sum of votes = 1.0
    normalized = []
    for question in diagram.values():
        s = sum(question[0])
        normalized.append([x / s for x in question[0]])

    # transpose dict values (score distributions) to list of lists
    transformed = list(map(list, zip(*normalized)))

    # input values for diagram generation
    n = len(diagram)  # number of columns
    ind = np.arange(n)  # x values for bar center
    base = [0] * n  # lower bounds for individual color set
    bars = []
    fig, ax = plt.subplots()

    # loop over grades
    for name, grade in zip(grades.keys(), transformed):
        assert len(grade) == n, \
            "something went wrong in plotting grade stack " + img_name
        bar = plt.bar(ind, grade, width=width, color=grades[name], bottom=base)
        bars.append(bar)

        # loop over bars
        for i, (rect, score) in enumerate(zip(bar, grade)):
            # update lower bound for next bar section
            base[i] += grade[i]
            # label with percentage
            # TODO text color white
            ax.text(
                rect.get_x() + width / 2, rect.get_height() / 2 + rect.get_y(), "{0:.0f}%".format(score * 100),
                va="center", ha="center")

    # label diagram

    plt.suptitle(title)
    if show_axes:
        plt.tick_params(axis="both", left=False, bottom=False, labelleft=False)
        plt.xticks(ind, diagram.keys())
        ax.set_frame_on(False)

    else:
        plt.tick_params(axis="both", left=False, bottom=False, labelleft=False, labelbottom=False)
        plt.axis("off")

    # show score label above
    if show_score:
        for label, x in zip([q[1] for q in diagram.values()], ind):
            ax.text(
                x, 1.05, '{:4.0%}'.format(label),
                ha="center", va="center",
                bbox={"facecolor": "blue", "pad": 3}
            )

    # create legend
    if show_legend:
        plt.legend(
            reversed(bars), reversed([*grades]),
            bbox_to_anchor=(1, 1), borderaxespad=0)

    # show the figure
    plt.show()


diagram = {
    "q1": [[1, 2, 3, 4, 5, 6], 0.6],
    "q2": [[2, 3, 1, 2, 3, 1], 0.4]
}
stacked_bar_chart(diagram)
azrael
  • This is rather cumbersome, see [this question](https://stackoverflow.com/questions/40796117/how-do-i-make-the-width-of-the-title-box-span-the-entire-plot), which wants to do this for the title box. Of course one may adapt it for the bar width. On the other hand, one may rather put some text on the axes and draw a rectangle of the required dimensions behind it. In which direction would you like to go? – ImportanceOfBeingErnest Dec 02 '17 at 19:07
  • I'd prefer the former variant. I had already explored the option of drawing rectangles but failed for a reason I can't remember... then again, I'm the one asking for advice, so I'm open to whatever you think best. – azrael Dec 02 '17 at 19:49
  • I can see what I can do, but would you mind providing a [mcve] (don't use your real data, just something I can copy, paste and run). – ImportanceOfBeingErnest Dec 02 '17 at 20:01
  • I tried. It's still a bit much; the interesting part is near the comment `# show score label above` – azrael Dec 02 '17 at 20:39
  • That is no [mcve]. I created my own to answer the question. Mind that you cannot expect people to do this work for you. – ImportanceOfBeingErnest Dec 02 '17 at 23:51

1 Answer

For reasons why fixing a text box to a defined width is hard, see this question (https://stackoverflow.com/questions/40796117/how-do-i-make-the-width-of-the-title-box-span-the-entire-plot), which is about setting the width of the title text box. In principle the answer given there could be used here as well, but that would make things rather complicated.

A relatively easy solution is to specify the x position of the text in data coordinates and its y position in axes coordinates. This makes it possible to draw a rectangle with the same coordinates as a background for the text, so that it looks like the text's bounding box.

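import matplotlib.pyplot as plt
import numpy as np

ind = [1, 2, 4, 5]
data = [4, 5, 6, 4]
perc = np.array(data) / float(np.array(data).sum())
width = 0.7
pad = 3  # padding around the text, in points

fig, ax = plt.subplots()
bar = ax.bar(ind, data, width=width)

# draw once so that the axes extent queried below reflects the layout
fig.canvas.draw()

# axes height in inches, used to convert points to axes coordinates
bb = ax.get_window_extent()
h = bb.height / fig.dpi

# blended transform: x in data coordinates, y in axes coordinates
trans = ax.get_xaxis_transform()

for label, x in zip(perc, ind):
    text = ax.text(
        x, 1.00, '{:4.0%}'.format(label),
        ha="center", va="center", transform=trans, zorder=4)
    # box height: font size plus padding above and below (1 point = 1/72 inch),
    # expressed as a fraction of the axes height
    height = ((text.get_size() + 2 * pad) / 72.) / h
    # background rectangle centered on the text, exactly as wide as a bar
    rect = plt.Rectangle((x - width / 2., 1.00 - height / 2.), width=width, height=height,
                         transform=trans, zorder=3,
                         fill=True, facecolor="lightblue", clip_on=False)
    ax.add_patch(rect)

plt.show()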
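To apply the same idea to a chart like the one in the question, the loop can be wrapped in a small helper. The following is only a sketch under a few assumptions: the name `add_boxed_labels` and its parameters are made up for illustration, the default `y=1.05` is taken from the question's code, and the `Agg` backend is selected only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle


def add_boxed_labels(ax, xs, labels, width, y=1.05, pad=3, facecolor="lightblue"):
    """Hypothetical helper: draw each label above a bar, with a box as wide as the bar.

    xs and width are in data coordinates, y is in axes coordinates
    (0 = bottom of the axes, 1 = top), pad is in points.
    Returns the list of background rectangles.
    """
    fig = ax.figure
    trans = ax.get_xaxis_transform()  # x: data coordinates, y: axes coordinates
    # Axes height in inches, needed to convert points to a fraction of the axes.
    axes_height_in = ax.get_window_extent().height / fig.dpi
    rects = []
    for x, label in zip(xs, labels):
        text = ax.text(x, y, "{:.0%}".format(label), ha="center", va="center",
                       transform=trans, zorder=4)
        # Font size plus padding above and below, as a fraction of the axes height.
        height = (text.get_size() + 2 * pad) / 72.0 / axes_height_in
        rect = Rectangle((x - width / 2.0, y - height / 2.0), width, height,
                         transform=trans, zorder=3,
                         facecolor=facecolor, clip_on=False)
        ax.add_patch(rect)
        rects.append(rect)
    return rects


# Usage with two full-height bars, similar to the normalized stacks in the question.
fig, ax = plt.subplots()
ind = [0, 1]
scores = [0.6, 0.4]
width = 0.7
bars = ax.bar(ind, [1.0, 1.0], width=width)
boxes = add_boxed_labels(ax, ind, scores, width)
fig.canvas.draw()
```

Because the helper only needs an existing `Axes` and looks up the figure through `ax.figure`, it should also work on an axes obtained elsewhere, for example via `plt.gca()`. Note that the box height is computed from the axes size at call time, so resizing the figure afterwards changes how the box height relates to the text.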

ImportanceOfBeingErnest
  • Thank you so much! That did it. Though for some reason, probably because I switched to creating the actual bars with pandas in the meantime, I had to use `ax = plt.gca()` and `fig = plt.gcf()` instead of `plt.subplots()`, or it would have created two separate plots. – azrael Dec 03 '17 at 20:25