
I am adding text annotations to points of a scatter plot as follows:

[Figure: scatter plot whose vertical text annotations extend beyond the plot area]

I do not know the length of the text annotations in advance. The figure above shows that the text annotations extend beyond the plot area.

I want to find the length of each text annotation in data coordinates to dynamically resize the plot by setting ylim to fit all the text annotations within the plot area.

I tried three approaches based on Stack Overflow answers, which are linked in the comments of each function below.

I tested using two setups:

  • macOS Catalina - 10.15.7

  • Python - 3.8.14

  • matplotlib - 3.3.1

and

  • Windows 11 - 22H2

  • Python - 3.10.2

  • matplotlib - 3.5.1

import matplotlib.pyplot as plt
from matplotlib.transforms import TransformedBbox, Bbox


def get_text_object_height_1(text_obj, _ax):
    # Based on
    # https://stackoverflow.com/questions/24581194/matplotlib-text-bounding-box-dimensions

    # to get the text bounding box we need to draw the plot
    _fig = _ax.get_figure()
    _fig.canvas.draw()

    # get bounding box of the text in the data coordinates
    bb = text_obj.get_window_extent(renderer=_fig.canvas.get_renderer())
    transform = _ax.transData.inverted()
    trans_box = TransformedBbox(bb, transform)

    return trans_box.height


def get_text_object_height_2(text_obj, _ax):
    # Based on
    # https://stackoverflow.com/a/35419796/2912349
    # https://stackoverflow.com/questions/58854335/how-to-label-y-ticklabels-as-group-category-in-seaborn-clustermap/58915100#58915100

    # Must move the plot to see the last update. When the plot is saved, the last update is included.

    # the figure needs to have been drawn once, otherwise there is no renderer
    plt.ion()
    plt.show()
    plt.pause(1)  # Can make the pause smaller. Kept it larger to see the action.

    # get bounding box of the text in the data coordinates
    bb = text_obj.get_window_extent(renderer=_ax.get_figure().canvas.get_renderer())
    transform = _ax.transData.inverted()
    trans_box = TransformedBbox(bb, transform)

    plt.ioff()

    return trans_box.height


def get_text_object_height_3(text_obj, _ax):
    # Based on
    # https://stackoverflow.com/questions/5320205/matplotlib-text-dimensions

    # note: unlike the first version, this one does not redraw the figure first
    _fig = _ax.get_figure()

    # get text bounding box in figure coordinates
    renderer = _fig.canvas.get_renderer()
    bbox_text = text_obj.get_window_extent(renderer=renderer)

    # transform bounding box to data coordinates
    trans_box = Bbox(_ax.transData.inverted().transform(bbox_text))

    return trans_box.height


x = [1, 2, 3]
y = [5, 8, 7]
labels = ['mid length', 'short', 'this is a long label']

fig, ax = plt.subplots(dpi=300, figsize=(5, 3))
ax.scatter(x=x, y=y)

y_lim_max = 0
gap = 0.3

for idx, label in enumerate(labels):
    label_y_start = y[idx] + gap
    txt = ax.text(x[idx], label_y_start, label, rotation='vertical', fontdict=dict(color='black', alpha=1, size=8),
                  transform=ax.transData)

    # Can change to _2, _3 to see how other two methods work
    label_length = get_text_object_height_1(txt, ax)

    label_y_end = label_y_start + label_length
    # Just to show the computed label length
    plt.plot([x[idx], x[idx]], [label_y_start, label_y_end])
    y_lim_max = label_y_end if y_lim_max < label_y_end else y_lim_max

print(f'\t{y_lim_max:8.2f}')
plt.ylim(0, y_lim_max)
plt.tight_layout()

plt.show()
# plt.savefig('scaled_plot.png')

However, the computed lengths are shorter than the rendered text annotations. To demonstrate this, I plot a line beside each text annotation whose length is the computed length of that annotation.

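[Figure: resulting plot, in which the lines showing the computed lengths are shorter than the text annotations]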

Am I doing something incorrectly?

Is there a way to get the correct dimensions of the text annotations?

Is there a different approach to resizing the plot area to include all the text annotations?

manujinda
  • Remove `plt.ylim(0, y_lim_max)`. If you change the shape of the plot after setting everything up, it's going to break. You can set `ax.set_ylim(0, 12)` before, and as long as all the text is inside the spines, the line will be long enough. See [code and plot](https://i.stack.imgur.com/RWJhP.png). The lines will stop at the edge of the spines if the text goes out of the plot area. – Trenton McKinney Apr 27 '23 at 23:53
  • I see your point. With the modification you suggested, I can see that lines span the length of the text label, and the label length calculation seems correct. However, to do that, I should know the maximum y limit (e.g., 12 in this case), which I do not know in advance. "If you change the shape of the plot after setting everything up, it's going to break" makes me feel that what I am after is impossible to achieve. – manujinda Apr 28 '23 at 00:15
  • A [janky](https://www.merriam-webster.com/dictionary/janky) way to do it if you must have the values, is to create a dummy plot with the annotations and get the length, and then use that to set the limit of the "real" plot. – Trenton McKinney Apr 28 '23 at 00:22
  • You can do some janky plotting like this: [code and plot](https://i.stack.imgur.com/tKI0t.png) – Trenton McKinney Apr 28 '23 at 00:47

1 Answer

This is not an answer per se but a demonstration of the impossibility of a clean solution.

Building upon Trenton McKinney's comments, I created the following iterative solution: the plot is redrawn repeatedly until the computed y limit stops changing by more than a small tolerance.

import matplotlib.pyplot as plt
from matplotlib.transforms import TransformedBbox


def get_text_object_height_1(text_obj, _ax):
    # Based on
    # https://stackoverflow.com/questions/24581194/matplotlib-text-bounding-box-dimensions

    # to get the text bounding box we need to draw the plot
    _fig = _ax.get_figure()
    _fig.canvas.draw()

    # get bounding box of the text in the data coordinates
    bb = text_obj.get_window_extent(renderer=_fig.canvas.get_renderer())
    transform = _ax.transData.inverted()
    trans_box = TransformedBbox(bb, transform)

    return trans_box.height


def get_text_object_height_2(text_obj, _ax):
    # Based on
    # https://stackoverflow.com/a/35419796/2912349
    # https://stackoverflow.com/questions/58854335/how-to-label-y-ticklabels-as-group-category-in-seaborn-clustermap/58915100#58915100

    # Must move the plot to see the last update. When the plot is saved, the last update is included.

    # the figure needs to have been drawn once, otherwise there is no renderer
    plt.ion()
    plt.show()
    plt.pause(1)  # Can make the pause smaller. Kept it larger to see the action.

    # get bounding box of the text in the data coordinates
    bb = text_obj.get_window_extent(renderer=_ax.get_figure().canvas.get_renderer())
    transform = _ax.transData.inverted()
    trans_box = TransformedBbox(bb, transform)

    plt.ioff()

    return trans_box.height


x = [1, 2, 3]
y = [5, 8, 7]
labels = ['mid length', 'short', 'this is a long label']

y_lim_max_new = max(y)
tolerance = 10E-5
gap = 0.3
keep_adjusting = True

iteration = 0

while keep_adjusting:
    y_lim_max_old = y_lim_max_new

    plt.close()
    fig, ax = plt.subplots(dpi=300, figsize=(5, 3))
    ax.scatter(x=x, y=y)
    ax.set_ylim(0, y_lim_max_new)

    for idx, label in enumerate(labels):
        label_y_start = y[idx] + gap
        txt = ax.text(x[idx], label_y_start, label, rotation='vertical', fontdict=dict(color='black', alpha=1, size=8),
                      transform=ax.transData)

        # To watch intermediate plots as they get scaled, change to get_text_object_height_2
        label_length = get_text_object_height_1(txt, ax)

        label_y_end = label_y_start + label_length
        # Just to show the computed label length
        plt.plot([x[idx], x[idx]], [label_y_start, label_y_end])
        y_lim_max_new = label_y_end if label_y_end > y_lim_max_new else y_lim_max_new

    # Check whether y_lim_max_new has "significantly" changed from the previous value
    keep_adjusting = abs(y_lim_max_new - y_lim_max_old) > tolerance

    iteration += 1
    print(f'\t{iteration:3}.\t{y_lim_max_new:8.10f}\t{y_lim_max_new - y_lim_max_old:8.10f}')

plt.tight_layout()

plt.show()
# plt.savefig('scaled_plot.png')

On the system where I tested the code, the loop ran 14 times before the change fell below the tolerance, meaning it computed 14 different label heights along the way.

My intuition of what might be happening

The coordinate system transformation matrices may depend on the xlim and ylim values. We use the current transformation matrices to compute the label's dimensions in data coordinates and then update ylim. Updating ylim invalidates the transformation matrices we used, so the dimensions we computed are no longer correct in the updated coordinate system.
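This intuition can be checked directly. The following is a minimal sketch, not part of the tests above: the helper name `text_heights` is hypothetical, the Agg backend is chosen only so the script runs without opening a window, and it assumes a linear y-axis starting at 0 with no layout change (such as `tight_layout`) between the two measurements. It measures the same label under two different y limits, once in pixels and once in data coordinates.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt


def text_heights(ax, txt):
    """Return (height in pixels, height in data units) of a text artist.

    Hypothetical helper for illustration only.
    """
    fig = ax.get_figure()
    fig.canvas.draw()  # refresh the renderer and transforms for the current limits
    bb_px = txt.get_window_extent(renderer=fig.canvas.get_renderer())
    bb_data = bb_px.transformed(ax.transData.inverted())
    return bb_px.height, bb_data.height


fig, ax = plt.subplots(dpi=100, figsize=(5, 3))
ax.set_xlim(0, 4)
txt = ax.text(1, 5, "this is a long label", rotation="vertical", size=8)

results = {}
for y_max in (10, 20):
    ax.set_ylim(0, y_max)  # changing the limits changes ax.transData
    results[y_max] = text_heights(ax, txt)
    px, data = results[y_max]
    print(f"ylim=(0, {y_max}): {px:.1f} px, {data:.3f} data units")

plt.close(fig)
```

If the intuition holds, the pixel height is the same for both limits while the height in data units doubles, because on a linear axis one pixel covers (ymax - ymin) divided by the axes height in pixels. A height measured before changing ylim is therefore stale afterwards.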

manujinda