91

I'm trying to stop annotation text from overlapping in my graphs. The method suggested in the accepted answer to "Matplotlib overlapping annotations" looks extremely promising, but it is written for bar graphs. I'm having trouble converting the "axis" methods over to what I want to do, and I don't understand how the text lines up.

import sys
import matplotlib.pyplot as plt


# start new plot
plt.clf()
plt.xlabel("Proportional Euclidean Distance")
plt.ylabel("Percentage Timewindows Attended")
plt.title("Test plot")

together = [(0, 1.0, 0.4), (25, 1.0127692669427917, 0.41), (50, 1.016404709797609, 0.41), (75, 1.1043426359673716, 0.42), (100, 1.1610446924342996, 0.44), (125, 1.1685687930691457, 0.43), (150, 1.3486407784550272, 0.45), (250, 1.4013999168008104, 0.45)]
together.sort()

for x,y,z in together:
    plt.annotate(str(x), xy=(y, z), size=8)

eucs = [y for (x,y,z) in together]
covers = [z for (x,y,z) in together]

p1 = plt.plot(eucs,covers,color="black", alpha=0.5)

plt.savefig("test.png")

Images (if this works) can be found here (this code):

image1

and here (more complicated):

image2

Trenton McKinney
homebrand
  • Also see http://stackoverflow.com/questions/14938541/how-to-improve-the-label-placement-for-matplotlib-scatter-chart-code-algorithm/15859652#15859652 – tacaswell Sep 29 '13 at 02:35

5 Answers

200

I just wanted to post another solution here: a small library I wrote to handle this kind of thing, https://github.com/Phlya/adjustText

Here is an example using the data from the question:

import matplotlib.pyplot as plt
from adjustText import adjust_text
import numpy as np
together = [(0, 1.0, 0.4), (25, 1.0127692669427917, 0.41), (50, 1.016404709797609, 0.41), (75, 1.1043426359673716, 0.42), (100, 1.1610446924342996, 0.44), (125, 1.1685687930691457, 0.43), (150, 1.3486407784550272, 0.45), (250, 1.4013999168008104, 0.45)]
together.sort()

text = [x for (x,y,z) in together]
eucs = [y for (x,y,z) in together]
covers = [z for (x,y,z) in together]

p1 = plt.plot(eucs,covers,color="black", alpha=0.5)
texts = []
for x, y, s in zip(eucs, covers, text):
    texts.append(plt.text(x, y, s))

plt.xlabel("Proportional Euclidean Distance")
plt.ylabel("Percentage Timewindows Attended")
plt.title("Test plot")
adjust_text(texts, only_move={'points':'y', 'texts':'y'}, arrowprops=dict(arrowstyle="->", color='r', lw=0.5))
plt.show()


If you want a perfect figure, you can fiddle around a little. First, let's also make text repel the lines - for that we just create lots of virtual points along them using scipy.interpolate.interp1d.

For illustration, we also avoid moving the labels along the x-axis. For that we use the parameter only_move={'points':'y', 'text':'y'}. If we want them to move along the x-axis only when they overlap with other text, we use only_move={'points':'y', 'text':'xy'}. At the start, the function also chooses the optimal alignment of each text relative to its original point, and we want that to happen only along the y-axis too, hence autoalign='y'. Finally, we reduce the repelling force from points so that the text does not fly too far away because of our artificial avoidance of lines. All together:

from scipy import interpolate
p1 = plt.plot(eucs,covers,color="black", alpha=0.5)
texts = []
for x, y, s in zip(eucs, covers, text):
    texts.append(plt.text(x, y, s))

f = interpolate.interp1d(eucs, covers)
x = np.arange(min(eucs), max(eucs), 0.0005)
y = f(x)    
    
plt.xlabel("Proportional Euclidean Distance")
plt.ylabel("Percentage Timewindows Attended")
plt.title("Test plot")
adjust_text(texts, x=x, y=y, autoalign='y',
            only_move={'points':'y', 'text':'y'}, force_points=0.15,
            arrowprops=dict(arrowstyle="->", color='r', lw=0.5))
plt.show()
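To make the idea of moving labels only along y a bit more concrete, here is a toy sketch in plain Python. It is not adjustText's algorithm (the library does considerably more), and the name `spread_vertically` and the `min_gap` value are made up for illustration. It also ignores x entirely, so it separates labels even when they are far apart horizontally.

```python
def spread_vertically(ys, min_gap):
    """Push label y-positions upward until neighbours are at least min_gap apart.

    Hypothetical helper for illustration only: labels are visited from the
    lowest to the highest original y, and each one is lifted just enough to
    clear the label placed before it.
    """
    order = sorted(range(len(ys)), key=lambda i: ys[i])
    adjusted = list(ys)
    previous = None
    for i in order:
        if previous is not None and adjusted[i] - previous < min_gap:
            adjusted[i] = previous + min_gap
        previous = adjusted[i]
    return adjusted


# y-values ("covers") from the question; 0.02 is an arbitrary gap in data units
covers = [0.4, 0.41, 0.41, 0.42, 0.44, 0.43, 0.45, 0.45]
label_ys = spread_vertically(covers, min_gap=0.02)
```

The returned positions could then be passed to plt.text in place of the original y-values, with an arrow drawn back to each data point.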


Phlya
  • Nice work Phlya! You could probably also add this answer or something like it to https://stackoverflow.com/questions/9074996/matplotlib-how-to-annotate-point-on-a-scatter-automatically-placed-arrow – naught101 Dec 08 '16 at 00:08
  • Thanks, I'm glad you like it! There are a couple other questions on SO which are relevant, but I haven't seen that one... I'll try to find time to write an answer for it, but feel free to do so yourself if you want! – Phlya Dec 08 '16 at 00:13
  • @Phlya: If I run your example I receive the following error: 'float' object has no attribute 'get_position' . Removing the line with adjust_text it works fine. Any ideas what is going wrong? – Ruthger Righart Jun 26 '17 at 13:41
  • @Phlya: Following your examples at GitHub it works now, looks really splendid! Since others may find your package by first visiting SO, I would advise you to update the answer and code. – Ruthger Righart Jun 27 '17 at 11:44
  • @RuthgerRighart I am glad it works well for you! Thanks for the advice, I think I will do this at some point... – Phlya Jun 27 '17 at 21:59
  • Amazing library, thanks a lot! BTW, it works only if applying at once on ALL texts, it does not work inside a loop (eg: to replace one group after the other, it's not possible, you have to do all texts at once). – gaborous Sep 14 '17 at 00:39
  • @gaborous ooops I missed your comment, sorry - what do you mean? It can work in a loop, but it only takes into account the objects that are given to the function... – Phlya Jan 04 '18 at 22:07
  • @tommy.carstensen did you change the axes limits after calling adjust_text? That would be a problem, it should always be called in the very end. – Phlya Mar 04 '18 at 21:19
  • Great package. I'm trying to use it for different subplots where I type `texts.append(ax2.text(oscmax/numerator, 100, label)) texts.append(ax1.text(oscmax/numerator, 100, label))`. However, when I add `adjust_text(texts,color="C0",only_move={'points':'y'})` at the end, it puts all the text that should be in ax1 into ax2 (and lumps them together). From the examples in your manual page, it seems like it is possible to use it with axes, but not sure what I am doing wrong. – Nigu Jul 01 '18 at 12:20
  • Hi @Nigu, you should call it separately for each subplot, and specify `ax=ax1` or `ax=ax2`. Let me know whether this helps, otherwise please report an issue on github with a reproducible example. – Phlya Jul 01 '18 at 12:25
  • And, obviously, they need to be in separate lists for each subplot. – Phlya Jul 01 '18 at 12:30
  • @Phlya What a prompt response, thank you! Yes, it works perfectly well now. You're my savior. – Nigu Jul 01 '18 at 13:28
  • If anyone gets the error `'str' object has no attribute 'values'`, please use `adjust_text(texts, only_move={'points':'y', 'texts':'y'}, arrowprops=dict(arrowstyle="->", color='r', lw=0.5))` instead of the syntax in the answer. See https://github.com/Phlya/adjustText/issues/83 – Matthias Arras Mar 04 '20 at 17:42
  • @Phiya, is there a way to wrap `annotate` like you do with `text`? – garej Dec 30 '20 at 22:18
  • @garej sorry for a late reply - no, unfortunately, not. A lot of people are asking for that, and I understand the need, but I haven't figured out a way to do that (and don't have much time to work on the library now). Please feel free to look at the source code and contribute. – Phlya Jan 07 '21 at 19:03
11

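An easy solution for Jupyter notebooks:

%matplotlib notebook
import matplotlib.pyplot as plt
import mplcursors

# plt.scatter is the pyplot function (plt.plot has no .scatter attribute)
plt.scatter(x=YOUR_X_DATA, y=YOUR_Y_DATA)

# sel.index is the index of the selected point
# (older mplcursors releases used sel.target.index)
mplcursors.cursor(multiple=True).connect(
    "add", lambda sel: sel.annotation.set_text(
        YOUR_ANNOTATION_LIST[sel.index]
))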

Right click on a dot to show its annotation.

Left click on an annotation to close it.

Right click and drag on an annotation to move it.


Tomas G.
7

With a lot of fiddling, I figured it out. Again, credit for the original solution goes to the answer to "Matplotlib overlapping annotations".

However, I don't know how to find the exact width and height of the text. If someone knows, please post an improvement (or add a comment with the method).

import sys
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

def get_text_positions(text, x_data, y_data, txt_width, txt_height):
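    # Materialise as a list: it is indexed, assigned to and re-scanned below,
    # which a Python 3 zip iterator does not support
    a = list(zip(y_data, x_data))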
    text_positions = list(y_data)
    for index, (y, x) in enumerate(a):
        local_text_positions = [i for i in a if i[0] > (y - txt_height) 
                            and (abs(i[1] - x) < txt_width * 2) and i != (y,x)]
        if local_text_positions:
            sorted_ltp = sorted(local_text_positions)
            if abs(sorted_ltp[0][0] - y) < txt_height: #True == collision
                differ = np.diff(sorted_ltp, axis=0)
                a[index] = (sorted_ltp[-1][0] + txt_height, a[index][1])
                text_positions[index] = sorted_ltp[-1][0] + txt_height*1.01
                for k, (j, m) in enumerate(differ):
                    #j is the vertical distance between words
                    if j > txt_height * 2: #if True then room to fit a word in
                        a[index] = (sorted_ltp[k][0] + txt_height, a[index][1])
                        text_positions[index] = sorted_ltp[k][0] + txt_height
                        break
    return text_positions

def text_plotter(text, x_data, y_data, text_positions, txt_width,txt_height):
    for z,x,y,t in zip(text, x_data, y_data, text_positions):
        plt.annotate(str(z), xy=(x-txt_width/2, t), size=12)
        if y != t:
            plt.arrow(x, t,0,y-t, color='red',alpha=0.3, width=txt_width*0.1, 
                head_width=txt_width, head_length=txt_height*0.5, 
                zorder=0,length_includes_head=True)

# start new plot
plt.clf()
plt.xlabel("Proportional Euclidean Distance")
plt.ylabel("Percentage Timewindows Attended")
plt.title("Test plot")

together = [(0, 1.0, 0.4), (25, 1.0127692669427917, 0.41), (50, 1.016404709797609, 0.41), (75, 1.1043426359673716, 0.42), (100, 1.1610446924342996, 0.44), (125, 1.1685687930691457, 0.43), (150, 1.3486407784550272, 0.45), (250, 1.4013999168008104, 0.45)]
together.sort()

text = [x for (x,y,z) in together]
eucs = [y for (x,y,z) in together]
covers = [z for (x,y,z) in together]

p1 = plt.plot(eucs,covers,color="black", alpha=0.5)

txt_height = 0.0037*(plt.ylim()[1] - plt.ylim()[0])
txt_width = 0.018*(plt.xlim()[1] - plt.xlim()[0])

text_positions = get_text_positions(text, eucs, covers, txt_width, txt_height)

text_plotter(text, eucs, covers, text_positions, txt_width, txt_height)

plt.savefig("test.png")
plt.show()

Creates https://i.stack.imgur.com/xiTeU.png

The more complicated graph is now https://i.stack.imgur.com/KJeYW.png, still a bit iffy but much better!

Community
homebrand
  • and `get_window_extent()` is the artist function that you want – tacaswell Sep 30 '13 at 03:50
  • annotation.get_window_extent() returns Bbox(array([[ 349.194625, 38.0572 ], [ 372.132125, 448.0572 ]])). What does this imply about the width/height of the text? – homebrand Oct 01 '13 at 06:19
  • That is the bounding box of the text in display units. See http://matplotlib.org/users/transforms_tutorial.html and http://stackoverflow.com/questions/15882249/matplotlib-aligning-y-ticks-to-the-left/15883858#15883858 – tacaswell Oct 01 '13 at 14:49
  • This code is from http://stackoverflow.com/a/10739207/854988, right? I think it would be nice to credit the original author, @fraxel – deeenes Sep 20 '15 at 01:31
  • It seems that `a[index][2]` should be replaced by `a[index][1]` in `get_text_positions`. Elements in `a` indeed look to be tuples of size 2. The code doesn't break with the provided example because the concerned part is not executed. – etna May 24 '16 at 10:05
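The `get_window_extent()` suggestion from the comments could be used roughly like this to measure a label in data units instead of guessing `txt_width` and `txt_height`. This is only a sketch: the helper name `text_size_in_data_units` is made up, and it assumes the figure is drawn once before measuring so that the text has a renderer to be laid out with.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt


def text_size_in_data_units(ax, text_artist):
    """Return (width, height) of a text artist in the data units of `ax`.

    Hypothetical helper: the bounding box from get_window_extent() is in
    display units (pixels), so it is mapped back through the inverse of
    ax.transData.
    """
    ax.figure.canvas.draw()  # forces a layout pass so the extent is meaningful
    bbox_display = text_artist.get_window_extent()
    bbox_data = bbox_display.transformed(ax.transData.inverted())
    return abs(bbox_data.width), abs(bbox_data.height)


fig, ax = plt.subplots()
ax.plot([1.0, 1.4], [0.40, 0.45], color="black", alpha=0.5)
label = ax.text(1.0, 0.40, "250", size=12)
txt_width, txt_height = text_size_in_data_units(ax, label)
```

The result is only valid for the current axis limits and figure size, so measure after the limits are final and measure again if they change.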
0

Just wanted to add another solution I used in my code.

  1. Get the y-axis ticks and compute the difference between two consecutive ticks (y_diff).
  2. Annotate the first line, storing every "y" value of that line in a list.
  3. While annotating the second line, compare its value at each "x" (curr_y) with the first line's value at the same "x" (prev_y).
  4. Annotate only if (prev_y - curr_y) > (y_diff / 3). You can divide y_diff by a different number depending on the graph size and annotation font size.

annotation_y_values = []
for i, j in zip(x, df[df.columns[0]]):
    annotation_y_values.append(j)
    axs.annotate(str(j), xy=(i, j), color="black")

count = 0
y_ticks = axs.get_yticks()
y_diff = y_ticks[-1] - y_ticks[-2]
for i, j in zip(x, df1[df1.columns[0]]):
    df_annotate_value = annotation_y_values[count]
    current_y_val = j
    diff = df_annotate_value - current_y_val
    if diff > (y_diff/3):
        axs.annotate(str(j), xy=(i, j), color="black", size=8)
    count = count + 1
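The snippet above relies on x, df, df1 and axs defined elsewhere. A self-contained version of the same steps might look like the sketch below; the function name `annotate_second_series`, the `fraction` parameter and the sample data are assumptions for illustration, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt


def annotate_second_series(ax, x, first, second, fraction=1 / 3):
    """Plot two series, label the first fully and the second only where it
    sits far enough below the first. Returns the labelled second-series values.
    """
    ax.plot(x, first)
    ax.plot(x, second)
    for xi, yi in zip(x, first):
        ax.annotate(str(yi), xy=(xi, yi), color="black")

    # Spacing between two consecutive y ticks, used as the yardstick
    y_ticks = ax.get_yticks()
    y_diff = y_ticks[-1] - y_ticks[-2]

    labelled = []
    for xi, prev_y, curr_y in zip(x, first, second):
        # Same one-sided test as above: first-series value minus second-series value
        if prev_y - curr_y > y_diff * fraction:
            ax.annotate(str(curr_y), xy=(xi, curr_y), color="black", size=8)
            labelled.append(curr_y)
    return labelled


fig, ax = plt.subplots()
labelled = annotate_second_series(ax, [0, 1, 2], [10, 20, 30], [9.9, 10, 29.9])
fig.savefig("test.png")
```

Note that the test is one-sided, as in the original: a second-series point lying above the first series is never labelled. Comparing abs(prev_y - curr_y) instead would treat both directions the same.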

Dharman
0

Just created a package for problems like this: textalloc

The following example shows how you might use it in this case. With a few parameter tweaks you can generate a plot like this in a fraction of a second:

import textalloc as ta
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
x_lines = [np.array([0.0, 0.03192317, 0.04101177, 0.26085659, 0.40261173, 0.42142198, 0.87160195, 1.00349979]) + np.random.normal(0,0.03,(8,)) for _ in range(4)]
y_lines = [np.array([0. , 0.2, 0.2, 0.4, 0.8, 0.6, 1. , 1. ]) + np.random.normal(0,0.03,(8,)) for _ in range(4)]
text_lists = [['0', '25', '50', '75', '100', '125', '150', '250'] for _ in range(4)]

texts = []
for tl in text_lists:
    texts += tl
fig,ax = plt.subplots(dpi=100)
for x_line,y_line,text_list in zip(x_lines,y_lines,text_lists):
    ax.plot(x_line,y_line,color="black",linewidth=0.5)
ta.allocate_text(fig,ax,np.hstack(x_lines),np.hstack(y_lines),
            texts,
            x_lines=x_lines, y_lines=y_lines,
            max_distance=0.1,
            min_distance=0.025,
            margin=0.0,
            linewidth=0.5,
            nbr_candidates=400)
plt.show()


ckjellson