With matplotlib, I plot some points with the scatter method (see code below). I would like to label each point individually.

This code labels the whole set of points with the entire labels array, but I would like the first point to be labeled with labels[0], the second with labels[1], and so on.

import numpy as np; import matplotlib.pyplot as plt
y = np.arange(10) # points to plot
labels = np.arange(10) # labels of the points
fig, ax = plt.subplots(nrows=1, ncols=1)
ax.scatter(x=np.arange(10), y=y, label=labels, picker=3)

Is there any way to do that? And BTW, is there any way to iterate through the points in ax? The method ax.get_children() yields data I don't understand.

Thanks!

Niourf
  • Use `annotate`. It lets you add labels + arrows pointing at arbitrary points. – tacaswell Apr 16 '14 at 19:21
  • https://stackoverflow.com/questions/14938541/how-to-improve-the-label-placement-for-matplotlib-scatter-chart-code-algorithm/15859652#15859652 – tacaswell Apr 16 '14 at 19:21
  • Do you want to label each point (that is put a text label in the axes pointing on the point or) do you want a legend entry for each point? – tacaswell Apr 17 '14 at 12:30
  • @tcaswell The point behind all this is to connect the `pick_event` to a callback that will print the label of the artist : `def callback (event) : print event.artist.get_label()` So I guess I just want a legend entry? I hoped there would be one artist for every point but apparently this is not possible unless I use one `scatter` per point (answer below) – Niourf Apr 17 '14 at 14:35
  • Look at what data the `event` object carries. I _think_ it should carry an index of the point you hit. If so, then just have your callback look up the label in your label list. – tacaswell Apr 17 '14 at 15:23
  • You are right, `event.ind` will give me the index of my point. Then I would have to parse the string returned by `get_label()` to get the right label. Not very convenient but it works! Thanks. – Niourf Apr 17 '14 at 15:33
  • See my answer, you can automate almost all of it – tacaswell Apr 17 '14 at 15:46
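
The `annotate` suggestion from the comments above can be sketched roughly as follows. This is a minimal illustration, not the commenter's own code: the 4-point text offset and the Agg backend are assumptions made here so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(10)
y = np.arange(10)       # points to plot
labels = np.arange(10)  # labels of the points

fig, ax = plt.subplots()
ax.scatter(x, y)

# One text annotation per point: labels[0] on the first point, labels[1] on
# the second, and so on. The (4, 4) offset in points is an arbitrary choice.
texts = [
    ax.annotate(str(label), xy=(x_, y_), xytext=(4, 4),
                textcoords="offset points")
    for x_, y_, label in zip(x, y, labels)
]
```

This draws visible text next to each marker. It does not attach the label to the artist, so it does not by itself help with `pick_event` callbacks.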

2 Answers

Assuming that you aren't plotting many scatter points, you could just do a scatter for every point:

import numpy as np; import matplotlib.pyplot as plt
y = np.arange(10) # points to plot
x=np.arange(10)
labels = np.arange(10) # labels of the points
fig, ax = plt.subplots(nrows=1, ncols=1)
for x_,y_,label in zip(x,y,labels):
    ax.scatter([x_], [y_], label=label, picker=3)

This will start to lag if you are plotting many thousands or tens of thousands of points, but if it is just a few, then it is no problem.

To answer the second part of your question, ax.get_children() returns a list of objects that compose those axes, for instance:

[<matplotlib.axis.XAxis at 0x103acc410>,
 <matplotlib.axis.YAxis at 0x103acddd0>,
 <matplotlib.collections.PathCollection at 0x10308ba10>, #<--- this is a set of scatter points
 <matplotlib.text.Text at 0x103082d50>,
 <matplotlib.patches.Rectangle at 0x103082dd0>,
 <matplotlib.spines.Spine at 0x103acc2d0>,
 <matplotlib.spines.Spine at 0x103ac9f90>,
 <matplotlib.spines.Spine at 0x103acc150>,
 <matplotlib.spines.Spine at 0x103ac9dd0>]

If you are just looking to get the sets of scatter points in your axes, the easiest way is through ax.collections. This is a list that contains all the collection instances plotted in the axes (scatter points belong to a PathCollection).

In [9]: ax.collections
Out[9]: [<matplotlib.collections.PathCollection at 0x10308ba10>]

If you have plotted a separate scatter for each point, iterating over the points is as easy as:

# iterate over points and turn them all red
for point in ax.collections:
    point.set_facecolor("red") 
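
If all points come from a single scatter call, one possible way to walk through them is to read the positions back from the collection with get_offsets() and pair them with your own label list. This is only a sketch: the label strings and the name `labelled` are made up for illustration, and the Agg backend is assumed so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(10)
y = np.arange(10)
labels = ["point_%d" % i for i in range(10)]  # hypothetical labels

fig, ax = plt.subplots()
sc = ax.scatter(x, y)  # a single PathCollection holding all ten points

# get_offsets() returns an (N, 2) array of the x/y positions in the
# collection, in the order the points were passed to scatter
offsets = np.asarray(sc.get_offsets())

# pair each position with the label at the same index
labelled = [(label, float(px), float(py))
            for label, (px, py) in zip(labels, offsets)]
```

The collection itself knows nothing about the labels here. The pairing relies purely on the points and the label list sharing the same order.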
ebarr
  • Thanks for this detailed answer. I suppose then there is no way to iterate through the points if I don't use a separate `scatter` for each point? If so, do you know why? It would seem to me like a pretty useful/not-that-bizarre feature, because I do have thousands of points to plots. – Niourf Apr 17 '14 at 09:01
  • @Niourf I'm not sure, I have looked for ways to do this on several occasions, but I have not had much luck. However, I think a better way to do this might be to use `plot` rather than `scatter`, with `lw=0`. The class used for lines has more flexibility than the `PathCollections` class. – ebarr Apr 17 '14 at 09:42
  • Okay. I use `scatter` rather than `plot` because I need to use some fancy coloring. I am quite new to matplotlib but I tend to find it a lot more complicated than it could be... – Niourf Apr 17 '14 at 10:59
  • @Niourf It is about as complicated as it needs to be (just about anything that looks too complicated at first is that way to allow something else to work). The logical building blocks of an mpl plot are 'artists' (which may know about data), not 'points' (data). iirc `scatter` returns a patch collection object which gives you functions to change the color mapping, marker size, and value (`set_data`). You should know where they are because you still have your `x` and `y` arrays. – tacaswell Apr 17 '14 at 12:39
  • @tcaswell `PathCollections` does not have `set_data` or `get_data` methods. – ebarr Apr 17 '14 at 12:44
  • sorry, should have been `get_array` and `set_array` – tacaswell Apr 17 '14 at 12:50
  • http://stackoverflow.com/questions/18229563/using-networkx-with-matplotlib-artistanimation/18232645#18232645 – tacaswell Apr 17 '14 at 12:52
  • @tcaswell The `set_array` and `get_array` methods only appear to change the colours of the points, not their positions. From the answer you linked, it looks like the logical equivalent of `set_data` is `set_offsets`, correct? – ebarr Apr 17 '14 at 21:58
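
The distinction discussed in the last few comments can be checked with a small sketch. It assumes a scatter created with a `c=` array so that colour mapping is active, uses made-up data, and uses the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(10)
y = np.arange(10)

fig, ax = plt.subplots()
# c= gives the collection a value array that is mapped to colours
sc = ax.scatter(x, y, c=np.linspace(0.0, 1.0, 10))

# set_array replaces the values that drive the colour mapping;
# the positions are left untouched
new_values = np.linspace(1.0, 0.0, 10)
sc.set_array(new_values)

# set_offsets replaces the positions; it expects an (N, 2) array of x/y pairs
new_xy = np.column_stack([x, y[::-1]])
sc.set_offsets(new_xy)
```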

All of this can be wrapped up and hidden in a function or a class:

# import stuff
import matplotlib.pyplot as plt
import numpy as np

# create dictionary we will close over (twice)
label_dict = dict()
# helper function to do the scatter plot + shove data into label_dict
def lab_scatter(ax, x, y, label_list, *args, **kwargs):
    if 'picker' not in kwargs:
        kwargs['picker'] = 3
    sc = ax.scatter(x, y, *args, **kwargs)
    label_dict[sc] = label_list
    return sc
# callback function which also closes over label_dict; should add more sanity checks
# (that the artist is actually in the dict, deal with multiple hits in ind, etc.)
def cb_fun(event):
    # grab list of labels from the dict, print the right one
    print(label_dict[event.artist][event.ind[0]])
# create the figure and axes to use
fig, ax = plt.subplots(1, 1)
# loop over 5 synthetic data sets
for j in range(5):
    # use our helper function to do the plotting
    lab_scatter(ax,
                np.ones(10) * j,
                np.random.rand(10),
                # give each point a unique label
                label_list = ['label_{s}_{f}'.format(s=j, f=k) for k in range(10)])
# connect up the call back function
cid = fig.canvas.mpl_connect('pick_event', cb_fun)
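
The sanity checks mentioned in the comments of the code above might look something like this. It is a sketch under a few assumptions: `picked_labels` is a hypothetical helper name, the fake event built with `types.SimpleNamespace` only stands in for a real pick event so the logic can be exercised without clicking, and the Agg backend is used so no window is needed.

```python
import types
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

label_dict = {}

def picked_labels(event):
    """Return the labels of all points hit by a pick event."""
    labels = label_dict.get(event.artist)
    if labels is None:
        # the picked artist was not registered in label_dict
        return []
    # event.ind may contain several indices when markers overlap
    return [labels[i] for i in event.ind]

def cb_fun(event):
    for label in picked_labels(event):
        print(label)

fig, ax = plt.subplots()
sc = ax.scatter(np.arange(3), np.arange(3), picker=True)  # enable picking
label_dict[sc] = ["a", "b", "c"]
cid = fig.canvas.mpl_connect("pick_event", cb_fun)

# stand-in for a real pick event that hit points 0 and 2
fake_event = types.SimpleNamespace(artist=sc, ind=np.array([0, 2]))
hits = picked_labels(fake_event)
```

Splitting the lookup from the printing keeps the callback trivial and lets the lookup be tested on its own.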
tacaswell