I'm trying to plot a multi-dimensional scatterplot that encodes data across several visual properties (facets, hue, shape, x, y). I also want a tooltip on cursor hover that shows additional properties of the point. (I'm using seaborn + mplcursors, but I'm not married to this solution.) The problem is that the hover callback gets an index that doesn't match the point's row in the dataset, so the tooltip displays the wrong information. You can see this in the following toy example, assembled from two examples on the seaborn and mplcursors websites.

I believe I've traced the issue to the callback registered with cursor.connect() not receiving the proper index into the dataframe. The example works if I reduce the number of modifiers (hue, col, row, etc.), but not with all of them included.

import seaborn as sns
import matplotlib.pyplot as plt
import mplcursors


df = sns.load_dataset("tips")

sns.relplot(data=df, x="total_bill", y="tip", hue="day", col="time", row="sex")


def show_hover_panel(get_text_func=None):
    cursor = mplcursors.cursor(
        hover=2,  # Transient
        annotation_kwargs=dict(
            bbox=dict(
                boxstyle="square,pad=0.5",
                facecolor="white",
                edgecolor="#ddd",
                linewidth=0.5,
            ),
            linespacing=1.5,
            arrowprops=None,
        ),
        highlight=True,
        highlight_kwargs=dict(linewidth=2),
    )

    if get_text_func:
        cursor.connect(
            event="add",
            func=lambda sel: sel.annotation.set_text(get_text_func(sel.index)), # <- this doesn't appear to return the correct integer index in the dataframe
        )

    return cursor


def on_add(index):
    item = df.iloc[index] 
    parts = [
        f"total_bill: {item.total_bill}",
        f"tip: {item.tip}",
        f"day: {item.day}",
        f"time: {item.time}",
        f"sex: {item.sex}",
    ]

    return "\n".join(parts)


show_hover_panel(on_add)

plt.show()

[Image: example of issue]

What I tried:

  • built a minimal reproducible example (above)
  • removed modifiers: the tooltip then works
  • traced back the correct point locations based on the data, BUT when I pass the index to the tooltip, I notice that it doesn't correspond to the proper index in the dataframe.
Trenton McKinney
diyer0001
  • The ith point drawn on a given facet is not going to correspond to the ith row in a dataframe. – mwaskom Aug 29 '23 at 10:55
  • Thanks for the comment. Is there a way (sorting or otherwise) to make them correspond? – diyer0001 Aug 29 '23 at 12:57
  • As shown [here](https://stackoverflow.com/a/61337574/7758804), `mplcursors` works fine to show x and y with a multi-faceted `relplot`. However, you need a way to also filter by the `col` and `row` to correctly select the extra data from the dataframe. – Trenton McKinney Aug 29 '23 at 15:36
  • Yes and I don't believe that `mplcursors` is aware of that information directly, sadly. – diyer0001 Aug 29 '23 at 17:23
  • Hopefully the answer is helpful. Thoroughly answering questions is time-consuming. If your question is **solved**, please _**accept** the solution_. The **✔** is below the **▲/▼** arrow, at the top left of the answer. A new solution can be accepted if a better one shows up. You may also vote on the usefulness of an answer with the **▲/▼** arrow. **Leave a comment if a solution doesn't answer the question.** [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers). – Trenton McKinney Aug 30 '23 at 15:47

1 Answer

sns.relplot returns a FacetGrid, which has an axes_dict attribute: a dictionary that maps each (row, col) combination to the corresponding subplot (ax). Based on this, you can create a new dictionary that maps each ax to the corresponding subset of the dataframe. (Note that this might occupy a lot of extra memory for a large dataframe.)

The selected artist in mplcursors keeps a reference to its subplot (sel.artist.axes), which can be used as a key in the new dictionary.
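
To see why the per-subplot subset matters, here is a minimal pandas-only sketch. The tiny dataframe and its values are made up for illustration, and it assumes (as the approach here does) that the points on a facet are drawn in the same order as the matching rows of the dataframe, so that sel.index is a position within the facet's subset.

```python
import pandas as pd

# Hypothetical miniature stand-in for the tips data (values are made up).
df = pd.DataFrame(
    {
        "total_bill": [10.0, 20.0, 30.0, 40.0, 50.0],
        "sex": ["Female", "Male", "Female", "Male", "Male"],
        "time": ["Dinner", "Dinner", "Lunch", "Dinner", "Lunch"],
    }
)

# Rows that would end up on the (Male, Dinner) facet, in dataframe order.
facet = df[(df["sex"] == "Male") & (df["time"] == "Dinner")]

# Position of the hovered point within that facet, as sel.index would report it.
point_index = 1

# Looking the position up in the full dataframe picks an unrelated row ...
wrong = df.iloc[point_index]

# ... while looking it up in the facet's own subset picks the hovered point.
right = facet.reset_index(drop=True).iloc[point_index]

print(wrong.total_bill, right.total_bill)
```

The two lookups return different rows, which is the mismatch described in the question.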

Here is what the complete example could look like. The annotation logic is now longer, so it gets its own function instead of a lambda.

import seaborn as sns
import matplotlib.pyplot as plt
import mplcursors

df = sns.load_dataset("tips")

g = sns.relplot(data=df, x="total_bill", y="tip", hue="day", col="time", row="sex")

# create a dictionary mapping subplots to their corresponding subset of the dataframe
subplot_df_dict = dict()
for (sex, time), ax in g.axes_dict.items():
    subplot_df_dict[ax] = df[(df['sex'] == sex) & (df['time'] == time)].reset_index(drop=True)

def show_annotation(sel):
    ax = sel.artist.axes
    item = subplot_df_dict[ax].iloc[sel.index]
    parts = [
        f"total_bill: {item.total_bill}",
        f"tip: {item.tip}",
        f"day: {item.day}",
        f"time: {item.time}",
        f"sex: {item.sex}",
    ]
    sel.annotation.set_text("\n".join(parts))

def show_hover_panel(show_annotation_func=None):
    cursor = mplcursors.cursor(
        hover=2,  # Transient
        annotation_kwargs=dict(
            bbox=dict(
                boxstyle="square,pad=0.5",
                facecolor="white",
                edgecolor="#ddd",
                linewidth=0.5,
            ),
            linespacing=1.5,
            arrowprops=None,
        ),
        highlight=True,
        highlight_kwargs=dict(linewidth=2),
    )
    if show_annotation_func is not None:
        cursor.connect(
            event="add",
            func=show_annotation_func
        )
    return cursor

show_hover_panel(show_annotation)
plt.show()

[Image: seaborn FacetGrid with mplcursors tooltip]
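
If the extra memory of one dataframe copy per subplot is a concern, one possible variant is to store only the original index labels of each facet's rows and look the row up in the original dataframe. This is a sketch, not tested against a real FacetGrid: build_facet_labels and lookup are hypothetical helper names, the dataframe values are made up, and it again assumes the points on a facet are drawn in dataframe order.

```python
import pandas as pd

# Hypothetical miniature stand-in for the tips data (values are made up).
df = pd.DataFrame(
    {
        "total_bill": [10.0, 20.0, 30.0, 40.0, 50.0],
        "sex": ["Female", "Male", "Female", "Male", "Male"],
        "time": ["Dinner", "Dinner", "Lunch", "Dinner", "Lunch"],
    }
)


def build_facet_labels(data, row, col):
    """Map (row_value, col_value) to the index labels of the rows in that facet."""
    # groupby keeps the original row order inside each group.
    groups = data.groupby([row, col], sort=False, observed=True)
    return {key: group.index for key, group in groups}


def lookup(data, facet_labels, key, point_index):
    """Return the original row for the point_index-th point of facet `key`."""
    return data.loc[facet_labels[key][point_index]]


facet_labels = build_facet_labels(df, row="sex", col="time")

# Second point of the (Male, Dinner) facet, looked up in the original dataframe.
item = lookup(df, facet_labels, ("Male", "Dinner"), 1)
print(item.total_bill)
```

With the FacetGrid g from the code above, the same labels could be keyed by subplot, e.g. {ax: facet_labels[key] for key, ax in g.axes_dict.items()}, and the callback would then read df.loc[labels_for_ax[sel.artist.axes][sel.index]]. As far as I know, seaborn drops rows with missing x or y values before drawing, so with such data the rows would need to be filtered the same way before building the mapping.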

JohanC