The following code produces a seaborn pairplot.

How can I make the red point (with b = 10.0) visible in the c/a subplot (bottom left)?

At present it is almost invisible, because the points with b = 4 and b = 5 seem to be plotted after it and hide it.

Sorting the DataFrame unfortunately does not help.

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

def supplyHueByB(x, bMax):
    amountOfSegments = 8
    myReturn = int(x * amountOfSegments / bMax)
    return myReturn

myList = [
    [0.854297, 1.973376, 0.187038],
    [0.854297, 2.204028, 0.012476],
    [0.854297, 10.0, 0.056573],
    [0.854297, 5.0, 0.050635],
    [0.854297, 4.0, 0.058926]
]
df = pd.DataFrame(myList)
df.columns=['a', 'b', 'c']
bMax = df.b.max()
hue = df.b.apply(lambda x: supplyHueByB(x, bMax))

g = sns.pairplot(
    df,
    corner=True,
    diag_kws=dict(color=".6"),
    vars=['a', 'b', 'c'],
    plot_kws=dict(
        hue=hue,
        palette="coolwarm",
        edgecolor='None',
        s=80  # size
    ),
)

plt.subplots_adjust(bottom=0.1)
g.add_legend()
plt.show()
Trenton McKinney
7824238

2 Answers

df and hue have to be sorted in tandem:

>>> g = sns.pairplot(
...     df.sort_values('b'),
...     corner=True,
...     diag_kws=dict(color=".6"),
...     vars=['a', 'b', 'c'],
...     plot_kws=dict(
...         hue=sorted(hue),
...         palette="coolwarm",
...         edgecolor='None',
...         s=80  # size
...     ),
... )

The above produces the desired output, i.e., the red point is plotted after the light blue ones. In this example, sort_values and sorted do the trick because sorting by b also sorts the derived hue values. For a custom plotting order, one may need to be more creative, but the key principle remains that the ordering of df must be consistent with that of hue.
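For a custom order, one possible approach is to compute the row positions once and apply them to both df and hue. The sketch below is only an illustration: draw_order and pairplot_in_order are hypothetical helper names, not part of seaborn or pandas, and passing hue as a plain list is an assumption carried over from the sorted(hue) call above, which also yields a list.

```python
def draw_order(values, on_top):
    """Return row positions so that rows whose value is in `on_top` come last.

    sorted() is stable, so all other rows keep their original relative order.
    """
    return sorted(range(len(values)), key=lambda i: values[i] in on_top)


def pairplot_in_order(df, hue, order):
    """Hypothetical helper: reorder df and hue identically, then plot.

    Assumes `hue` is a pandas Series with one entry per row of `df`.
    """
    import seaborn as sns  # imported lazily so draw_order works without seaborn

    # Apply the same positional order to the data and to the hue values.
    ordered_df = df.iloc[order].reset_index(drop=True)
    ordered_hue = [hue.iloc[i] for i in order]  # plain list, like sorted(hue)

    return sns.pairplot(
        ordered_df,
        corner=True,
        diag_kws=dict(color=".6"),
        vars=['a', 'b', 'c'],
        plot_kws=dict(
            hue=ordered_hue,
            palette="coolwarm",
            edgecolor='None',
            s=80,  # marker size
        ),
    )
```

With the question's data, draw_order(df['b'].tolist(), {10.0}) moves the row with b = 10.0 to the end, and g = pairplot_in_order(df, hue, order) would then draw that point last. Whether the layering follows the row order may depend on the seaborn version, so treat this as a starting point.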

Nikolaos Chatzis

I was surprised to see that Seaborn does not perform the layering (i.e., which point goes above another point) in the order that the points are passed, since Matplotlib definitely does that, and Seaborn is built atop Matplotlib.

Following Matplotlib's ordering, you would want the point [a=0.854297, c=0.056573] (i.e., the point being hidden) to be plotted after the other two points close to it [a=0.854297, c=0.050635] and [a=0.854297, c=0.058926]. This is so that [a=0.854297, c=0.056573] is plotted last and hence not masked.

Since Seaborn does not seem to do this out of the box, I reordered [a=0.854297, c=0.056573] to be plotted last.

# layer_order is the order (first to last) in which we want the points to be plotted.
layer_order = [0, 1, 3, 4, 2]

# Extracting df['a'] and df['c'] in the order we want (by position, so a non-default index is safe too).
a = df['a'].iloc[layer_order]
c = df['c'].iloc[layer_order]

# Highlighting the last point in red to show it is not hidden.
colors = ['blue'] * 4 + ['red']

# Axis 3 is where we have the problem. Clearing its contents first.
g.figure.axes[3].clear()
g.figure.axes[3].scatter(a, c, color=colors)

In the resulting plot, the point in red is, as desired, no longer hidden behind the points in blue.

You might want to refactor the code to be better, but I hope this gives you the underlying idea.

[I have plotted the points in blue and red, but you can change them to the hex values that you like to match the other Seaborn plots.]

Nikhil Kumar