I struggle with customizing the legend of my scatterplot. Here is a snapshot :
And here is a code sample :
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set()
my_df = pd.DataFrame([[5, 3, 1], [2, 1, 2], [3, 4, 1], [1, 2, 1]],
columns=["DUMMY_CT", "FOO_CT", "CI_CT"])
g = sns.scatterplot("DUMMY_CT", "FOO_CT", data=my_df, size="CI_CT")
g.set_title("Number of Baz", weight="bold")
g.set_xlabel("Dummy count")
g.set_ylabel("Foo count")
g.get_legend().set_title("Baz count")
Also, I work in a Jupyter-lab notebook with Python 3, if it helps.
The red thingy issue
First things first, I wish to hide the name of the CI_CT
variable (contoured in red on the picture). After exploring the whole documentation for this afternoon, I found the get_legend_handlers_label
method (see here), which produces the following :
>>> g.get_legend_handles_labels()
([<matplotlib.collections.PathCollection at 0xfaaba4a8>,
<matplotlib.collections.PathCollection at 0xfaa3ff28>,
<matplotlib.collections.PathCollection at 0xfaa3f6a0>,
<matplotlib.collections.PathCollection at 0xfaa3fe48>],
['CI_CT', '0', '1', '2'])
Where I can spot my dear CI_CT
string. However, I'm unable to change this name or to hide it completely. I found a dirty way, that basically consists in not using efficiently the dataframe passed as a data
parameter. Here is the scatterplot
call :
g = sns.scatterplot("DUMMY_CT", "FOO_CT", data=my_df, size=my_df["CI_CT"].values)
Result here :
It works, but is there a cleaner way to achieve this?
The green thingy issue
Displaying a 0
level in this legend is incorrect, since there is no zero value in the column CI_CT
of my_df
. It is therefore misleading for the readers, who might assume the smaller dots represents a value of 0 or 1. I wish to setup a defined scale, in the way one can do it for the x and y axis. However, I cannot achieve it. Any idea?
TL;DR : A broader question that could solve everything
Those adventures make me wonder if there is a way to handle the data you can pass to the scatterplots with hue
and size
parameters in a clean, x-and-y-axis way. Is it actually possible?
Please pardon my English, please let me know if the question is too broad or uncorrectly labelled.