I'm generating a scatterplot for a pandas DataFrame named data, which contains, among others, the numeric column 'year'
with the unique values
array([2010., 2011., 2012., 2013., 2014., 2015., 2016., 2017., 2018.])
as shown with data.year.unique().
Displaying the plot like this:
ax = sns.scatterplot(x='x', y='y', hue='name', size='year', data=data, palette=sns.color_palette('deep', 7))
generates a legend in which the groupings listed for year do not match these values.
This is misleading, as the plot only contains data from 2010 to 2018.
I've tried passing a (min, max) tuple to the sns.scatterplot function, as described in the documentation, to no avail.
Changing the data type of the column 'year' to categorical does print the range of years correctly in the legend, but it yields a legend entry for every single year, which is unnecessary and takes up a lot of space.
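For reference, the categorical conversion was along these lines. This is only a sketch: the DataFrame below is made-up stand-in data, and the intermediate cast to int (so the categories read 2010 rather than 2010.0) is an assumption, not necessarily what the real code has to do.

```python
import pandas as pd

# Made-up stand-in data; only the 'year' column mirrors the real DataFrame.
data = pd.DataFrame({
    "x": range(9),
    "y": range(9),
    "year": [float(y) for y in range(2010, 2019)],
})

# Assumption: cast to int first so the categories are 2010, 2011, ...
# instead of 2010.0, 2011.0, ...
data["year"] = data["year"].astype(int).astype("category")

# One category per distinct year, hence one legend entry per year.
n_categories = len(data["year"].cat.categories)
```

With nine distinct years this gives nine categories, which is why the legend ends up with nine size entries.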
I've also tried the solution from this related thread, but it doesn't change the range of the legend entries.
How can I force seaborn to show the actual range of values in the legend? Alternatively, if it only works by using a categorical column, how can I only show every second entry in the legend?
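To illustrate what I mean by "every second entry", here is a rough sketch on a plain matplotlib legend, using made-up points and one labelled scatter call per year. It only shows the slicing idea. With the seaborn plot above, the handle list would presumably also contain the hue entries (and possibly subtitle entries, depending on the version), so the slice would have to target only the year portion.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

years = list(range(2010, 2019))

fig, ax = plt.subplots()
for i, year in enumerate(years):
    # Made-up points; marker size grows with the year, one label per year.
    ax.scatter([i], [i], s=20 + 10 * i, color="C0", label=str(year))

# Collect every legend entry, then keep only every second one.
handles, labels = ax.get_legend_handles_labels()
kept_handles = handles[::2]
kept_labels = labels[::2]
legend = ax.legend(kept_handles, kept_labels, title="year")
```

Here kept_labels holds the years 2010, 2012, 2014, 2016 and 2018, which is the kind of thinned legend I'm after.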