For a seaborn swarmplot, I'd like to give different points different marker sizes.

In the snippet below, for example, I want to use the marker_size key in plot_data to specify the size of the points in the swarmplot. According to the seaborn documentation, swarmplot takes a size parameter, but it has to be a float, so I can't use it for what I want to do.

Here is my code:

import seaborn as sns
import numpy as np
N = 100
plot_data = dict(category=np.random.choice([1, 2, 3], size=N), 
                 values=np.random.randn(N), 
                 marker_size=np.arange(N))
sns.swarmplot(x="category", y="values", data=plot_data)

Does anyone know what I can do to specify different point sizes for the swarmplot?

Marcus Campbell
tkunk

2 Answers

As per @mwaskom's comment, this is not possible in seaborn directly: swarmplot's size parameter takes a single value that applies to every point.

joelostblom

This is how I made it work. It is not very "clean", but it does the job. I took the function that adds the legend from here.

import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.lines as mlines

def add_size_legend(ax,bins,leg1,legs,title):
    leg = ax.legend(
             handles=[
                 mlines.Line2D(
                     [],
                     [],
                     color="black",
                     lw=0,
                     marker="o",
                     markersize=np.sqrt(b),
                     label=legs[i],
                 )
                 for i, b in enumerate(bins)
             ],
             loc=4,
             title=title,
         )
    leg._legend_box.align = 'center'
    ax.add_artist(leg)
    # restore the original legend, if any (leg1 is None when the plot has none)
    try:
        ax.add_artist(leg1)
    except Exception:
        pass
    ax.set_axis_off()
    return ax


np.random.seed(1)
N = 100
categories = [1,2,3]
plot_data = dict(category=np.random.choice(categories, size=N), 
                 values=np.random.randn(N), 
                 marker_size=np.arange(N))

#--- Important: sort the rows by "values" so that their order matches
# seaborn, which also sorts the points this way ---
df = pd.DataFrame(plot_data).sort_values(by='values')

fig,ax = plt.subplots(1,1)
ax = sns.swarmplot(x="category", y="values", data=df)
for i in range(len(categories)):
    collection = ax.collections[i]
    sizes = df.loc[df.category==categories[i],'marker_size'].values
    collection.set_sizes(sizes)

#--- Size Legend ----
ax1 = fig.add_subplot(111)
leg1 = ax.get_legend()
bins = np.array([2, 10, 50,100])
legs = [str(x) for x in bins]
ax1 = add_size_legend(ax1,bins,leg1,legs,title='marker_size')

Figure
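
A note on why the legend helper uses markersize=np.sqrt(b): in matplotlib, the sizes of a scatter collection (what set_sizes changes) are marker areas in points squared, while the markersize of a Line2D is a length in points. The sketch below shows that relationship with matplotlib only, so it does not reproduce the swarm layout; the names points, areas, and handles are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import matplotlib.lines as mlines
import numpy as np

fig, ax = plt.subplots()
# A plain scatter stands in for one of the collections seaborn creates.
points = ax.scatter(np.arange(4), np.zeros(4), color="black")

# Collection sizes are marker areas in points**2, one entry per point.
areas = np.array([2, 10, 50, 100])
points.set_sizes(areas)

# Line2D markersize is a length in points, so take the square root
# of the area to get legend markers that match the plotted points.
handles = [
    mlines.Line2D([], [], color="black", lw=0, marker="o",
                  markersize=np.sqrt(a), label=str(a))
    for a in areas
]
legend = ax.legend(handles=handles, loc="lower right", title="marker_size")
```

With this convention, a legend entry labelled 100 is drawn with markersize 10, which matches a point whose size was set to 100.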