
I am trying to add labels to a swarm plot that is overlaid on a boxplot. To explain how the data is structured, think of it as if each data point were an individual movie, with the boxplots grouped by actor and the value on the Y-axis being the length of the movie.

I want to plot all the individual markers for the movies, which I have done with seaborn using a swarmplot, and then I want to add labels beside the markers noting which movies they are. If this is only possible using a different package then I am happy to not use seaborn, as long as I can still have the individual data markers.

Currently the graphs are created with the code below:

import seaborn as sns

sns.boxplot(x='Actor_name', y='Movie_length', data=df_clean, showfliers=True)
sns.swarmplot(x='Actor_name', y='Movie_length', data=df_clean, color=".25")

The dataframe can be imagined as having three columns: actor_name, movie_length, and movie_title. Each row holds one movie.

tareq albeesh
khazag169
  • Neither a boxplot nor a swarmplot keeps references to the original points the data comes from. Adding a label for each movie would quickly create an unreadable mess of overlapping labels. A library such as mplcursors could add labels when hovering over points, but it would take quite a lot of fiddling to make this work with a boxplot or swarmplot. – JohanC Aug 22 '23 at 09:44
  • I am not sure if you will get a notification from this, but you previously posted this solution, which is almost exactly what I am looking for: https://stackoverflow.com/questions/61734304/label-outliers-in-a-boxplot-python The only problem is that I can't quite adapt the code you wrote to my dataframe. – khazag169 Aug 22 '23 at 14:58
  • Well, the other solution only marks the outliers of the boxplot. If there were too many outliers, the plot would soon become unreadable. Anyway, you could just loop through the rows of your dataframe and use that information to place some text; the x-position would depend on the actor name plus some offset, the y-position on the movie length. – JohanC Aug 22 '23 at 16:10
  • In my actual data there isn't a huge number of markers, but labelling just the outliers and the whisker markers would be fine for my purpose. I'm not entirely sure how to do what you've suggested above. I've tried to place text using annotations, but I can't seem to get the code to pick up the actual movie title or to put the annotation in the right spot. – khazag169 Aug 22 '23 at 16:53
  • [How to annotate swarmplot points on a categorical axis and labels from a different column](https://stackoverflow.com/q/73584561/7758804) & [Annotate points in a stripplot](https://stackoverflow.com/q/44077661/7758804) – Trenton McKinney Aug 22 '23 at 18:27
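
The row-looping idea from the comments could be sketched roughly as below. This is only an illustration under stated assumptions, not a tested answer for the real data: the sample dataframe is made up, the title column is assumed to be called `Movie_title`, `label_points` is a hypothetical helper, and the outlier test assumes the default 1.5 × IQR whisker rule. It relies on each category being drawn at integer x-positions 0, 1, 2, ... in the order passed as `order`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up sample data, for illustration only (one movie per row)
df_clean = pd.DataFrame({
    "Actor_name": ["A"] * 5 + ["B"] * 5,
    "Movie_length": [95, 100, 105, 110, 190, 88, 92, 97, 101, 40],
    "Movie_title": [f"Movie {i}" for i in range(1, 11)],  # assumed column name
})


def label_points(ax, df, x_col, y_col, label_col, order,
                 outliers_only=False, x_offset=0.08):
    """Hypothetical helper: write label_col beside each point.

    Returns the list of labels that were drawn.
    """
    drawn = []
    for pos, name in enumerate(order):  # category i sits at x = i
        group = df[df[x_col] == name]
        if outliers_only:
            # Assumed rule: outside 1.5 * IQR from the quartiles
            q1, q3 = group[y_col].quantile([0.25, 0.75])
            iqr = q3 - q1
            is_outlier = ((group[y_col] < q1 - 1.5 * iqr)
                          | (group[y_col] > q3 + 1.5 * iqr))
            group = group[is_outlier]
        for _, row in group.iterrows():
            # x = category position plus a small offset, y = movie length
            ax.text(pos + x_offset, row[y_col], row[label_col],
                    va="center", fontsize=8)
            drawn.append(row[label_col])
    return drawn


# Fix the category order so text positions match the plotted boxes
order = list(df_clean["Actor_name"].unique())

fig, ax = plt.subplots()
sns.boxplot(x="Actor_name", y="Movie_length", data=df_clean,
            order=order, showfliers=True, ax=ax)
sns.swarmplot(x="Actor_name", y="Movie_length", data=df_clean,
              order=order, color=".25", ax=ax)

labels = label_points(ax, df_clean, "Actor_name", "Movie_length",
                      "Movie_title", order, outliers_only=True)
# Display with plt.show() or write to disk with fig.savefig(...)
```

With `outliers_only=False` every movie is labelled. Note that the swarm shifts markers sideways to avoid overlap, so in crowded groups the text is positioned relative to the category centre rather than to each marker, and `x_offset` may need tuning.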

0 Answers