
I'm trying to combine two Seaborn violin plots into a single one and display relations between three different features. I'm working on the tips dataset:

     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

For this dataset, I'd like to compare total_bill across days of the week by both the sex and smoker columns, using the split option. The two graphs I'd like to combine are produced by the code below:

import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")
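# First graph: one violin per smoker status, side by side
ax = sns.violinplot(x="day", y="total_bill", hue="smoker", data=tips, palette="muted", split=False)

# Second graph: drawn on a new figure so it does not overlay the first
plt.figure()
ax = sns.violinplot(x="day", y="total_bill", hue="sex", data=tips, palette="muted", split=True)
plt.show()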

Is it possible to create a single graph where different violins represent the total_bill distribution for smokers and non-smokers (as in the first graph), but each violin is also split to show the difference between men and women? I'd still like to have 8 non-overlapping violins (2 per day: smokers and non-smokers), each further split between male and female.

I've found this thread, but the answer creates a separate violin for each combination which is not my goal.

harnen
  • The easiest and clearest option `sns.catplot(kind='violin', data=tips, x='day', y='total_bill', hue='smoker', row='sex')` – Trenton McKinney Jun 16 '22 at 18:13
  • or with `split=True`: `sns.catplot(kind='violin', data=tips, x='day', y='total_bill', hue='smoker', row='sex', split=True)` – Trenton McKinney Jun 16 '22 at 18:17
  • Right, but it still produces two graphs (even if they're concatenated into a single file). – harnen Jun 16 '22 at 19:12
  • See this [answer](https://stackoverflow.com/a/72651234/7758804) – Trenton McKinney Jun 16 '22 at 19:54
  • @TrentonMcKinney that's exactly what I want! But is it possible to use different colors for males and females? – harnen Jun 30 '22 at 15:38
  • Add `sns.violinplot(..., palette=['red', 'blue'])` with colors of your choice for the different hue levels, but there is no reason to have separate colors for Female and Male, because that information is already encoded on the x-axis. If you wanted to change the color of every other violin plot, you would have to iterate through them and change the facecolor. – Trenton McKinney Jun 30 '22 at 15:53
  • It's already encoded, but it's important to me to make it clear using the colors as well. Thanks a lot for the discussion, that's very useful. I'll post back as soon as I make it work. – harnen Jul 01 '22 at 08:29

1 Answer

I believe this is what you are looking for:

import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Load the dataset
tips = sns.load_dataset("tips")

# Configure the coloring
colors = {"Male": {"Yes": "orange", "No": "blue"}, "Female": {"Yes": "red", "No": "green"}}

# create figure and axes
fig, ax = plt.subplots()

# draw violins for each sex
sex_types = sorted(set(tips.sex))  # sorted so the drawing and legend order is stable between runs
for sex in sex_types:
    sns.violinplot(
        x="day", 
        y="total_bill", 
        hue="smoker",
        data=tips[tips.sex == sex],
        palette=colors[sex],
        split=True,
        ax=ax,
        scale="count",      # renamed to density_norm in seaborn 0.13+
        scale_hue=False,    # renamed to common_norm in seaborn 0.13+
        saturation=0.75,
        inner=None
    )

# Set transparency for all violins
for violin in ax.collections:
    violin.set_alpha(0.25)

# Compose a custom legend
custom_lines = [
    Line2D([0], [0], color=colors[sex][smoker], lw=4, alpha=0.25) 
    for smoker in ["Yes", "No"] 
    for sex in sex_types
]
ax.legend(
    custom_lines, 
    [f"{sex} : {smoker}" for smoker in ["Yes", "No"] for sex in sex_types], 
    title="Gender : Smoker"
)

Smaurya
  • Thanks, almost! I'd still like to have 8 non-overlapping violins (2 per day - smokers and non smokers), but each will be further split between male and female. – harnen Jun 16 '22 at 14:43
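
One possible way to get the layout described in that comment with a single `violinplot` call, sketched under a few assumptions: fold `day` and `smoker` into one categorical column for the x-axis, then let `hue="sex"` with `split=True` halve each violin. The `day_smoker` column, the function name, and the synthetic stand-in frame are made up for illustration. The stand-in exists only so the snippet runs offline; swap in `sns.load_dataset("tips")` for the real data. This is not necessarily what the answer linked in the comments does.

```python
from itertools import product

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

DAY_ORDER = ["Thur", "Fri", "Sat", "Sun"]
SMOKER_ORDER = ["Yes", "No"]


def split_violins_by_day_and_smoker(df, ax=None):
    """Draw one violin per (day, smoker) pair, each split into Male/Female halves."""
    data = df.copy()
    # Hypothetical helper column, e.g. "Thur / Yes"
    data["day_smoker"] = data["day"].astype(str) + " / " + data["smoker"].astype(str)
    # Fix the x-axis order so the two violins of each day sit next to each other
    order = [f"{day} / {smoker}" for day, smoker in product(DAY_ORDER, SMOKER_ORDER)]

    if ax is None:
        _, ax = plt.subplots(figsize=(10, 4))
    sns.violinplot(
        data=data, x="day_smoker", y="total_bill", hue="sex",
        order=order, hue_order=["Male", "Female"],
        split=True, inner=None, palette="muted", ax=ax,
    )
    ax.set_xlabel("day / smoker")
    return ax


# Synthetic stand-in with the same column names as tips (values are random,
# not real data); replace with sns.load_dataset("tips") when online.
rng = np.random.default_rng(0)
rows = [
    (day, smoker, sex)
    for day, smoker, sex in product(DAY_ORDER, SMOKER_ORDER, ["Male", "Female"])
    for _ in range(15)
]
demo = pd.DataFrame(rows, columns=["day", "smoker", "sex"])
demo["total_bill"] = rng.gamma(shape=4.0, scale=5.0, size=len(demo))

ax = split_violins_by_day_and_smoker(demo)
```

Calling `plt.show()` afterwards displays the figure. Here color distinguishes sex while x-position distinguishes day and smoker status, so there are 8 violins, none overlapping. With real data, groups that contain only a few rows will give rough density estimates.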