I have a dataset with precomputed means and standard deviations. The values depend on three categorical variables. I would like to create two barplots, one for each level of the first categorical variable. The other two categorical variables should be separated on the x-axis and by color.
In seaborn terms, I want to create barplots with seaborn.catplot based on a categorical x with a custom order, as well as the hue and col arguments, while being able to add my own custom standard deviations.
The following code plots the means as bars in a straightforward way:
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
tip_sumstats = (tips.groupby(["day", "sex", "smoker"])
                .total_bill
                .agg(["mean", "sem"])
                .reset_index())
sns.catplot(
    data=tip_sumstats,
    x="day",
    order=["Sun", "Thur", "Fri", "Sat"],
    y="mean",
    hue="smoker",
    col="sex",
    kind="bar",
    height=4,
)
This answer solves the problem when hue and order are not involved. For the case above, however, I tried the following:
def errplot(x, y, yerr, **kwargs):
    ax = plt.gca()
    data = kwargs.pop("data")
    data.plot(x=x, y=y, yerr=yerr, kind="bar", ax=ax, **kwargs)
g = sns.FacetGrid(tip_sumstats, col="sex", hue="smoker", height=4)
g.map_dataframe(errplot, "day", "mean", "sem")
I do not understand how to modify this version such that it respects the categorical order on the x-axis defined by some order argument. Furthermore, I do not understand how to add a dodge=True to it such that the differently colored bars appear next to each other.
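To make the goal concrete, here is a rough sketch of the layout I am after, written directly against matplotlib's Axes.bar with manually computed dodge offsets instead of seaborn. The helper name dodged_errorbars, its parameters, and the placeholder numbers in demo are made up for illustration and are not part of seaborn or of the tips data. It assumes exactly one row per combination of x, hue, and col.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def dodged_errorbars(data, x, y, yerr, hue, col, order, hue_order, width=0.8):
    """Hypothetical helper: one subplot per `col` level, `hue` levels dodged."""
    col_levels = list(pd.unique(data[col]))
    fig, axes = plt.subplots(1, len(col_levels), sharey=True, squeeze=False,
                             figsize=(4 * len(col_levels), 4))
    positions = np.arange(len(order))      # one slot per x category
    bar_width = width / len(hue_order)     # split each slot between hue levels
    for ax, col_level in zip(axes[0], col_levels):
        subset = data[data[col] == col_level]
        for i, hue_level in enumerate(hue_order):
            # reindex puts the rows into the requested categorical order
            rows = (subset[subset[hue] == hue_level]
                    .set_index(x)
                    .reindex(order))
            # shift each hue level so the group stays centered on the tick
            offset = (i - (len(hue_order) - 1) / 2) * bar_width
            ax.bar(positions + offset,
                   rows[y].to_numpy(dtype=float),
                   bar_width,
                   yerr=rows[yerr].to_numpy(dtype=float),
                   capsize=3,
                   label=str(hue_level))
        ax.set_xticks(positions)
        ax.set_xticklabels(order)
        ax.set_xlabel(x)
        ax.set_title(f"{col} = {col_level}")
    axes[0, 0].set_ylabel(y)
    axes[0, -1].legend(title=hue)
    return fig, axes


# Placeholder summary table with the same columns as tip_sumstats;
# the numbers are invented purely to exercise the layout.
order = ["Sun", "Thur", "Fri", "Sat"]
combos = [(d, s, k) for d in order for s in ["Male", "Female"] for k in ["Yes", "No"]]
demo = pd.DataFrame(combos, columns=["day", "sex", "smoker"])
demo["mean"] = np.linspace(15.0, 25.0, len(demo))
demo["sem"] = 1.0

fig, axes = dodged_errorbars(demo, x="day", y="mean", yerr="sem",
                             hue="smoker", col="sex",
                             order=order, hue_order=["Yes", "No"])
```

Passing tip_sumstats in place of demo should give the two panels I have in mind, but I would much prefer a way to get the same result through catplot or FacetGrid.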
This question tries to solve something similar. However, the approach is very technical and not straightforward at all. To me, it seems weird that no straightforward solution exists.