
I am using seaborn.catplot with kind='point' to plot my data. I would like to calculate the standard error of the mean (SEM) for each hue variable and each category using the same method as seaborn, so that my computed values exactly match the plotted error bars. By default, seaborn computes the SEM and the 95% confidence intervals (CIs) with a bootstrapping algorithm, where the mean is bootstrapped 1000 times. In an earlier post, I saw an approach that might offer functions for this (using seaborn source code functions like seaborn.utils.ci() and seaborn.algorithms.bootstrap()), but I am not sure how to implement it. Since bootstrapping uses random sampling, it would also be necessary to ensure that the same array of 1000 means is produced both for plotting and for obtaining the SEM.

Here is a code example:

import numpy as np
import pandas as pd
import seaborn as sns

# simulate data
rng = np.random.RandomState(42)
measure_names = np.tile(np.repeat(['Train BAC','Test BAC'],10),2)
model_numbers = np.repeat([0,1],20)
measure_values = np.concatenate((rng.uniform(low=0.6,high=1,size=20),
                                rng.uniform(low=0.5,high=0.8,size=20)
                                ))
folds=np.tile([1,2,3,4,5,6,7,8,9,10],4)

plot_df = pd.DataFrame({'model_number':model_numbers,
                        'measure_name':measure_names,
                        'measure_value':measure_values,
                        'outer_fold':folds})

# plot data as pointplot
g = sns.catplot(x='model_number',
                y='measure_value',
                hue='measure_name',
                kind='point',
                seed=rng,
                data=plot_df)

which produces:

[figure: pointplot of measure_value by model_number, one line per measure_name, with 95% CI error bars]

I would like to obtain the SEM for all train and test scores for both models. That is:

# obtain SEM for each score in each model using the same method as in sns.catplot
model_0_train_bac = plot_df.loc[((plot_df['model_number'] == 0) & (plot_df['measure_name'] == 'Train BAC')),'measure_value']
model_0_test_bac = plot_df.loc[((plot_df['model_number'] == 0) & (plot_df['measure_name'] == 'Test BAC')),'measure_value']
model_1_train_bac = plot_df.loc[((plot_df['model_number'] == 1) & (plot_df['measure_name'] == 'Train BAC')),'measure_value']
model_1_test_bac = plot_df.loc[((plot_df['model_number'] == 1) & (plot_df['measure_name'] == 'Test BAC')),'measure_value']
Johannes Wiesner

1 Answer


I'm not sure I understand the requirement that the exact same samples be taken. By definition, bootstrapping works by drawing random samples, so there will be a bit of variability from one run to the next (unless a fixed seed is used).
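For what it's worth, here is a minimal sketch (using seaborn's internal algorithms.bootstrap helper, which is not part of the public API and may change between versions) showing that passing the same fixed integer seed makes the resampling fully reproducible:

```python
import numpy as np
import seaborn as sns

data = np.random.RandomState(0).uniform(size=10)

# Same integer seed -> identical resamples -> identical bootstrap distributions
boot1 = sns.algorithms.bootstrap(data, func=np.mean, n_boot=1000, seed=42)
boot2 = sns.algorithms.bootstrap(data, func=np.mean, n_boot=1000, seed=42)
print(np.array_equal(boot1, boot2))  # True
```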

You could calculate the CI the same way seaborn does like so:

import numpy as np
import pandas as pd
import seaborn as sns

# simulate data
rng = np.random.RandomState(42)
measure_names = np.tile(np.repeat(['Train BAC','Test BAC'],10),2)
model_numbers = np.repeat([0,1],20)
measure_values = np.concatenate((rng.uniform(low=0.6,high=1,size=20),
                                rng.uniform(low=0.5,high=0.8,size=20)
                                ))
folds=np.tile([1,2,3,4,5,6,7,8,9,10],4)

plot_df = pd.DataFrame({'model_number':model_numbers,
                        'measure_name':measure_names,
                        'measure_value':measure_values,
                        'outer_fold':folds})

x_col = 'model_number'
y_col = 'measure_value'
hue_col = 'measure_name'
ci = 95
est = np.mean
n_boot = 1000

for gr, temp_df in plot_df.groupby([hue_col, x_col]):
    boot = sns.algorithms.bootstrap(temp_df[y_col],
                                    func=est,
                                    n_boot=n_boot,
                                    units=None,
                                    seed=rng)
    print(gr, est(temp_df[y_col]), sns.utils.ci(boot, ci))

which outputs:

('Test BAC', 0) 0.7581071363371585 [0.69217109 0.8316217 ]
('Test BAC', 1) 0.6527812067134964 [0.59523784 0.71539669]
('Train BAC', 0) 0.8080546943810699 [0.73214414 0.88102816]
('Train BAC', 1) 0.6201161718490218 [0.57978654 0.66241543] 

Note that if you run the loop a second time, you'd get CIs that are similar, but not exactly the same (the state of the rng passed as seed has advanced in the meantime).
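If you also want an SEM consistent with these bootstrapped means, one common convention (my assumption here, since seaborn itself only draws CIs and does not report an SEM) is the standard deviation of the bootstrap distribution; the percentile interval is the same quantity seaborn's ci() computes:

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.RandomState(42)
scores = pd.Series(rng.uniform(low=0.6, high=1.0, size=10))

# distribution of 1000 bootstrapped means
boot = sns.algorithms.bootstrap(scores, func=np.mean, n_boot=1000, seed=rng)

sem = boot.std()                       # SEM estimate: spread of the bootstrapped means
ci = np.percentile(boot, [2.5, 97.5])  # the 95% percentile interval seaborn draws
print(sem, ci)
```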

If you really want the exact values seaborn used in the plot (note that, again, those values will differ slightly if you plot the same data a second time), you can extract them directly from the Line2D artists used to draw the error bars:

g = sns.catplot(x=x_col,
                y=y_col,
                hue=hue_col,
                kind='point',
                ci=ci,
                estimator=est,
                n_boot=n_boot,
                seed=rng,
                data=plot_df)
for l in g.ax.lines:
    print(l.get_data())

output:

(array([0., 1.]), array([0.80805469, 0.62011617]))
(array([0., 0.]), array([0.73203808, 0.88129836])) # <<<<
(array([1., 1.]), array([0.57828366, 0.66300033])) # <<<<
(array([0., 1.]), array([0.75810714, 0.65278121]))
(array([0., 0.]), array([0.69124145, 0.83297914])) # <<<<
(array([1., 1.]), array([0.59113739, 0.71572469])) # <<<<
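The marked lines are the vertical error-bar segments: they are the Line2D artists whose two x-coordinates coincide, which gives a simple way to filter them out programmatically. A self-contained sketch (the filtering condition is my own heuristic, not a seaborn API):

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import seaborn as sns

rng = np.random.RandomState(42)
plot_df = pd.DataFrame({
    'model_number': np.repeat([0, 1], 20),
    'measure_name': np.tile(np.repeat(['Train BAC', 'Test BAC'], 10), 2),
    'measure_value': np.concatenate((rng.uniform(0.6, 1.0, 20),
                                     rng.uniform(0.5, 0.8, 20))),
})

g = sns.catplot(x='model_number', y='measure_value', hue='measure_name',
                kind='point', data=plot_df)

# Keep only the segments whose two x-values are equal: those are the
# vertical error bars, one (x, lo, hi) triple per category/hue combination.
errbars = []
for line in g.ax.lines:
    x, y = line.get_data()
    if len(x) == 2 and x[0] == x[1]:
        errbars.append((float(x[0]), float(min(y)), float(max(y))))
print(errbars)
```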
Diziet Asahi