I am attempting to annotate my box plots using the guide written by Trenton McKinney (bottom of page) here.
The data:
3kmdx_50dz | 3kmdx_100dz | 1kmdx_50dz | 1kmdx_100dz |
---|---|---|---|
1 | 0 | 1 | 0 |
4 | 0 | 4 | 0 |
0 | 0 | 16 | 0 |
0 | 0 | 28 | 0 |
0 | 0 | 28 | 0 |
8 | 0 | 36 | 0 |
8 | 0 | 68 | 0 |
8 | 0 | 68 | 0 |
20 | 0 | 192 | 0 |
24 | 0 | 124 | 0 |
16 | 0 | 232 | 0 |
40 | 0 | 392 | 0 |
24 | 0 | 472 | 0 |
52 | 0 | 440 | 0 |
40 | 0 | 436 | 0 |
80 | 0 | 572 | 0 |
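To reproduce this without the Excel file, the table above can be rebuilt directly as a DataFrame. This is only a minimal sketch: it assumes the table corresponds to the `SB_NW_LL` sheet (the name `sb_nw_ll` is taken from the code further down), and the values are copied from the table as shown.

```python
import pandas as pd

# Values copied from the table above; assumed to be the SB_NW_LL sheet
sb_nw_ll = pd.DataFrame({
    "3kmdx_50dz": [1, 4, 0, 0, 0, 8, 8, 8, 20, 24, 16, 40, 24, 52, 40, 80],
    "3kmdx_100dz": [0] * 16,
    "1kmdx_50dz": [1, 4, 16, 28, 28, 36, 68, 68,
                   192, 124, 232, 392, 472, 440, 436, 572],
    "1kmdx_100dz": [0] * 16,
})

# Quick look at the scale difference between the two non-zero columns
print(sb_nw_ll.describe())
```

The two `*_100dz` columns are all zeros, so their boxes collapse to a line; the two `*_50dz` columns differ in scale by roughly an order of magnitude.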
I successfully adapted the code to a single figure, but I noticed that the proportions are such that the information for the left-most box plot gets completely obscured:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.cbook import boxplot_stats
%matplotlib inline
def get_df(file, sheetname):
    df = pd.read_excel(file, sheetname)
    df.drop('Unnamed: 0', axis=1, inplace=True)
    return df
# Read MS Excel file
xls = pd.ExcelFile('confidential file path - see table above')
# Create dataframes
sb_nw_ll = get_df(xls, 'SB_NW_LL')
sb_rkw_ll = get_df(xls, 'SB_RKW_LL')
sb_mc_ll = get_df(xls, 'SB_MC_LL')
mb_rkw_ll = get_df(xls, 'MB_RKW_LL')
mb_mc_ll = get_df(xls, 'MB_MC_LL')
fig, ax = plt.subplots(figsize=(10,10))
stats = boxplot_stats(sb_nw_ll)
stats = pd.DataFrame(stats, index=sb_nw_ll.columns).iloc[:, [4, 5, 7, 8, 9]]
sns.set_theme(palette='pastel')
box_plot = sns.boxplot(data=sb_nw_ll, ax=ax, palette=['m', 'g'])
sns.despine(offset=10, trim=True)
for xtick in box_plot.get_xticks():
    for col in stats.columns:
        box_plot.text(xtick, stats[col][xtick], stats[col][xtick],
                      horizontalalignment='left', size='medium', color='k',
                      weight='semibold', bbox=dict(facecolor='lightgray'))
So I figured I would put each box plot on a separate axis, so that each y-axis would have its own scale and the data would stay readable. But when I attempted to adapt the prior solution to a single box plot on its own axis with this code:
# Create figure and axes
fig = plt.figure(figsize=(20, 10))
ax1 = plt.subplot(1, 4, 1)
ax2 = plt.subplot(1, 4, 2)
ax3 = plt.subplot(1, 4, 3)
ax4 = plt.subplot(1, 4, 4)
sns.set_theme(palette='pastel')
box_plot1 = sns.boxplot(data=sb_nw_ll['3kmdx_50dz'], ax=ax1, palette=['m'])
box_plot2 = sns.boxplot(data=sb_nw_ll['3kmdx_100dz'], ax=ax2, palette=['g'])
box_plot3 = sns.boxplot(data=sb_nw_ll['1kmdx_50dz'], ax=ax3, palette=['m'])
box_plot4 = sns.boxplot(data=sb_nw_ll['1kmdx_100dz'], ax=ax4, palette=['g'])
stats = sb_nw_ll['3kmdx_50dz']
stats1 = boxplot_stats(stats)
stats1 = pd.DataFrame(stats1, index=stats).iloc[:, [4, 5, 7, 8, 9]]
for xtick in box_plot1.get_xticks():
    for col in stats1.columns:
        box_plot.text(xtick, stats1[col][xtick], stats1[col][xtick],
                      horizontalalignment='left', size='medium', color='k',
                      weight='semibold', bbox=dict(facecolor='lightgray'))
I received the following error:
What exactly is going wrong here? I'm unsure of how to proceed.
Thank you for the assistance!
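Two things in the last snippet look suspicious, although the traceback is not shown, so this is a guess rather than a diagnosis. First, `boxplot_stats` returns a one-element list for a single column, while `index=stats` passes the 16 data values as the index. Second, the loop calls `box_plot.text` (the axes of the first figure) instead of `box_plot1.text`. The sketch below shows one possible way to annotate one box plot per axis. It is not the guide's code: it swaps seaborn for plain Matplotlib `Axes.boxplot` to keep the example small, looks the statistics up by key instead of by column position, and the names `annotated_boxplot` and `STAT_KEYS` are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.cbook import boxplot_stats

# Two columns copied from the table in the question
df = pd.DataFrame({
    "3kmdx_50dz": [1, 4, 0, 0, 0, 8, 8, 8, 20, 24, 16, 40, 24, 52, 40, 80],
    "1kmdx_50dz": [1, 4, 16, 28, 28, 36, 68, 68,
                   192, 124, 232, 392, 472, 440, 436, 572],
})

# Hypothetical constant: the five statistics to label on each box
STAT_KEYS = ["whislo", "q1", "med", "q3", "whishi"]


def annotated_boxplot(series, ax):
    """Draw one box plot on `ax` and label its summary statistics."""
    values = series.to_numpy()
    # boxplot_stats returns a list with one dict per dataset; a single
    # 1-D column gives a one-element list, so take element 0
    stats = boxplot_stats(values)[0]
    # Put the box at x = 0 so the labels can be placed at the same x
    ax.boxplot(values, positions=[0])
    for key in STAT_KEYS:
        # Write to the axes that was passed in, not to a global axes
        ax.text(0, stats[key], f"{stats[key]:g}",
                horizontalalignment="left", size="medium", color="k",
                weight="semibold", bbox=dict(facecolor="lightgray"))
    ax.set_title(series.name)
    return stats


# One subplot per column, each with an independent y-axis scale
fig, axes = plt.subplots(1, len(df.columns), figsize=(10, 5))
all_stats = {col: annotated_boxplot(df[col], ax)
             for col, ax in zip(df.columns, axes)}
fig.savefig("boxplots.png")
```

Because each column gets its own axes, the small-valued `3kmdx_50dz` box is no longer squashed by the larger `1kmdx_50dz` values. Replacing `ax.boxplot(...)` with `sns.boxplot(y=series, ax=ax)` should work the same way, since seaborn also places a single box at x = 0, but that swap is untested here.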