
I am attempting to annotate my box plots using the guide written by Trenton McKinney (bottom of page) here.

The data:

3kmdx_50dz  3kmdx_100dz  1kmdx_50dz  1kmdx_100dz
1           0            1           0
4           0            4           0
0           0            16          0
0           0            28          0
0           0            28          0
8           0            36          0
8           0            68          0
8           0            68          0
20          0            192         0
24          0            124         0
16          0            232         0
40          0            392         0
24          0            472         0
52          0            440         0
40          0            436         0
80          0            572         0

I successfully adapted the code to a single figure, but I noticed that the proportions are such that the information for the left-most box plot gets completely obscured:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.cbook import boxplot_stats
%matplotlib inline

def get_df(file, sheetname):
    df = pd.read_excel(file, sheetname)
    df.drop('Unnamed: 0', axis=1, inplace=True)
    return df

# Read MS Excel file
xls = pd.ExcelFile('confidential file path - see table above')

# Create dataframes
sb_nw_ll = get_df(xls, 'SB_NW_LL')
sb_rkw_ll = get_df(xls, 'SB_RKW_LL')
sb_mc_ll = get_df(xls, 'SB_MC_LL')
mb_rkw_ll = get_df(xls, 'MB_RKW_LL')
mb_mc_ll = get_df(xls, 'MB_MC_LL')

fig, ax = plt.subplots(figsize=(10,10))

stats = boxplot_stats(sb_nw_ll)
stats = pd.DataFrame(stats, index=sb_nw_ll.columns).iloc[:, [4, 5, 7, 8, 9]]

sns.set_theme(palette='pastel')
box_plot = sns.boxplot(data=sb_nw_ll, ax=ax, palette=['m', 'g'])
sns.despine(offset=10, trim=True)

for xtick in box_plot.get_xticks():
    for col in stats.columns:
        box_plot.text(xtick, stats[col][xtick], stats[col][xtick],
                      horizontalalignment='left', size='medium', color='k', 
                      weight='semibold', bbox=dict(facecolor='lightgray'))

[image: box plot overlap]

So I figured I would put each box plot on a separate axis, so that each y-axis would have its own scale and the data would be readable. But when I attempted to adapt the prior solution to a single box plot on its own axis with this code:

# Create figure and axes
fig = plt.figure(figsize=(20, 10))
ax1 = plt.subplot(1, 4, 1)
ax2 = plt.subplot(1, 4, 2)
ax3 = plt.subplot(1, 4, 3)
ax4 = plt.subplot(1, 4, 4)

sns.set_theme(palette='pastel')

box_plot1 = sns.boxplot(data=sb_nw_ll['3kmdx_50dz'], ax=ax1, palette=['m'])
box_plot2 = sns.boxplot(data=sb_nw_ll['3kmdx_100dz'], ax=ax2, palette=['g'])
box_plot3 = sns.boxplot(data=sb_nw_ll['1kmdx_50dz'], ax=ax3, palette=['m'])
box_plot4 = sns.boxplot(data=sb_nw_ll['1kmdx_100dz'], ax=ax4, palette=['g'])

stats = sb_nw_ll['3kmdx_50dz']
stats1 = boxplot_stats(stats)
stats1 = pd.DataFrame(stats1, index=stats).iloc[:, [4, 5, 7, 8, 9]]

for xtick in box_plot1.get_xticks():
    for col in stats1.columns:
        box_plot.text(xtick, stats1[col][xtick], stats1[col][xtick],
                      horizontalalignment='left', size='medium', color='k', 
                      weight='semibold', bbox=dict(facecolor='lightgray'))

I received the following error:

[image: error message]

What exactly is going wrong here? I'm unsure of how to proceed.

Thank you for the assistance!

SAVen

1 Answer


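The error occurs because stats1[col][xtick] returns three values rather than one, so Python is not able to add the text. Because index=stats uses the data column itself as the index of stats1, and that column contains the value 0 three times, looking up the label 0 (the only xtick) matches three rows. After you create stats1, reset the index with stats1.reset_index(inplace=True), just before the start of the for loop. This gives the index unique values, so each lookup returns a single value and the text can be added. Also, on the line where you are seeing the error, change box_plot.text(xtick, ... to box_plot1.text(xtick, .... That is a wrong variable name rather than a syntax error: box_plot is the axes from your earlier single-figure plot, not the new subplot.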
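To see the duplicate-label behaviour in isolation, here is a minimal sketch. The frame below is hypothetical (it is not the stats1 from the question); it only assumes an index in which the label 0 appears three times, as in the 3kmdx_50dz column. It uses drop=True so the old index is discarded rather than added back as a column, which is a small variation on the reset_index call above.

```python
import pandas as pd

# Hypothetical frame: the index repeats the label 0 three times,
# mimicking what happens when a data column is used as the index.
frame = pd.DataFrame({"med": [12.0, 12.0, 12.0, 12.0]}, index=[0, 0, 0, 1])

# Label lookup on a duplicated label returns every matching row,
# so this is a Series of three values, not a scalar.
dup = frame["med"][0]

# After resetting, the index is 0, 1, 2, 3 and each label is unique.
frame_reset = frame.reset_index(drop=True)
single = frame_reset["med"][0]  # now a single value
```

A Series cannot be used as the y position or the label of a single text annotation, which is why the lookup has to return one value.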

After these changes, this is the plot I got; I hope it is what you were expecting as well.

[image: resulting plot]
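For reference, here is a self-contained sketch of one way to annotate each box on its own axis. It is a variant, not the exact code from the question: instead of building a DataFrame and resetting its index, it reads the statistics by key from the dictionary that boxplot_stats returns, which sidesteps the duplicate-index problem altogether. The names annotate_box, STAT_KEYS and stats_by_col are made up for illustration, only two of the four columns are used, and the Agg backend is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from matplotlib.cbook import boxplot_stats

# Two columns copied from the table in the question
df = pd.DataFrame({
    "3kmdx_50dz": [1, 4, 0, 0, 0, 8, 8, 8, 20, 24, 16, 40, 24, 52, 40, 80],
    "1kmdx_50dz": [1, 4, 16, 28, 28, 36, 68, 68,
                   192, 124, 232, 392, 472, 440, 436, 572],
})

# Statistics to label: whisker ends, quartiles and median
STAT_KEYS = ["whishi", "whislo", "q1", "med", "q3"]


def annotate_box(ax, values, color):
    """Draw one box plot on ax and label its whiskers, quartiles and median."""
    sns.boxplot(y=values, ax=ax, color=color)
    # boxplot_stats returns a list with one dict per data set
    stats = boxplot_stats(values.to_numpy())[0]
    for key in STAT_KEYS:
        value = float(stats[key])
        # A single box passed via y= is drawn at x position 0
        ax.text(0, value, f"{value:g}",
                horizontalalignment="left", size="medium", color="k",
                weight="semibold", bbox=dict(facecolor="lightgray"))
    return stats


# One subplot per column, so each y-axis gets its own scale
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
stats_by_col = {
    col: annotate_box(ax, df[col], color)
    for ax, col, color in zip(axes, df.columns, ["m", "g"])
}
fig.tight_layout()
```

Each call writes to the axes it was given, so there is no chance of annotating a stale axes object left over from an earlier figure.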

Redox