I have data that look like:

    engages_telehealth  knockout_tox    pcc  variable    value
0                 True          True   True    health   135.50
1                 True          True   True  admitted  3443.25
2                 True          True  False    health   136.50
3                 True          True  False  admitted  3444.45
4                 True         False   True    health   115.50
5                 True         False   True  admitted  3640.80
6                 True         False  False    health   117.75
7                 True         False  False  admitted  3615.60
8                False          True   True    health   137.00
9                False          True   True  admitted  3314.90
10               False          True  False    health   136.00
11               False          True  False  admitted  3320.40
12               False         False   True    health   115.00
13               False         False   True  admitted  3334.25
14               False         False  False    health   115.00
15               False         False  False  admitted  3363.25

I want to make a hierarchically clustered boxplot similar to the picture below, which is a fairly standard way of describing multiple conditions in biology.

[image]

Other SO questions on hierarchical plots (this, or this) may have three layers, but those are clusters, not independent conditions, and my conditions are boolean, not numeric.

I tried with catplot:

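import matplotlib.pyplot as plt
import seaborn as sns

print(df_graph)

fig = plt.figure()
ax = fig.add_subplot(111)

# Attempt: facet on two condition columns at once (ck1 and ck2 are presumably column names defined elsewhere).
# This does not work, because col expects a single column name.
sns.catplot(data=df_graph, x='variable', y='value', col=[ck1, ck2], kind='bar', ax=ax)
plt.show()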

But the col parameter accepts only a string, and I don't want to make a FacetGrid because I don't have space for it in the document I am preparing.

mac389

1 Answer


This should get you pretty close. You won't see any error bars because you have only a single record per category. With more than one record per category, they will show up.

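import seaborn as sns

sns.set(rc={'figure.figsize':(16,9)})
sns.set_context('talk')

# Show each boolean condition as '+' or '-'
df = df.replace({True:'+', False:'-'})
# Combine the three conditions into one multi-line x-axis label per group
df['cat'] = 'TELE' + df['engages_telehealth']+'\nTOX '+df['knockout_tox']+'\nPCC '+df['pcc']

# One cluster of bars per condition combination, one bar per variable
sns.barplot(data=df, x='cat', y='value', hue='variable')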

Output: [image]
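If you want to run the idea end to end, here is a minimal self-contained sketch. It rebuilds the table from the question and draws the same kind of plot on an explicit Axes. The way the DataFrame is constructed, the `labels` mapping, and the use of `Series.map` to build the '+'/'-' strings are my own assumptions for illustration, not something taken from the question's actual code, so adapt them to your data.

```python
import itertools

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Values copied from the table in the question, in row order.
values = [135.50, 3443.25, 136.50, 3444.45, 115.50, 3640.80, 117.75, 3615.60,
          137.00, 3314.90, 136.00, 3320.40, 115.00, 3334.25, 115.00, 3363.25]

# Each (tele, tox, pcc) combination has one 'health' row and one 'admitted' row.
rows = []
combos = itertools.product([True, False], repeat=3)
for (tele, tox, pcc), i in zip(combos, range(0, 16, 2)):
    rows.append((tele, tox, pcc, "health", values[i]))
    rows.append((tele, tox, pcc, "admitted", values[i + 1]))
df = pd.DataFrame(
    rows,
    columns=["engages_telehealth", "knockout_tox", "pcc", "variable", "value"],
)

# Hypothetical short names for the x-axis labels (assumption, pick your own).
labels = {"engages_telehealth": "TELE", "knockout_tox": "TOX", "pcc": "PCC"}

# Turn each boolean column into 'NAME +' / 'NAME -', then stack them with newlines.
parts = [short + " " + df[col].map({True: "+", False: "-"})
         for col, short in labels.items()]
df["cat"] = parts[0].str.cat(parts[1:], sep="\n")

# barplot is an axes-level function, so it draws on the Axes you pass in.
fig, ax = plt.subplots(figsize=(16, 9))
sns.barplot(data=df, x="cat", y="value", hue="variable", ax=ax)
ax.set_xlabel("")
```

Because `sns.barplot` accepts `ax=`, the plot can be placed on any existing Axes (for example one panel of a larger figure) without creating a FacetGrid. Call `fig.savefig(...)` or `plt.show()` afterwards as needed.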

Chris