Hierarchical plotting of pandas dataframe showing boxplot visualization

Question

I have a multivariate dataset that looks like this:

My goal is to generate a boxplot to visualize the distribution of values in Treat1, Treat2, Treat3 and Treat4.

I can produce a bar chart that shows exactly what I want, based on How to add group labels for bar charts in matplotlib?

However, I need a box plot so that I can look at the distribution, from the mean to the outliers, for each Treat group. Here again is the code that generates the bar chart; it is based on the Stack Overflow code by https://stackoverflow.com/users/2846871/varicus

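import matplotlib.pyplot as plt

# Sum the Treat columns for every (Group, Category, Day) combination.
# Note that the 'Day ' column name has a trailing space.
df = df2.groupby(['Group','Category','Day ']).sum()
fig = plt.figure(figsize=(20,8))
ax = fig.add_subplot(111)
df.plot(kind='bar',stacked=False,ax=ax)
# Blank out the default tick labels; the grouped labels are drawn below instead.
labels = ['' for item in ax.get_xticklabels()]
ax.set_xticklabels(labels)
ax.set_xlabel('')
# label_group_bar_table is the helper defined in the linked answer.
label_group_bar_table(ax, df)
fig.subplots_adjust(bottom=.1*df.index.nlevels)
plt.show()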

What would be the best way to generate a boxplot in place of each bar?

Did you see [this question](https://stackoverflow.com/questions/16592222/matplotlib-group-boxplots)? — ImportanceOfBeingErnest, Feb 12 '19 at 21:00

score 0 · Answer 1 · answered Feb 12 '19 at 21:17

Maybe this is what you are looking for.

import numpy as np
import pandas as pd

# I created a dataframe similar to yours so that others can offer other solutions.

group = [1,1,1,1,2,2,4,4, 2, 2, 3, 2, 2, 4, 4, 5, 5, 5, 5, 5, 5, 1, 1, 1, 1, 1,1]
category = [0,0,1,1,0,0,0,0,1,1,1,1,2,2,2,2,3,3,0,0,1,1,0,0,2,2,2]
Treat1 = np.random.randint(0, 100, size=len(group))
Treat2 = np.random.randint(0, 100, size=len(group))
Treat3 = np.random.randint(0, 100, size=len(group))
Treat4 = np.random.randint(0, 100, size=len(group))
df = pd.DataFrame({'Group': group, 'Category': category, 'Treat1': Treat1, 'Treat2': Treat2, 'Treat3': Treat3, 'Treat4': Treat4})

# One box per (Group, Category) pair, in a separate subplot for each Treat column.
f = df.boxplot(by=['Group', 'Category'], figsize=(12, 8))

will result in

This graph doesn't take into account column A, which is 'Day', shown side by side, so that I can compare how Group values were on Day 0 compared to Day 20 — Pearl, Feb 12 '19 at 22:50
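
One way to get the side-by-side comparison asked for in this comment is to place the boxes manually with the `positions` argument of matplotlib's `ax.boxplot`, along the lines of the grouped-boxplot question linked earlier. This is only a minimal sketch under assumptions: `df_demo`, its values, the two days (0 and 20) and the single `Treat1` column are hypothetical stand-ins, not the asker's data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 3 groups x 2 days, 20 values per combination.
rng = np.random.default_rng(0)
df_demo = pd.DataFrame({
    "Group": np.repeat([1, 2, 3], 40),
    "Day ": np.tile([0, 20], 60),  # trailing space kept, as in the question
    "Treat1": rng.integers(0, 100, size=120),
})

groups = sorted(df_demo["Group"].unique())
days = sorted(df_demo["Day "].unique())
width = 0.8 / len(days)                 # horizontal room for each day's box
day_colors = ["lightblue", "lightgreen"]

fig, ax = plt.subplots(figsize=(8, 4))
all_boxes, legend_handles = [], []
for j, day in enumerate(days):
    # One array of Treat1 values per group, restricted to this day.
    data = [
        df_demo.loc[(df_demo["Group"] == g) & (df_demo["Day "] == day), "Treat1"].to_numpy()
        for g in groups
    ]
    # Shift this day's boxes left or right of the group's centre position.
    offset = (j - (len(days) - 1) / 2) * width
    positions = [i + offset for i in range(len(groups))]
    bp_day = ax.boxplot(data, positions=positions, widths=width * 0.9, patch_artist=True)
    for patch in bp_day["boxes"]:
        patch.set_facecolor(day_colors[j % len(day_colors)])
    all_boxes.extend(bp_day["boxes"])
    legend_handles.append(bp_day["boxes"][0])

# One tick per group, centred between that group's day boxes.
ax.set_xticks(range(len(groups)))
ax.set_xticklabels([f"Group {g}" for g in groups])
ax.set_ylabel("Treat1")
ax.legend(legend_handles, [f"Day {d}" for d in days])
fig.savefig("grouped_boxplot_by_day.png")
```

Each group then shows one box per day next to the other, with the day encoded by face color. Repeating the loop on separate axes would cover the remaining Treat columns.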

score 0 · Answer 2 · answered Feb 13 '19 at 04:12

I added part of the code, which does the job; however, I wanted each category to have its own face color. I can't distinguish between categories in the graph.

fig, ax = plt.subplots(figsize=(20, 8))

# Note: showfliers=False is more readable, but requires a recent version iirc
bp = df_Dummy.boxplot(by=['Group', 'Category', 'Day '], ax=ax,
                      sym='', rot=90, return_type='dict', patch_artist=False)


# Thicken the lines of every boxplot element in every subplot.
for key in bp.keys():
    for part in ('boxes', 'fliers', 'medians', 'means', 'whiskers', 'caps'):
        for item in bp[key][part]:
            item.set_linewidth(2)
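colors = ['pink', 'lightblue', 'lightgreen','yellow']
# The original `item.set_color in zip(colors)` was only a membership test and never
# called set_color, so it had no effect. Cycle through the colors per box instead.
for key in bp.keys():
    for i, item in enumerate(bp[key]['boxes']):
        item.set_color(colors[i % len(colors)])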
[[item.set_color('b') for item in bp[key]['fliers']] for key in bp.keys()]
[[item.set_color('m') for item in bp[key]['medians']] for key in bp.keys()]
[[item.set_markerfacecolor('k') for item in bp[key]['means']] for key in bp.keys()]
[[item.set_color('c') for item in bp[key]['whiskers']] for key in bp.keys()]
[[item.set_color('y') for item in bp[key]['caps']] for key in bp.keys()]

ax.margins(y=0.05)
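
For the per-category face color mentioned above, one possible approach is to pass `patch_artist=True`, so that the boxes are filled patches, and then color each box by the category of the group it belongs to. The following is a minimal sketch under assumptions: `df_demo`, `category_colors` and the two Treat columns are hypothetical, and it relies on pandas drawing the boxes in the sorted order of the group keys.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_Dummy: 2 groups x 3 categories, 10 rows each.
rng = np.random.default_rng(0)
df_demo = pd.DataFrame({
    "Group": np.repeat([1, 2], 30),
    "Category": np.tile(np.repeat([0, 1, 2], 10), 2),
    "Treat1": rng.integers(0, 100, size=60),
    "Treat2": rng.integers(0, 100, size=60),
})

by_cols = ["Group", "Category"]
category_colors = {0: "pink", 1: "lightblue", 2: "lightgreen"}  # assumed mapping

# patch_artist=True makes every box a filled patch that accepts a face color.
# With `by` and return_type='dict', pandas returns a Series of artist dicts,
# one entry per plotted column.
bp = df_demo.boxplot(column=["Treat1", "Treat2"], by=by_cols, rot=90,
                     return_type="dict", patch_artist=True, figsize=(10, 4))

# Sorted (Group, Category) keys, assumed to match the order of the boxes.
group_keys = list(df_demo.groupby(by_cols).size().index)

for column_name, artists in bp.items():
    for box, (group, category) in zip(artists["boxes"], group_keys):
        box.set_facecolor(category_colors[category])
        box.set_edgecolor("black")

plt.gcf().savefig("boxplot_category_colors.png")
```

Adding 'Day ' to `by_cols` should work the same way, as long as the category is read from the matching position of each key tuple.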
