I'm new to Matplotlib / Python, and am trying to make a grouped boxplot very similar to Joe Kington's excellent example shown here:
how to make a grouped boxplot graph in matplotlib
I'd like to modify Joe's example for my own requirements.
For my demo data below, I have 5 individuals who each have 4 attempts ("attempts": '1st', '2nd', '3rd', '4th') at each of 3 different tasks ("tasks": 'A', 'B', 'C').
I'd like to be able to:
1) input my data as a series of 2D numpy arrays, one array per task as shown, where each row holds the scores of the 5 individuals for one of the 4 sequential attempts.
2) label both the tasks and attempts on the shared x-axis of the plot using strings, saved as sequential items in the lists "tasklist" and "attemptlist" respectively.
3) generalise the solution to make the appropriate plots for any number of individuals, and any number of tasks, each requiring any number of repeated attempts.
Edit: 2 Apr 2015:
The only outstanding problem is that a regular Python dictionary does not return its keys in the order I entered them when I call the .keys() method; hence my tasklist keeps coming out as "A,C,B" rather than "A,B,C". The workaround is to import and create an Ordered Dictionary (collections.OrderedDict). This is all new to me, but it would seem to require the item names in my tasklist to be declared twice, as Joe did in his example: once to associate the tasks with the corresponding data matrices, and once to associate the item names in the Ordered Dictionary with the corresponding sequential numeric keys...
Was wondering: is there a method (akin to the .keys() method for regular dictionaries) which would iterate over my data matrices to create an Ordered Dictionary in the order shown ("A,B,C"), without requiring me to enter details of my tasklist twice?
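One possible way to avoid typing the task names twice, sketched here under the assumption that the data can be supplied as a list of (name, array) pairs: `collections.OrderedDict` remembers the order in which its keys were inserted, so `tasklist` can be read straight back from the dictionary. (From Python 3.7 onwards a regular dict keeps insertion order as well.)

```python
from collections import OrderedDict

import numpy as np

# Each task name is typed once, together with its data matrix.
# Rows are attempts, columns are individuals.
data = OrderedDict([
    ('A', np.array([[1, 2, 3, 4, 9], [2, 3, 4, 4, 4], [3, 4, 4, 5, 5], [5, 6, 6, 7, 7]])),
    ('B', np.array([[2, 3, 4, 4, 5], [3, 4, 5, 6, 10], [4, 5, 6, 6, 7], [5, 6, 7, 7, 8]])),
    ('C', np.array([[4, 5, 6, 6, 10], [6, 7, 8, 8, 8], [7, 8, 9, 9, 10], [2, 10, 11, 11, 12]])),
])

# Keys come back in insertion order, so no second list of names is needed.
tasklist = list(data.keys())
print(tasklist)
```

If the task names happen to sort naturally, `sorted(data.keys())` on a regular dictionary would be an alternative, but that relies on the names rather than on the order in which they were entered.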
Many thanks
Dave
import matplotlib.pyplot as plt
import numpy as np
data = {}
data['A'] = np.array([[1,2,3,4,9],[2,3,4,4,4],[3,4,4,5,5],[5,6,6,7,7]])
data['B'] = np.array([[2,3,4,4,5],[3,4,5,6,10],[4,5,6,6,7],[5,6,7,7,8]])
data['C'] = np.array([[4,5,6,6,10],[6,7,8,8,8],[7,8,9,9,10],[2,10,11,11,12]])
tasklist = list(data.keys()) # list of labels for tasks 'A' to 'C' (each containing 4 attempts labelled '1st' to '4th')
attemptlist = ['1st','2nd','3rd','4th'] # list of labels for attempts 1 to 4 within each task
fig, axes = plt.subplots(ncols=len(tasklist), sharey=True)
fig.subplots_adjust(wspace=0)
for ax, task in zip(axes, tasklist):
    ax.boxplot([data[task][attemptlist.index(attempt)] for attempt in attemptlist], showfliers=False)
    ax.set(xticklabels=attemptlist, xlabel=task)
plt.show()
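For requirement 3, the plotting code above could be wrapped in a function along the following lines. This is only a sketch: the function name `grouped_boxplot` and its shape check are hypothetical additions, not part of Joe's example. Passing `squeeze=False` to `plt.subplots` keeps `axes` a 2D array, so the loop also works when there is only one task.

```python
import matplotlib.pyplot as plt
import numpy as np


def grouped_boxplot(data, attemptlist):
    """Draw one panel per task, with one box per attempt (hypothetical helper).

    data        : dict mapping task label -> 2D array (rows = attempts,
                  columns = individuals)
    attemptlist : list of attempt labels, one per row of each array
    """
    tasklist = list(data.keys())
    # squeeze=False always returns a 2D array of Axes, even for a single task
    fig, axes = plt.subplots(ncols=len(tasklist), sharey=True, squeeze=False)
    fig.subplots_adjust(wspace=0)
    for ax, task in zip(axes[0], tasklist):
        scores = np.asarray(data[task])
        if scores.ndim != 2 or scores.shape[0] != len(attemptlist):
            raise ValueError("task %r needs one row per attempt" % (task,))
        # Each row (one attempt) becomes one box within the task's panel
        ax.boxplot(list(scores), showfliers=False)
        ax.set(xticklabels=attemptlist, xlabel=task)
    return fig, axes[0]
```

It would be called as `fig, axes = grouped_boxplot(data, attemptlist)` followed by `plt.show()`. The number of individuals is taken from the number of columns in each array, so it does not need to be passed in separately.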