Clustered barchart in matplotlib?

Question

How do I plot a bar chart similar to the one in "Clustered bar plot in gnuplot", using Python's matplotlib?

date|name|empid|app|subapp|hours
20140101|A|0001|IIC|I1|2.5
20140101|A|0001|IIC|I2|3
20140101|A|0001|IIC|I3|4
20140101|A|0001|CAR|C1|2.5
20140101|A|0001|CAR|C2|3
20140101|A|0001|CAR|C3|2
20140101|A|0001|CAR|C4|2

I'm trying to plot the subapp hours by app for the same person. I couldn't find an example of this in the matplotlib demo pages.

EDIT: None of the examples cited below seem to work for an unequal number of bars in each category, as in the data above.

There *is* [an example](http://matplotlib.org/examples/api/barchart_demo.html) — Ricardo Cárdenes, Feb 16 '14 at 12:12
and [another](http://matplotlib.org/1.3.1/examples/pylab_examples/histogram_demo_extended.html) example — M4rtini, Feb 16 '14 at 12:14
Btw, the answer to [this question](http://stackoverflow.com/questions/11597785/setting-spacing-between-grouped-bar-plots-in-matplotlib) explains a few things you may want to take into account — Ricardo Cárdenes, Feb 16 '14 at 12:17
`subapp` values do not match between `app`s. length: 3 vs 4, names: `I*` vs `C*`. — falsetru, Feb 16 '14 at 12:20
subapp can be any number of entries for IIC/CAR and need not match. — Sivaram, Feb 16 '14 at 13:25
Thanks for the links, I'll try them out and close out the question if I can get it to work. — Sivaram, Feb 16 '14 at 13:59
None of the examples work if I choose unequal # of bars for each category. Unless I'm doing something wrong. The above data has unequal number of entries for each category I* and C* — Sivaram, Feb 16 '14 at 16:52
Pad out your categories to contain the same number of subapps? — GWW, Feb 16 '14 at 17:34

score 1 · Answer 1 · answered Feb 17 '14 at 16:41

The cited examples don't handle an unequal number of bars per category, but you can use another approach. Here is an example.

Note: I use pandas to manipulate your data; if you don't know it, you should give it a try: http://pandas.pydata.org/

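import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Read the pipe-delimited file and group the hours by app.
df = pd.read_csv("data.csv", sep="|")
grouped = df.groupby('app')['hours']

colors = "rgbcmyk"

fig, ax = plt.subplots()
initial_gap = 0.1   # space before the first cluster
start = initial_gap
width = 1.0         # total width of one cluster
gap = 0.05          # space between clusters
for app, group in grouped:
    size = group.shape[0]
    # Left edges of the bars: split the cluster width evenly among them.
    ind = np.linspace(start, start + width, size + 1)[:-1]
    w = width / size   # also works when a cluster has a single bar
    start = start + width + gap
    # ind holds left edges, so align on the edge (newer matplotlib centers bars by default).
    ax.bar(ind, group, w, color=list(colors[:size]), align='edge')

# Put one tick at the center of each cluster and label it with the app name.
tick_loc = (np.arange(len(grouped)) * (width + gap)) + initial_gap + width / 2
ax.set_xticks(tick_loc)
ax.set_xticklabels([app for app, _ in grouped])

plt.show()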

The file data.csv contains the data:

date|name|empid|app|subapp|hours
20140101|A|0001|IIC|I1|2.5
20140101|A|0001|IIC|I2|3
20140101|A|0001|IIC|I3|4
20140101|A|0001|CAR|C1|2.5
20140101|A|0001|CAR|C2|3
20140101|A|0001|CAR|C3|2
20140101|A|0001|CAR|C4|2
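
If you would rather not depend on pandas, the same layout idea can be sketched in plain Python. This is only a sketch under my own assumptions: `cluster_layout`, `plot_clusters` and the `hours_by_app` dict are names I made up for illustration, and the values are copied by hand from the question. Each cluster gets the same total width, so a cluster with more bars just gets narrower bars.

```python
def cluster_layout(sizes, width=1.0, gap=0.05, initial_gap=0.1):
    """Return (left_edges, bar_width, center) for each cluster.

    `sizes` is the number of bars per cluster; clusters may differ in size.
    Every cluster occupies the same total `width`.
    """
    layout = []
    start = initial_gap
    for n in sizes:
        bar_width = width / n
        lefts = [start + i * bar_width for i in range(n)]
        layout.append((lefts, bar_width, start + width / 2))
        start += width + gap
    return layout


# Hours per subapp, grouped by app (hand-copied from the question's data).
hours_by_app = {
    "IIC": {"I1": 2.5, "I2": 3, "I3": 4},
    "CAR": {"C1": 2.5, "C2": 3, "C3": 2, "C4": 2},
}


def plot_clusters(data):
    """Draw one cluster per app and write the subapp name above each bar."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    fig, ax = plt.subplots()
    layout = cluster_layout([len(sub) for sub in data.values()])
    centers = []
    for (app, sub), (lefts, bar_width, center) in zip(data.items(), layout):
        # lefts are left edges, hence align="edge".
        bars = ax.bar(lefts, list(sub.values()), bar_width, align="edge")
        for rect, name in zip(bars, sub):
            x = rect.get_x() + rect.get_width() / 2
            ax.text(x, rect.get_height(), name, ha="center", va="bottom")
        centers.append(center)
    ax.set_xticks(centers)
    ax.set_xticklabels(list(data))
    return fig, ax
```

Calling `plot_clusters(hours_by_app)` followed by `plt.show()` should draw two clusters, one with three bars and one with four, each bar labelled with its subapp. Keeping the position arithmetic in its own function makes it easy to check without opening a plot window.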
