
How do I plot a bar chart similar to "Clustered bar plot in gnuplot" using Python and matplotlib?

date|name|empid|app|subapp|hours
20140101|A|0001|IIC|I1|2.5
20140101|A|0001|IIC|I2|3
20140101|A|0001|IIC|I3|4
20140101|A|0001|CAR|C1|2.5
20140101|A|0001|CAR|C2|3
20140101|A|0001|CAR|C3|2
20140101|A|0001|CAR|C4|2

I'm trying to plot the subapp hours, grouped by app, for the same person. I couldn't find an example in the matplotlib demo pages.

EDIT: None of the examples cited below seem to work when each category has a different number of bars, as above.

Sivaram
  • There *is* [an example](http://matplotlib.org/examples/api/barchart_demo.html) – Ricardo Cárdenes Feb 16 '14 at 12:12
  • and [another](http://matplotlib.org/1.3.1/examples/pylab_examples/histogram_demo_extended.html) example – M4rtini Feb 16 '14 at 12:14
  • Btw, the answer to [this question](http://stackoverflow.com/questions/11597785/setting-spacing-between-grouped-bar-plots-in-matplotlib) explains a few things you may want to take into account – Ricardo Cárdenes Feb 16 '14 at 12:17
  • `subapp` values do not match between `app`s. Length: 3 vs 4, names: `I*` vs `C*`. – falsetru Feb 16 '14 at 12:20
  • subapp can be any number of entries for IIC/CAR and need not match. – Sivaram Feb 16 '14 at 13:25
  • Thanks for the links, I'll try them out and close out the question if I can get it to work. – Sivaram Feb 16 '14 at 13:59
  • None of the examples work if I choose unequal # of bars for each category. Unless I'm doing something wrong. The above data has unequal number of entries for each category I* and C* – Sivaram Feb 16 '14 at 16:52
  • Pad out your categories to contain the same number of subapps? – GWW Feb 16 '14 at 17:34

1 Answer


The cited examples don't handle an unequal number of bars per category, but you can take another approach. Here is an example.

Note: I use pandas (http://pandas.pydata.org/) to manipulate your data; if you don't know it, it is worth a try:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

df = pd.read_csv("data.csv", sep="|")
grouped = df.groupby('app')['hours']

colors = "rgbcmyk"

fig, ax = plt.subplots()
initial_gap = 0.1  # empty space before the first cluster
start = initial_gap
width = 1.0        # total width of one cluster, shared by its bars
gap = 0.05         # space between neighbouring clusters
for app, group in grouped:
    size = group.shape[0]
    # Split the cluster width evenly, so this also works for a single bar.
    w = width / size
    ind = start + w * np.arange(size)  # left edge of each bar
    # align="edge" makes ind the left edges (newer matplotlib centers bars by default).
    ax.bar(ind, group, w, color=list(colors[:size]), align="edge")
    start = start + width + gap

# One tick per app, centered under its cluster. Set positions before labels.
tick_loc = (np.arange(len(grouped)) * (width + gap)) + initial_gap + width / 2
ax.set_xticks(tick_loc)
ax.set_xticklabels([app for app, _ in grouped])

plt.show()

Here data.csv contains the data:

date|name|empid|app|subapp|hours
20140101|A|0001|IIC|I1|2.5
20140101|A|0001|IIC|I2|3
20140101|A|0001|IIC|I3|4
20140101|A|0001|CAR|C1|2.5
20140101|A|0001|CAR|C2|3
20140101|A|0001|CAR|C3|2
20140101|A|0001|CAR|C4|2
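
The positioning logic in the loop above can be pulled out and checked on its own. The following is only a sketch: `cluster_layout` is a hypothetical helper name (it is not part of matplotlib or pandas), and it assumes that every cluster has at least one bar and that all clusters share the same total `width`, as in the code above.

```python
def cluster_layout(sizes, width=1.0, gap=0.05, initial_gap=0.1):
    """Compute bar positions for clusters with unequal numbers of bars.

    sizes is a list with the number of bars in each cluster (each >= 1).
    Returns (left_edges, bar_widths, centers), one entry per cluster.
    Hypothetical helper for illustration, not a library function.
    """
    left_edges, bar_widths, centers = [], [], []
    start = initial_gap
    for size in sizes:
        w = width / size  # bars in a cluster split its width evenly
        left_edges.append([start + i * w for i in range(size)])
        bar_widths.append(w)
        centers.append(start + width / 2)  # tick goes under the middle
        start += width + gap
    return left_edges, bar_widths, centers


# Cluster sizes for the sample data: CAR has 4 subapps, IIC has 3.
edges, widths, centers = cluster_layout([4, 3])
for e, w, c in zip(edges, widths, centers):
    print(len(e), "bars, bar width", round(w, 3), "tick at", round(c, 3))
```

Each entry of `edges` and `widths` can then be passed to `ax.bar(..., align="edge")`, and `centers` to `ax.set_xticks`. Because the width of a cluster is fixed, a cluster with fewer bars gets wider bars; if you would rather keep the bar width constant, the cluster width would have to vary instead.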
Alvaro Fuentes