
I'm looking for a way to plot multiple bars per value in matplotlib. For numerical data, this can be achieved by adding an offset to the X data, as described for example here:

import numpy as np
import matplotlib.pyplot as plt

X = np.array([1,3,5])
Y = [1,2,3]
Z = [2,3,4]

plt.bar(X - 0.4, Y) # offset of -0.4
plt.bar(X + 0.4, Z) # offset of  0.4
plt.show()

Multiple bars for numerical data

plt.bar() (and ax.bar()) also handle categorical data automatically:

X = ['A','B','C']
Y = [1,2,3]

plt.bar(X, Y)
plt.show()

Category handling

Here, it is obviously not possible to add an offset, as the categories are not directly associated with a value on the axis. I can manually assign numerical values to the categories and set the labels on the x axis with plt.xticks():

X = ['A','B','C']
Y = [1,2,3]
Z = [2,3,4]
_X = np.arange(len(X))

plt.bar(_X - 0.2, Y, 0.4)
plt.bar(_X + 0.2, Z, 0.4)
plt.xticks(_X, X) # set labels manually
plt.show()

Manually setting category labels

However, I'm wondering if there is a more elegant way that makes use of the automatic category handling of bar(), especially if the number of categories and the number of bars per category are not known in advance (this causes some fiddling with the bar widths to avoid overlaps).

tsabsch

1 Answer


There is no automatic support of subcategories in matplotlib.

Placing bars with matplotlib

You can place the bars numerically, as you propose yourself in the question, and let the code handle the unknown number of subcategories: each category gets a group of total width `width`, split evenly among the n bars.

import numpy as np
import matplotlib.pyplot as plt

X = ['A','B','C']
Y = [1,2,3]
Z = [2,3,4]

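def subcategorybar(X, vals, width=0.8):
    n = len(vals)            # number of bars per category
    _X = np.arange(len(X))   # numeric position of each category
    for i in range(n):
        # left edge of the i-th bar within a group of total width `width`
        plt.bar(_X - width/2. + i/float(n)*width, vals[i],
                width=width/float(n), align="edge")
    plt.xticks(_X, X)        # label the positions with the category names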
    
subcategorybar(X, [Y,Z,Y])

plt.show()

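If you prefer the object-oriented interface, the same idea can be sketched on an explicit Axes. This is only an illustration of the placement arithmetic above, not a matplotlib feature: the function name `subcategorybar_ax`, its `labels` parameter, and the legend text are made up for this example.

```python
import numpy as np
import matplotlib.pyplot as plt


def subcategorybar_ax(ax, categories, series, labels=None, width=0.8):
    """Draw len(series) side-by-side bars per category on the given Axes.

    Hypothetical helper for illustration; not part of matplotlib.
    """
    n = len(series)
    positions = np.arange(len(categories))  # one integer slot per category
    bar_width = width / n                   # each group spans `width` in total
    containers = []
    for i, values in enumerate(series):
        # Left edge of the i-th bar inside each group.
        left = positions - width / 2 + i * bar_width
        label = labels[i] if labels is not None else None
        containers.append(
            ax.bar(left, values, width=bar_width, align="edge", label=label)
        )
    # Put the category names back on the integer positions.
    ax.set_xticks(positions)
    ax.set_xticklabels(categories)
    if labels is not None:
        ax.legend()
    return containers


fig, ax = plt.subplots()
bars = subcategorybar_ax(
    ax,
    ["A", "B", "C"],
    [[1, 2, 3], [2, 3, 4], [1, 2, 3]],
    labels=["Y", "Z", "Y (again)"],
)
```

Call plt.show() or fig.savefig(...) afterwards to see the result. Passing the Axes in makes the helper usable inside subplots, and returning the bar containers lets you restyle individual series later.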

Using pandas

You may also use pandas' plotting wrapper, which does the work of figuring out the number of subcategories. Each column of the dataframe becomes one set of bars (one colour), and each index entry becomes one category on the x axis.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

X = ['A','B','C']
Y = [1,2,3]
Z = [2,3,4]

df = pd.DataFrame(np.c_[Y,Z,Y], index=X)
df.plot.bar()

plt.show()

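If the data already lives in a dataframe but the figure should stay consistent with other matplotlib plots, one option is to hand pandas an existing Axes. A minimal sketch, assuming named columns "Y" and "Z" and a y-axis label chosen only for illustration:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Columns become the bars within each group; the index supplies the categories.
df = pd.DataFrame({"Y": [1, 2, 3], "Z": [2, 3, 4]}, index=["A", "B", "C"])

fig, ax = plt.subplots()
# ax=ax draws into the existing matplotlib Axes; rot=0 keeps tick labels horizontal.
df.plot.bar(ax=ax, rot=0, width=0.8)
ax.set_ylabel("value")  # further styling uses the ordinary matplotlib API
```

Everything after the df.plot.bar() call is plain matplotlib, so titles, limits and styles can be set the same way as in the other figures.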

ImportanceOfBeingErnest
  • Thank you for your answer and code sample. I am aware of pandas capability to handle subcategories (in fact, my data is already stored in a dataframe), but for sake of consistency with other plots I was looking for a solution with matplotlib. – tsabsch Jan 09 '18 at 08:56
  • In order to rotate "A","B","C" on X axis, use df.plot.bar(rot=0) – Yair Daon Aug 14 '18 at 10:43
  • the above answer, when using pandas, can also be simplified to a one-liner: `df.set_index(X).plot.bar();` (this way, one doesn't lose the categorical names of the bars) – Kevad Jul 23 '19 at 16:20
  • @Kevad Your comment is confusing, since `X` already ***is*** the index, so no need to set it again. – ImportanceOfBeingErnest Jul 23 '19 at 16:22
  • @ImportanceOfBeingErnest thanks for the correction. What I meant is that `df.plot.bar();` also produces the same output :) – Kevad Jul 24 '19 at 09:17
  • @Kevad Yes, `df.plot.bar()` is what this answer proposes. Am I missing something? If not, we can probably delete all those comments? – ImportanceOfBeingErnest Jul 24 '19 at 11:05