Plot multiple bars for categorical data

Question

I'm looking for a way to plot multiple bars per value in matplotlib. For numerical data, this can be achieved by adding an offset to the X data, as described for example here:

import numpy as np
import matplotlib.pyplot as plt

X = np.array([1,3,5])
Y = [1,2,3]
Z = [2,3,4]

plt.bar(X - 0.4, Y) # offset of -0.4
plt.bar(X + 0.4, Z) # offset of  0.4
plt.show()

plt.bar() (and ax.bar()) also handle categorical data automatically:

X = ['A','B','C']
Y = [1,2,3]

plt.bar(X, Y)
plt.show()

Here, it is obviously not possible to add an offset, as the categories are not directly associated with a value on the axis. I can manually assign numerical values to the categories and set labels on the x axis with plt.xticks():

X = ['A','B','C']
Y = [1,2,3]
Z = [2,3,4]
_X = np.arange(len(X))

plt.bar(_X - 0.2, Y, 0.4)
plt.bar(_X + 0.2, Z, 0.4)
plt.xticks(_X, X) # set labels manually
plt.show()

However, I'm wondering if there is a more elegant way that makes use of the automatic category handling of bar(), especially if the number of categories and bars per category is not known beforehand (this causes some fiddling with the bar widths to avoid overlaps).

score 31 · Accepted Answer · edited Jun 20 '20 at 09:12

There is no automatic support of subcategories in matplotlib.

Placing bars with matplotlib

You can place the bars numerically, as you propose in the question, and let the code handle an arbitrary number of subcategories:

import numpy as np
import matplotlib.pyplot as plt

X = ['A','B','C']
Y = [1,2,3]
Z = [2,3,4]

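def subcategorybar(X, vals, width=0.8):
    n = len(vals)            # number of bars per category
    _X = np.arange(len(X))   # numeric positions standing in for the categories
    for i in range(n):
        # left edge of the i-th bar; each group spans `width`, centred on its tick
        plt.bar(_X - width/2. + i/float(n)*width, vals[i],
                width=width/float(n), align="edge")
    plt.xticks(_X, X)        # label the positions with the category names
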
subcategorybar(X, [Y,Z,Y])

plt.show()
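
If you prefer the object-oriented interface, the same idea can be written against an Axes object, with a legend so the series can be told apart. This is only a sketch of the technique above: the function name `subcategorybar_ax` and the choice of a dict mapping legend labels to values are my own assumptions, not anything matplotlib provides.

```python
import numpy as np
import matplotlib.pyplot as plt


def subcategorybar_ax(ax, categories, series, width=0.8):
    """Draw one group of bars per category on the given Axes.

    `series` maps a legend label to a list of values, one per category.
    (Hypothetical helper, not part of matplotlib.)
    """
    n = len(series)
    positions = np.arange(len(categories))  # numeric stand-ins for the categories
    bar_width = width / n                   # each group shares the total width
    for i, (label, values) in enumerate(series.items()):
        # Left edge of the i-th bar in every group; groups are centred on the tick.
        left = positions - width / 2 + i * bar_width
        ax.bar(left, values, width=bar_width, align="edge", label=label)
    ax.set_xticks(positions)
    ax.set_xticklabels(categories)
    ax.legend()
    return ax


fig, ax = plt.subplots()
subcategorybar_ax(ax, ["A", "B", "C"], {"Y": [1, 2, 3], "Z": [2, 3, 4]})
plt.show()
```

Because the bar width is derived from the number of series, adding a third entry to the dict narrows the bars instead of making them overlap.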

Using pandas

You may also use the pandas plotting wrapper, which works out the number of subcategories for you. It plots one bar series per dataframe column, grouped by index value.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

X = ['A','B','C']
Y = [1,2,3]
Z = [2,3,4]

df = pd.DataFrame(np.c_[Y,Z,Y], index=X)
df.plot.bar()

plt.show()
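
If the pandas plot has to sit alongside plots made directly with matplotlib, you can pass in an Axes you created yourself. A minimal sketch, assuming named columns `Y` and `Z` (my choice for illustration) so that the legend is readable:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Column names become the legend entries; the index supplies the categories.
df = pd.DataFrame({"Y": [1, 2, 3], "Z": [2, 3, 4]}, index=["A", "B", "C"])

fig, ax = plt.subplots()
# ax=... draws into the existing Axes; rot=0 keeps the tick labels horizontal;
# width is the total width of each group of bars.
df.plot.bar(ax=ax, rot=0, width=0.8)
ax.set_ylabel("value")  # ordinary matplotlib calls still apply afterwards

plt.show()
```

The result is a regular matplotlib Axes, so any styling applied to your other plots can be applied here in the same way.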

answered Jan 08 '18 at 21:40

ImportanceOfBeingErnest


Thank you for your answer and code sample. I am aware of pandas capability to handle subcategories (in fact, my data is already stored in a dataframe), but for sake of consistency with other plots I was looking for a solution with matplotlib. – tsabsch Jan 09 '18 at 08:56
In order to rotate "A","B","C" on X axis, use df.plot.bar(rot=0) – Yair Daon Aug 14 '18 at 10:43
the above answer, when using pandas, can also be simplified to a one-liner: `df.set_index(X).plot.bar();` (this way, one doesn't lose the categorical names of the bars) – Kevad Jul 23 '19 at 16:20
@Kevad Your comment is confusing, since `X` already ***is*** the index, so no need to set it again. – ImportanceOfBeingErnest Jul 23 '19 at 16:22
@ImportanceOfBeingErnest thanks for the correction. What I meant is that `df.plot.bar();` also produces the same output :) – Kevad Jul 24 '19 at 09:17
@Kevad Yes, `df.plot.bar()` is what this answer proposes. Am I missing something? If not, we can probably delete all those comments? – ImportanceOfBeingErnest Jul 24 '19 at 11:05
