
Here is my example:

import matplotlib.pyplot as plt
test_list = ['a', 'b', 'b', 'c']
plt.hist(test_list)    
plt.show()

It generates the following error message:

TypeError                                 Traceback (most recent call last)
<ipython-input-48-228f7f5e9d1e> in <module>()
      1 test_list = ['a', 'b', 'b', 'c']
----> 2 plt.hist(test_list)
      3 plt.show()

C:\Anaconda3\lib\site-packages\matplotlib\pyplot.py in hist(x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, hold, data, **kwargs)
   2956                       histtype=histtype, align=align, orientation=orientation,
   2957                       rwidth=rwidth, log=log, color=color, label=label,
-> 2958                       stacked=stacked, data=data, **kwargs)
   2959     finally:
   2960         ax.hold(washold)

C:\Anaconda3\lib\site-packages\matplotlib\__init__.py in inner(ax, *args, **kwargs)
   1809                     warnings.warn(msg % (label_namer, func.__name__),
   1810                                   RuntimeWarning, stacklevel=2)
-> 1811             return func(ax, *args, **kwargs)
   1812         pre_doc = inner.__doc__
   1813         if pre_doc is None:

C:\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   5993             xmax = -np.inf
   5994             for xi in x:
-> 5995                 if len(xi) > 0:
   5996                     xmin = min(xmin, xi.min())
   5997                     xmax = max(xmax, xi.max())

TypeError: len() of unsized object

I only searched Google briefly, but it looks like I cannot plot a histogram of categorical variables in matplotlib.

Can anybody confirm?

user1700890
  • Possible duplicate of [Making a histogram of string values in python](http://stackoverflow.com/questions/13156657/making-a-histogram-of-string-values-in-python) – Nickil Maveli Oct 26 '16 at 15:46
  • @NickilMaveli, your link suggests a workaround. Thank you for it, but the question remains unanswered: is it or is it not possible to plot a histogram of categorical data in matplotlib? – user1700890 Oct 26 '16 at 16:53

2 Answers


Sure, it is possible to create a histogram of categorical data in matplotlib. As this link suggests, and as the matplotlib demo also shows, simply use a bar chart for that purpose.

import matplotlib.pyplot as plt
import numpy as np

test_list = ['a', 'b', 'b', 'c', "d", "b"]
histdic = {x: test_list.count(x) for x in test_list}
x = []; y = []
# items() works on Python 3; iteritems() exists only on Python 2
for key, value in histdic.items():
    x.append(key)
    y.append(value)

plt.figure()
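barwidth = 0.8
# align='center' puts each bar on its integer position regardless of the
# matplotlib version, so the ticks need no barwidth/2 offset
plt.bar(np.arange(len(y)), y, barwidth, color='r', align='center')
plt.gca().set_xticks(np.arange(len(y)))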
plt.gca().set_xticklabels(x)
plt.show()


Since this is a histogram and it is created with matplotlib, it's definitely wrong to say that you "cannot plot histogram for categorical variables in matplotlib".
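For comparison, here is a minimal sketch of the same count-then-bar idea using `collections.Counter`. The helper names `category_counts` and `plot_category_counts` are made up for this illustration, and the plotting part assumes matplotlib 2.1 or later, where `bar` accepts string x values directly. It is one possible way to do it, not the only one.

```python
from collections import Counter


def category_counts(values):
    """Return (labels, counts) for the distinct values, sorted by label."""
    tally = Counter(values)
    labels = sorted(tally)
    counts = [tally[label] for label in labels]
    return labels, counts


def plot_category_counts(values):
    """Draw one bar per category; bar heights are occurrence counts."""
    # Imported lazily so the counting helper works without matplotlib.
    import matplotlib.pyplot as plt

    labels, counts = category_counts(values)
    fig, ax = plt.subplots()
    # Assumption: matplotlib 2.1 or later, where string x values are
    # placed on a categorical axis automatically.
    ax.bar(labels, counts, width=0.8, color="r")
    ax.set_ylabel("Count")
    return fig, ax


test_list = ['a', 'b', 'b', 'c', 'd', 'b']
labels, counts = category_counts(test_list)
print(labels, counts)
```

Calling `plot_category_counts(test_list)` followed by `plt.show()` should then display the chart. Sorting the labels keeps the bar order stable between runs, which a plain dict did not guarantee on older Python versions.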

ImportanceOfBeingErnest

Try the code below:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

cats = np.array([l for l in "ABCD"], dtype=str)
cats = np.random.choice(cats, 100, p=[0.3, 0.1, 0.4, 0.2])

res = np.random.choice(np.arange(1,7), 100, p=[0.2, 0.1, 0.08, 0.16,0.26,0.2])
df = pd.DataFrame({"Category":cats, "Result":res})
df2 = df.groupby(["Category", "Result"]).size().reset_index(name='Count')


df3 = pd.pivot_table(df2,  values='Count',  columns=['Result'],  index = "Category",
                         aggfunc=np.sum,  fill_value=0)
df4 = pd.pivot_table(df2,  values='Count',  columns=['Category'],  index = "Result",
                         aggfunc=np.sum,  fill_value=0)

fig, ax = plt.subplots(1,2, figsize=(10,4))
df3.plot(kind="bar", ax=ax[0])
df4.plot(kind="bar", ax=ax[1]) 

plt.show()
wasim