How to plot a two-column pandas DataFrame's elements as a histogram?

Question

I have the following pandas dataframe:

    A                    B

    1                    3
    0                    2
    1                    4
    0                    1
    0                    3

I would like to plot the frequency of B instances given A, something like this:

     |
     |
     |        __
 B   |       |  |
     |  ___  |  |
     |  | |  |  |
     |  | |  |  |
     |__|_|__|__|______________
                A

Thus, I tried the following:

df2.groupby([df.A, df.B]).count().plot(kind="bar")

However, I am getting the following exception:

TypeError: Empty 'DataFrame': no numeric data to plot

Therefore, my question is: how can I plot the frequency of the elements in B given the frequency of A?

@john_doe: I saw you marked my answer correct, then removed it. Any reason why you did so? – Sreejith Menon, Aug 03 '16 at 05:05
I'm still quite confused about what you want... Do you just want to plot A as your x axis and B as y, regardless of their values, or do you want something that looks like your ASCII graph? Meaning something like B x A? – 3kt, Aug 03 '16 at 05:23
Thanks for the help guys, I really appreciate your time. I was expecting to do Sreejith Menon's approach. – john doe, Aug 03 '16 at 05:32

Joe T. Boka · Accepted Answer · 2016-08-03T05:31:59.153

3

Sounds like this is what you want: You can use Series.value_counts()

df['B'].value_counts().plot(kind='bar')

If you don't want the value counts sorted by frequency, you can do this:

df['B'].value_counts(sort=False).plot(kind='bar')
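To see what is actually being plotted, it may help to look at the Series that value_counts() returns. The following is only a minimal sketch, assuming the five sample rows from the question. The sort_index() call and the helper name show_bar_chart are additions for illustration, not part of the answer above; sort_index() orders the bars by the value of B rather than by count.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"A": [1, 0, 1, 0, 0], "B": [3, 2, 4, 1, 3]})

# value_counts() returns a Series: the index holds the distinct values of B,
# the values hold how many times each one occurs.
counts = df["B"].value_counts().sort_index()
print(counts)


def show_bar_chart(series):
    """Hypothetical helper: draw one bar per distinct value (needs matplotlib)."""
    import matplotlib.pyplot as plt  # lazy import: the counting above works without it

    series.plot(kind="bar")
    plt.show()
```

Calling show_bar_chart(counts) in an interactive session should then draw four bars, one per distinct value of B.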

edited Aug 03 '16 at 05:31

answered Aug 03 '16 at 05:19

Joe T. Boka


3kt · Answer 2 · 2016-08-03T05:21:02.163

2

I'm not entirely sure what you mean by "plot the frequency of the elements in B given the frequency of A", but this gives the expected output:

In [4]: df
Out[4]: 
      A  B
3995  1  3
3996  0  2
3997  1  4
3998  0  1
3999  0  3

In [8]: df['data'] = df['A']*df['B']

In [9]: df
Out[9]: 
      A  B  data
3995  1  3     3
3996  0  2     0
3997  1  4     4
3998  0  1     0
3999  0  3     0

In [10]: df[['A','data']].plot(kind='bar', x='A', y='data')
Out[10]: <matplotlib.axes._subplots.AxesSubplot at 0x7fde7eebb9e8>

In [11]: plt.show()
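Another possible reading of "frequency of B given A" is a count of each B value within each group of A. This is only a sketch under that assumption, using the sample rows from the question; pd.crosstab is swapped in here and is not what the session above does, and plot_counts is a hypothetical helper name.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"A": [1, 0, 1, 0, 0], "B": [3, 2, 4, 1, 3]})

# Rows: values of A. Columns: values of B. Cells: number of occurrences.
counts = pd.crosstab(df["A"], df["B"])
print(counts)


def plot_counts(table):
    """Hypothetical helper: one group of bars per value of A, one bar per value of B."""
    import matplotlib.pyplot as plt  # lazy import: the table above works without it

    ax = table.plot(kind="bar")
    ax.set_ylabel("count")
    plt.show()
```

With this layout the x axis carries the values of A, and the bar heights are frequencies of B rather than the raw B values.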

edited Aug 03 '16 at 05:21

answered Aug 03 '16 at 04:43

3kt


Thanks for the help. I guess I confused you: 3995, 3996, 3997, 3998, 3999 are just the index numbers. I am interested in plotting just the A and B columns. I edited the question. – john doe Aug 03 '16 at 04:53
@johndoe still, what's the difference between what you expect and the screenshot in my answer? Do you only want to have 0s and 1s in place of the 3995..3999 indexes? – 3kt Aug 03 '16 at 04:57
Yes, I would like to plot the values of column A on the x axis and the values of column B on the y axis. Thanks for the help. – john doe Aug 03 '16 at 05:02

alphiii · Answer 3 · 2016-08-03T05:33:40.337

2

Here is my way:

import pandas as pd
import matplotlib.pyplot as plt

# Build the sample frame with named columns.
df = pd.DataFrame([[1,3],[0,2],[1,4],[0,1],[0,3]], columns=['A', 'B'])
x = df['A'].values
y = df['B'].values
plt.bar(x, y, label='Bar', align='center')
plt.xticks(x)
plt.show()
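In the snippet above, x contains only the values 0 and 1, so bars that share an x position are drawn on top of each other. If one bar per row is wanted, a possible variation (a sketch only, not the answerer's code) is to place the bars at the row positions and use the values of A as tick labels. The helper name draw is hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame([[1, 3], [0, 2], [1, 4], [0, 1], [0, 3]], columns=["A", "B"])

# One x position per row, so rows that share a value of A do not overlap.
positions = list(range(len(df)))
labels = df["A"].tolist()
heights = df["B"].tolist()


def draw(positions, heights, labels):
    """Hypothetical helper: bar per row, labelled with that row's A value."""
    import matplotlib.pyplot as plt  # lazy import: only needed when drawing

    plt.bar(positions, heights, align="center")
    plt.xticks(positions, labels)  # tick labels show the A value of each row
    plt.xlabel("A")
    plt.ylabel("B")
    plt.show()
```

Calling draw(positions, heights, labels) should then show five separate bars.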

edited Aug 03 '16 at 05:33

answered Aug 03 '16 at 05:09

alphiii


Sreejith Menon · Answer 4 · 2016-08-03T05:18:34.860

I believe if you are trying to plot the frequency of occurrence of values in column b, this might help.

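from collections import Counter
import matplotlib.pyplot as plt

# Count how often each value occurs in column 'b'
# (df is the frame built in the second snippet below).
vals = list(df['b'])
cntr = Counter(vals)
# Out[30]: Counter({1: 1, 2: 1, 3: 2, 4: 1})

# Distinct values go on the x axis, their counts on the y axis.
x = list(cntr.keys())
y = list(cntr.values())

plt.bar(x, y, label='Bar1', color='red')
plt.show()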

Another way is to use the histogram function from matplotlib. First declare a bins array, which holds the edges of the buckets your values will fall into.

import matplotlib.pyplot as plt
import pandas as pd

l = [(1,3),(0,2),(1,4),(0,1),(0,3)]
df = pd.DataFrame(l)

df.columns = ['a','b']
bins = [1,2,3,4,5]  # bin edges
plt.hist(list(df['b']), bins, histtype='bar', rwidth=0.8)
plt.show()
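To check how the bins array splits the data without drawing anything, the same edges can be passed to numpy.histogram. This is a small sketch assuming the column b values from the sample data; the variable names are illustrative.

```python
import numpy as np

values = [3, 2, 4, 1, 3]     # column b from the sample data
bin_edges = [1, 2, 3, 4, 5]  # four bins: [1,2), [2,3), [3,4), [4,5]

# Every bin is half-open except the last one, which also includes its right edge.
counts, edges = np.histogram(values, bins=bin_edges)
print(counts)  # [1 1 2 1]
```

The value 3 occurs twice, so the third bin has height 2 and the others have height 1.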

Thanks for the help, this is actually what I was expecting. – john doe, Aug 03 '16 at 05:04
