4

I am plotting a series histogram in Pandas as follows:

df['Primary Type'].value_counts().plot(kind='bar')

However, this series has 25 unique values, so the plot draws many bars. Is it possible to group the low-frequency bars into a single bar?

Thank you in advance.

Fabian Peña

2 Answers

4

You can use pd.cut to bin numeric values into intervals and then plot the count per bin:

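import pandas as pd

# Example DataFrame
df = pd.DataFrame({'a': [25, 22, 22, 21, 45, 20, 1, 1, 1, 1, 2, 3, 4, 4, 4]})

# Bin the values into the intervals (0, 10] and (10, 50], then plot the count per bin
cuts = pd.cut(df['a'], [0, 10, 50])
cuts.value_counts().plot(kind='bar')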
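
As a small variation on the snippet above, the bins can be given readable names through the `labels` parameter of `pd.cut`. This is only a sketch on the same example data; the label names `low` and `high` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Same example data as in the answer
df = pd.DataFrame({"a": [25, 22, 22, 21, 45, 20, 1, 1, 1, 1, 2, 3, 4, 4, 4]})

# (0, 10] becomes "low" and (10, 50] becomes "high"; the label names are made up here
binned = pd.cut(df["a"], bins=[0, 10, 50], labels=["low", "high"])

# Count how many values fall into each named bin
counts = binned.value_counts()
print(counts)
```

The resulting `counts` Series can be passed to `.plot(kind='bar')` exactly as in the answer. Note that `pd.cut` only applies to numeric data, so it may not fit a column of category names.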


hashcode55
2

You can do it by filtering the value counts with boolean indexing:

import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.randint(20, size=(20, 1)), columns=['Primary Type'])
print(df)
    Primary Type
0              8
1              3
2              7
3             15
4             16
5             10
6              2
7              2
8              2
9             14
10             2
11            17
12            16
13            15
14             4
15            11
16            16
17             9
18             2
19            12

s = df['Primary Type'].value_counts()
print (s)
2     5
16    3
15    2
17    1
14    1
12    1
11    1
10    1
9     1
8     1
7     1
4     1
3     1
Name: Primary Type, dtype: int64

# sum all counts at or below the threshold into one category
thresh = 2
a = s[s <= thresh].sum()
s = s[s > thresh]
# then add the combined count to the filtered Series
s.loc['another'] = a
print(s)
2           5
16          3
another    12
Name: Primary Type, dtype: int64

# finally, plot the grouped counts
s.plot(kind='bar')
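
The same steps can be wrapped in a small reusable helper. This is a minimal sketch rather than a pandas built-in: the function name `lump_rare` and its `label` parameter are hypothetical, and it uses `pd.concat` instead of assigning through `.loc`. The input values are copied from the printed DataFrame above so that the result does not depend on the random seed.

```python
import pandas as pd


def lump_rare(counts: pd.Series, thresh: int, label: str = "another") -> pd.Series:
    """Collapse all counts at or below `thresh` into a single `label` entry."""
    rare = counts <= thresh
    if not rare.any():
        # Nothing to group, return the counts unchanged
        return counts.copy()
    lumped = pd.Series({label: counts[rare].sum()})
    return pd.concat([counts[~rare], lumped])


# Values copied from the example DataFrame printed in this answer
values = [8, 3, 7, 15, 16, 10, 2, 2, 2, 14, 2, 17, 16, 15, 4, 11, 16, 9, 2, 12]
counts = pd.Series(values, name="Primary Type").value_counts()

grouped = lump_rare(counts, thresh=2)
print(grouped)
```

Calling `grouped.plot(kind='bar')` then draws one bar per frequent value plus a single bar for everything else. For the 25 categories in the question, `thresh` would need to be tuned to the actual counts.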


jezrael