
I want to set custom bin ranges on the x-axis of a histogram

I have a dataframe with a column A whose values range from 0 to 500.

I want to plot the distribution with custom ranges: 0-20, 20-40, 40-60, 60-80, 80-100 and 100-500.

My code looks like this:

df['A'].plot(kind='hist', range=[0,500])

This gives equal-width bins, which is not what I'm looking for.

ggupta
    I think this is what you are looking for? https://stackoverflow.com/questions/6986986/bin-size-in-matplotlib-histogram – Jason Chia Dec 09 '19 at 11:22

1 Answer


You can try np.select to group the data into the required groups like this.

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> data = np.random.randint(0,500, size=15)
>>> data
array([ 44, 271, 293, 158, 479, 303,  32,  79, 314, 240,  95, 412, 150,
       356, 376])
>>> np.select([data <= 20, data <= 40, data <= 60, data <= 80, data <= 100, data <= 500], [1,2,3,4,5,6], data)
array([3, 6, 6, 6, 6, 6, 2, 4, 6, 6, 5, 6, 6, 6, 6])
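Because the conditions are a chain of upper bounds, the same group numbers can probably be obtained with np.digitize as well. The sketch below reuses the sample array shown above; the name upper_edges is made up for illustration, and right=True is chosen so that a value equal to an edge (for example 20) lands in the lower group, as it does with the <= conditions.

```python
import numpy as np

# Sample values copied from the transcript above.
data = np.array([44, 271, 293, 158, 479, 303, 32, 79, 314, 240, 95, 412, 150, 356, 376])

# Upper edge of every group except the last (hypothetical helper name).
upper_edges = [20, 40, 60, 80, 100]

# right=True means edges[i-1] < x <= edges[i]; adding 1 makes the groups start at 1.
# Note: unlike the np.select call, anything above 100 (even above 500) becomes group 6.
groups = np.digitize(data, upper_edges, right=True) + 1

# Reference result using the np.select approach from the answer.
conditions = [data <= 20, data <= 40, data <= 60, data <= 80, data <= 100, data <= 500]
expected = np.select(conditions, [1, 2, 3, 4, 5, 6], data)
```

For data that stays within 0 to 500, both approaches should give the same group numbers.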

So you need to add a new column to your data frame like this

>>> df = pd.DataFrame(np.random.randint(0,500,size=1000), columns = list("A"))
>>> df.head(4)
     A
0  179
1  136
2  114
3  124
>>> df["groups"] = np.select([df.A <= 20, df.A <= 40, df.A <= 60, df.A <= 80, df.A <= 100, df.A <= 500], [1,2,3,4,5,6], df.A)
>>> df.head(4)
     A  groups
0  179       6
1  136       6
2  114       6
3  124       6
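As an alternative sketch, pd.cut can assign a labelled range to each row in one call, which avoids keeping the group numbers and the label list in sync by hand. The column name "range", the seed and the random stand-in data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

edges = [0, 20, 40, 60, 80, 100, 500]
labels = ["0-20", "20-40", "40-60", "60-80", "80-100", "100-500"]

# Stand-in data (an assumption); replace with your own dataframe.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.integers(0, 500, size=1000)})

# Default right=True gives intervals (0, 20], (20, 40], ...;
# include_lowest=True also puts the value 0 into the first interval.
df["range"] = pd.cut(df["A"], bins=edges, labels=labels, include_lowest=True)

# Count rows per label, in label order, keeping empty ranges as 0.
counts = df["range"].astype(str).value_counts().reindex(labels, fill_value=0)
```

Calling counts.plot.bar() followed by plt.show() should then draw one bar per range.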

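Then you can plot the count of each group as a bar chart like this. The counts are reindexed by group number so that they line up with the labels, and groups with no rows show up as 0.

>>> counts = df.groups.value_counts().reindex(range(1, 7), fill_value=0)
>>> df1 = pd.DataFrame({'count' : counts.values, 'names' : ["0-20", "20-40", "40-60", "60-80", "80-100", "100-500"]})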
>>> df1.plot.bar(x='names', y='count')
<matplotlib.axes._subplots.AxesSubplot object at 0x0000000018CD2808>
>>> plt.show()
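If a true histogram is preferred over a bar chart, matplotlib's hist (and pandas' plot(kind='hist')) accept a list of bin edges through the bins argument, which is what the linked comment points to. Below is a minimal sketch; plot_custom_bins is a hypothetical helper name and the random data is a stand-in for df['A'].

```python
import numpy as np

# Bin edges from the question: 0-20, 20-40, ..., 80-100, 100-500.
edges = [0, 20, 40, 60, 80, 100, 500]

# Stand-in data (an assumption); use df["A"] in practice.
rng = np.random.default_rng(0)
values = rng.integers(0, 500, size=1000)

# np.histogram applies the same binning rule as matplotlib's hist:
# every bin is [left, right), except the last, which also includes its right edge.
counts, _ = np.histogram(values, bins=edges)


def plot_custom_bins(values, edges):
    """Draw a histogram with the given (unequal) bin edges and return the Axes."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.hist(values, bins=edges, edgecolor="black")
    ax.set_xticks(edges)
    return ax
```

Calling plot_custom_bins(df["A"], edges) and then plt.show() draws bars whose widths match the ranges, so the 100-500 bar is much wider than the others. Note that here a value of exactly 20 falls into the 20-40 bin, whereas the <= conditions above put it into the first group.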
abhilb