Defining bin width/x-axis scale in Matplotlib histogram

Question

I am generating histograms with matplotlib.

I need the bins to be of unequal width as I'm mostly interested in the lowest bins. Right now I'm doing this:

plt.hist(hits_array, bins=list(range(0, 50, 10)) + list(range(50, 550, 50)))

This creates what I want (the first 5 bins have a width of 10, the rest of 50), but the first five bins are, of course, narrower than the latter ones, as all bins are displayed on the same axis.

Is there a way to influence the x-axis or histogram itself so I can break the scale after the first 5 bins, so all bins are displayed as equally wide?

(I realize that this will create a distorted view, and I'm fine with that, though I wouldn't mind a bit of space between the two differently scaled parts of the axis.)

Any help will be greatly appreciated. Thanks!

score 5 · Answer 1 · edited May 23 '17 at 12:33

I had a similar question ("Matplotlib histogram with collection bin for high values"), and the answer there was to use a dirty hack.

So with the following code, you get the ugly histogram you already have.

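import matplotlib.pyplot as plt
import numpy as np

def plot_histogram_04():
    limit1, limit2 = 50, 550
    binwidth1, binwidth2 = 10, 50
    # 1000 samples in [0, limit1) plus 100 samples in [0, limit2)
    data = np.hstack((np.random.rand(1000) * limit1, np.random.rand(100) * limit2))

    # range objects cannot be concatenated with + in Python 3, so convert to lists
    bins = list(range(0, limit1, binwidth1)) + list(range(limit1, limit2, binwidth2))

    plt.subplots(1, 1)
    plt.hist(data, bins=bins)
    plt.savefig('my_plot_04.png')
    plt.close()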

To make the bins appear equally wide, you indeed have to make them equally wide! This means rescaling your data so that all values fall into bins of equal width, and then relabeling the x-axis ticks with the original bin edges.

def plot_histogram_05():
    limit1, limit2 = 50, 550
    binwidth1, binwidth2 = 10, 50
    scale = binwidth2 // binwidth1  # how much the upper range is compressed

    data = np.hstack((np.random.rand(1000) * limit1, np.random.rand(100) * limit2))

    orig_bins = list(range(0, limit1, binwidth1)) + list(range(limit1, limit2 + binwidth2, binwidth2))
    # Squeeze values above limit1 so that each wide bin spans binwidth1 on the axis
    data = [(i - limit1) / scale + limit1
            if i >= limit1 else i for i in data]
    bins = list(range(0, limit2 // scale + limit1, binwidth1))

    _, ax = plt.subplots(1, 1)
    plt.hist(data, bins=bins)

    # Label the equally spaced ticks with the original bin edges
    xlabels = [str(b) for b in orig_bins]
    N_labels = len(xlabels)
    print(xlabels)
    print(bins)
    plt.xlim([0, bins[-1]])
    plt.xticks(binwidth1 * np.arange(N_labels))
    ax.set_xticklabels(xlabels)

    plt.savefig('my_plot_05.png')
    plt.close()
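
The list comprehension in plot_histogram_05 can also be written as a single vectorized NumPy expression. This is only a minimal sketch, assuming the same limits and bin widths as above; the helper name compress_upper_range is made up for illustration and is not part of NumPy or Matplotlib.

```python
import numpy as np


def compress_upper_range(values, limit1=50, binwidth1=10, binwidth2=50):
    """Rescale values >= limit1 so a bin of width binwidth2 spans binwidth1.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float)
    scale = binwidth2 / binwidth1  # 5.0 for the widths used in the answer
    return np.where(values >= limit1, (values - limit1) / scale + limit1, values)


# Original bin edges, and the positions they occupy after compression
orig_edges = np.array(list(range(0, 50, 10)) + list(range(50, 600, 50)))
display_edges = compress_upper_range(orig_edges)
```

Here display_edges holds the tick positions on the compressed axis, while orig_edges supplies the text for the tick labels.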

score 2 · Accepted Answer · edited May 23 '17 at 12:01

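You can use bar, and there is no need to split the axis. Here is an example:

import matplotlib.pyplot as plt
import numpy as np

data = np.hstack((np.random.rand(1000) * 50, np.random.rand(100) * 500))
binwidth1, binwidth2 = 10, 50
# range objects cannot be concatenated with + in Python 3, so convert to lists
bins = list(range(0, 50, binwidth1)) + list(range(50, 550, binwidth2))

fig, ax = plt.subplots(1, 1)

y, binEdges = np.histogram(data, bins=bins)
centers = 0.5 * (binEdges[1:] + binEdges[:-1])

# Both groups use the same bar width, so every bar is drawn equally wide;
# the two calls give the narrow-bin and wide-bin groups different colors
ax.bar(centers[:5], y[:5], width=.8 * binwidth1, align='center')
ax.bar(centers[5:], y[5:], width=.8 * binwidth1, align='center')
plt.show()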
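
The example above keeps each bar at its true x position, so the bars for the wide bins are separated by gaps. If the bars should sit side by side, one option is to draw them in consecutive integer slots and label the slot boundaries with the real bin edges. This is a minimal sketch of that variant, not the answerer's code; the seeded random data, the variable names, and the output file name are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data with the same shape as in the answer
rng = np.random.default_rng(0)
data = np.hstack((rng.random(1000) * 50, rng.random(100) * 500))
edges = np.array(list(range(0, 50, 10)) + list(range(50, 550, 50)))

counts, _ = np.histogram(data, bins=edges)

fig, ax = plt.subplots()
positions = np.arange(len(counts))  # one slot per bin, regardless of its real width
ax.bar(positions, counts, width=1.0, align="edge", edgecolor="black")

# Put a tick on every slot boundary and label it with the real bin edge
ax.set_xticks(np.arange(len(edges)))
ax.set_xticklabels([str(e) for e in edges])
ax.set_xlim(0, len(counts))

fig.savefig("equal_width_bars.png")  # hypothetical output file name
plt.close(fig)
```

The x-axis is then no longer linear: the tick labels carry the real values while the positions are just bin indices.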

In case you really want to split the axis, have a look here.
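
As a rough illustration of what splitting the axis can look like, the sketch below draws the two bin ranges in two subplots that share a y-axis, with the subplot widths proportional to the number of bins so that all bars come out equally wide. It is one possible approach under these assumptions, not necessarily the one linked above; the seeded random data, the variable names, and the output file name are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data with the same shape as in the answer
rng = np.random.default_rng(0)
data = np.hstack((rng.random(1000) * 50, rng.random(100) * 500))
fine_edges = np.arange(0, 51, 10)      # 0, 10, ..., 50
coarse_edges = np.arange(50, 501, 50)  # 50, 100, ..., 500

# Width ratios follow the bin counts so every bar is drawn equally wide
n_fine, n_coarse = len(fine_edges) - 1, len(coarse_edges) - 1
fig, (ax_lo, ax_hi) = plt.subplots(
    1, 2, sharey=True,
    gridspec_kw={"width_ratios": [n_fine, n_coarse], "wspace": 0.08},
)
ax_lo.hist(data, bins=fine_edges)
ax_hi.hist(data, bins=coarse_edges)
ax_lo.set_xlim(fine_edges[0], fine_edges[-1])
ax_hi.set_xlim(coarse_edges[0], coarse_edges[-1])
ax_lo.set_xticks(fine_edges)
ax_hi.set_xticks(coarse_edges)

# Hide the facing spines so the gap reads as a break in the axis
ax_lo.spines["right"].set_visible(False)
ax_hi.spines["left"].set_visible(False)
ax_hi.tick_params(axis="y", left=False)

fig.savefig("broken_axis_hist.png")  # hypothetical output file name
plt.close(fig)
```

The wspace value controls the size of the gap between the two differently scaled parts of the axis.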

score 1 · Answer 3 · edited Oct 24 '18 at 10:18

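import pandas as pd
import numpy as np

df = data  # data is assumed to be an existing pandas DataFrame

bins = np.arange(0, 0.1, 0.001)  # bin edges from 0 to 0.099 in steps of 0.001
df.hist(bins=bins, color='grey')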
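
When bins is a sequence rather than an integer, np.histogram, plt.hist, and DataFrame.hist all treat it as the list of bin edges, so the edges do not have to be evenly spaced. The following is a minimal NumPy-only sketch with the unequal widths from the question; the sample values are made up.

```python
import numpy as np

# Made-up sample values; replace with your own data
rng = np.random.default_rng(0)
values = rng.random(1000) * 500

# Edges 0..40 in steps of 10, then 50..500 in steps of 50
edges = np.concatenate((np.arange(0, 50, 10), np.arange(50, 550, 50)))
counts, _ = np.histogram(values, bins=edges)
widths = np.diff(edges)  # width of each bin, in data units
```

Passing the same edges array as bins to a hist call would give five bins of width 10 followed by nine bins of width 50.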

answered Oct 24 '18 at 09:56 by TVC · edited Oct 24 '18 at 10:18 by danday74

While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value. – cheersmate Oct 24 '18 at 11:07
I would tend to agree with this comment, but this answer is very nice, does exactly what is asked, and is also very simple to understand if you ever used the bins argument. – charelf Apr 14 '21 at 08:06
