I've a question about rebinning a list of numbers, with a desired bin-width. It's basically what a frequency histogram does, but I don't want the plot, just the bin number and the number of occurrences for each bin.
So far I've written some code that does what I want, but it's not very efficient. Given a list a, in order to rebin it with a bin-width equal to 3, I've written the following:
import numpy as np

# list of numbers
a = list(range(3000))
# number of entries
L = len(a)
# desired bin width
W = 3
# number of bins with width W
N = L // W
# definition of new empty array: one row per bin, [left edge, count]
a_rebin = np.zeros((N, 2))
# cycles to populate the new rebinned array
for n in range(N):
    k = 0
    # scan the whole list and count entries in [W*n, W+W*n)
    for i in range(L):
        if W*n <= a[i] < W + W*n:
            k = k + 1
    a_rebin[n] = [W*n, k]
# print
print(a_rebin)
Now, this does exactly what I want, but I think it's not so smart, as it reads the whole list N times, where N is the number of bins. It's fine for small lists, but I have to deal with very large lists and rather small bin-widths, which translates into huge values of N, and the whole process takes a very long time (hours...). Do you have any ideas to improve this code? Thank you in advance!
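
To make the target concrete, here is a minimal sketch of a single-pass direction, not a definitive solution. It assumes all values are non-negative and uses the same left-closed bins [W*n, W+W*n) as the loop above. The function name rebin is hypothetical. Each value is integer-divided by the bin width to get its bin index, and the indices are counted with np.bincount, so the list is read once rather than N times.

```python
import numpy as np

def rebin(values, width):
    """Count how many values fall in each bin [width*n, width*(n+1)).

    Hypothetical helper; assumes all values are non-negative numbers.
    """
    values = np.asarray(values)
    # Bin index of each value: floor(value / width).
    idx = np.floor_divide(values, width).astype(np.intp)
    # Count occurrences of each bin index in a single pass.
    counts = np.bincount(idx)
    # Left edge of each bin, paired with its count.
    edges = width * np.arange(counts.size)
    return np.column_stack((edges, counts))

a = list(range(3000))
a_rebin = rebin(a, 3)
print(a_rebin[:3])
```

Note one difference from the loop version: here the number of bins follows from the largest value in the list, not from the list length, so the two only agree when the values span 0 to L-1 as in the example.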