I have an array of more than 1,000,000 integers, each between 0 and 255 (inclusive). I would like to plot the integers from 0 to 255 on the x-axis and, on the y-axis, the number of times each value occurs in the array (called Arr in my current code).

I thought about this code:

    list = []
    for i in range(0, 256):
        icounter = 0
        for x in range(len(Arr)):
            if Arr[x] == i:
                icounter += 1
        list.append(icounter)

Is there a way to do this faster? It currently takes several minutes. I thought about importing a package, but wasn't able to find a good one for this.

leftaroundabout
G4W
  • You're essentially just trying to make a histogram, so [`numpy.histogram`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html) or [`scipy.stats.histogram`](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.histogram.html) will be very efficient and accomplish this task – Cory Kramer Sep 11 '17 at 13:22
  • @CoryKramer Thx it works. – G4W Sep 11 '17 at 13:34
  • This question was already answered here https://stackoverflow.com/questions/10741346/numpy-most-efficient-frequency-counts-for-unique-values-in-an-array – Anil_M Sep 11 '17 at 13:37
  • @Anil_M Oh, I hadn't found that one... – G4W Sep 11 '17 at 13:39

2 Answers

Use numpy.bincount for this task (see its documentation for more details):

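    import numpy as np
    # counts[i] is the number of occurrences of i in Arr;
    # minlength=256 keeps all 256 bins even if the largest values never occur
    counts = np.bincount(Arr, minlength=256)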
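
As a rough, self-contained sketch of how this could fit together: the block below builds synthetic data as a stand-in for `Arr` (an assumption, not the asker's data), counts it with `np.bincount`, and cross-checks the result against `np.histogram`, which the comments above suggest. The helper `plot_counts` is a hypothetical name, and it assumes matplotlib is installed.

```python
import numpy as np

# Synthetic stand-in for the question's Arr (an assumption, not the real data).
rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=1_000_000)

# minlength=256 guarantees one bin per value 0..255, even if 255 never occurs.
counts = np.bincount(arr, minlength=256)

# np.histogram with 256 unit-width bins over [0, 256] should give the same counts.
hist_counts, _ = np.histogram(arr, bins=256, range=(0, 256))


def plot_counts(counts):
    """Bar plot of the counts; matplotlib is imported lazily so it stays optional."""
    import matplotlib.pyplot as plt

    plt.bar(np.arange(len(counts)), counts, width=1.0)
    plt.xlabel("value")
    plt.ylabel("occurrences")
    plt.show()
```

Calling `plot_counts(counts)` should then draw the values 0 to 255 on the x-axis and their frequencies on the y-axis.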
zimmerrol

While I completely agree with the previous answers that you should use a standard histogram algorithm, it's quite easy to greatly speed up your own implementation. Its problem is that you pass through the entire input for each bin, over and over again. It would be much faster to only process the input once, and then write only to the relevant bin:

    def hist(arr):
        nbins = 256
        result = [0] * nbins   # or np.zeros(nbins, dtype=int)
        for y in arr:          # single pass over the input
            if 0 <= y < nbins: # ignore values outside the bin range
                result[y] += 1
        return result
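
The same single-pass idea can also be sketched with the standard library's `collections.Counter`. This is an alternative illustration, not part of the answer above; `hist_counter` and `sample` are hypothetical names chosen for the example.

```python
from collections import Counter


def hist_counter(arr, nbins=256):
    """Count occurrences in one pass; values outside 0..nbins-1 are dropped."""
    tally = Counter(arr)  # one pass over the input
    # Read the tally back in bin order; missing values count as 0.
    return [tally.get(i, 0) for i in range(nbins)]


# Small made-up input, including two out-of-range values (300 and -1).
sample = [0, 3, 3, 255, 7, 3, 300, -1]
counts = hist_counter(sample)
```

For a NumPy array, passing `arr.tolist()` instead of the array itself is probably faster, since the counter then works on plain Python ints.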
leftaroundabout