
Here is an example of what I intend to do. I have input data, something like this:

data = array([0,1,2,3,4,5,6,7,8,9])

What I need to do is sum up the first two values, then the next two values, and so on. So the result should be as follows.

result = array([1,5,9,13,17])

This is what I am calling 'squeezing'.

I have a numpy array of size 4096. I need to squeeze the array down to a size of 2048, 1024, or 512. It is actually an energy spectrum, where the array index gives the ADC channel number and the value gives the photon count. To do this, here is what I do.

 # tdspec is the input spectrum, an array of size 4096
 import numpy as n
 spec_bin_size = 1024
 spect = n.zeros(spec_bin_size)
 i = 0
 j = 0
 interval = 4096 // spec_bin_size  # number of channels summed into each output bin
 while j < spec_bin_size:
     a = n.sum(tdspec[i:i+interval])
     spect[j] = a
     i = i + interval
     j = j + 1

It works well, but now I need to run this loop over a number of spectra, and I fear that it will be slow. Can anyone tell me if there is a numpy/scipy operation that does this job?

Ishan Tomar
  • FWIW, the word "squeeze" as used in numpy and scipy tends to mean "strip dimensions of size 1", so `np.array([[[1, 2, 3, 4]]])` is squeezed to `np.array([1, 2, 3, 4])` – Eric Aug 23 '16 at 06:51
  • Thanks for the info. Can you tell me what the operation mentioned in the question should be called? – Ishan Tomar Aug 23 '16 at 18:31
  • "sum-pooling" or "binning" might be a better word – Eric Aug 23 '16 at 20:52

1 Answer


Let's define your array:

>>> import numpy as np
>>> data = np.array([0,1,2,3,4,5,6,7,8,9])

To get the result that you want:

>>> np.sum(data.reshape(5, 2), axis=1)
array([ 1,  5,  9, 13, 17])

Or, if we want reshape to calculate one of the dimensions for us, we can specify -1 for that dimension:

>>> np.sum(data.reshape(-1, 2), axis=1)
array([ 1,  5,  9, 13, 17])

>>> np.sum(data.reshape(5, -1), axis=1)
array([ 1,  5,  9, 13, 17])
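
The same trick applies directly to the 4096-channel spectra in the question, and it extends to many spectra at once if they are stacked as rows of a 2-D array. A minimal sketch, assuming that stacking (the names tdspec, spectra, and spec_bin_size follow the question; the 3-spectrum stack is made up for illustration):

>>> tdspec = np.arange(4096)                        # one spectrum of 4096 channels
>>> spec_bin_size = 1024
>>> tdspec.reshape(spec_bin_size, -1).sum(axis=1).shape
(1024,)
>>> spectra = np.arange(3 * 4096).reshape(3, 4096)  # e.g. three spectra, one per row
>>> spectra.reshape(3, spec_bin_size, -1).sum(axis=2).shape
(3, 1024)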

Using scipy

Scipy has a variety of filters available. To emulate the pairwise sum above, for example:

>>> import scipy.ndimage.filters as filters
>>> filters.convolve1d(data, [1,1])
array([ 1,  3,  5,  7,  9, 11, 13, 15, 17, 18])

Or, just selecting every other element to get the previous results:

>>> filters.convolve1d(data, [1,1])[::2]
array([ 1,  5,  9, 13, 17])
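
As noted in the comments below, plain np.convolve can do the same job for a 1-D array. A small sketch (mode='valid' keeps only the fully overlapping windows, so no scipy import is needed):

>>> np.convolve(data, [1, 1], mode='valid')[::2]
array([ 1,  5,  9, 13, 17])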

For other filters, see the scipy.ndimage.filters module.

The top-hat filter that you are using can result in spurious artifacts. A Gaussian filter is often a better choice.
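
For instance, a sketch of Gaussian smoothing with gaussian_filter1d (the sigma value here is an arbitrary illustrative choice, not a recommendation for your spectra):

>>> smoothed = filters.gaussian_filter1d(data.astype(float), sigma=1.0)
>>> smoothed.shape
(10,)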

John1024
  • That's what I was looking for. Thanks! But if we put -1 instead of 5, we don't have to deal with that number every time I need to change the number of elements to sum up (here 2; in my case 512, 1024, or 2048). I got this info from an answer to the following question: http://stackoverflow.com/questions/21921178/binning-a-numpy-array – Ishan Tomar Aug 23 '16 at 06:10
  • @John1024: Right, but as @Ishan correctly points out, numpy will calculate 4096 / n for you if you do `.reshape(-1, n)` – Eric Aug 23 '16 at 06:53
  • `filters.convolve1d` can also just be spelt `np.convolve` – Eric Aug 23 '16 at 06:53
  • Ishan, that is a good find: answer updated. @Eric, yes, convolve and convolve1d only differ on multidimensional arrays. – John1024 Aug 23 '16 at 07:21
  • @John1024: Is that even true? I thought they only differ in their handling of the `mode` parameter – Eric Aug 23 '16 at 08:13
  • @Eric: `convolve1d` takes an `axis` parameter and `convolve` does not. `convolve` accepts multidimensional weights and `convolve1d` does not. Given the similarities between the two and if the code design was up to me, I probably would have made just one function. – John1024 Aug 23 '16 at 17:12
  • @John: I think you're confusing some of `scipy.signal.convolve`, `scipy.ndimage.filters.convolve`, and `np.convolve`. I'm talking about the latter, which does not take multidimensional arrays. – Eric Aug 23 '16 at 18:22
  • @eric The filter module that I was referring to is the module linked in the answer: scipy.ndimage.filters. If you were referring to some other module, sorry for the confusion. – John1024 Aug 23 '16 at 18:29