
Given an NxM matrix in NumPy, I wish to down-sample it to an NxO matrix (O << M) such that the values in the NxO matrix are linearly interpolated from equally spaced samples in the original matrix.

As an example, consider a 3x10 matrix:

[
    [1  2  3  4  5  6  7  8  9  10]
    [10 9  8  7  6  5  4  3  2  1 ]
    [4  6  4  6  4  6  4  6  4  6 ]
]

If I were to down-sample this to a 3x4 matrix, the values might align like so:

1   2   3   4   5   6   7   8   9   10
|---|---|---|---|---|---|---|---|---|
       *      *       *      *      
       1      2       3      4

In general, given M original elements being down-sampled to O new elements, the first element should be sampled at position (M-1)/(O+1), with additional samples taken at steps of (M-1)/(O+1). This can be seen in the diagram above, where 10 original elements yield 9 "gaps" between the elements. We wish to divide this distance of 9 "gaps" into 5 equal parts (leaving equal space on the left and right, with equal spacing between each of the new elements). So the new elements are 9/5 = 1.8 "gaps" apart (old elements are indexed from 0):

  • New element 0 = Old element 1.8
  • New element 1 = Old element 3.6
  • New element 2 = Old element 5.4
  • New element 3 = Old element 7.2
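
As a quick sanity check, the positions listed above can be computed directly. This is only a sketch; the names `M`, `O`, `step`, and `positions` are mine and just mirror the symbols used in the question.

```python
import numpy as np

M, O = 10, 4  # original and down-sampled column counts from the example

# Spacing between samples, measured in "gaps" of the original row
step = (M - 1) / (O + 1)

# Zero-based fractional positions of the O new elements
positions = step * np.arange(1, O + 1)
print(positions)
```

For the 3x10 example this should give (approximately, up to floating-point rounding) the positions 1.8, 3.6, 5.4, and 7.2.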

Using basic linear interpolation, we can say that "element 1.8" is 80% of element 2 plus 20% of element 1 (counting from 0, so 0.8 × 3 + 0.2 × 2 = 2.8 in the first row).
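
To make the arithmetic concrete, here is a minimal sketch of that interpolation for a single row. The variable names (`row`, `lo`, `hi`, `frac`) are assumptions of mine; `np.interp` is only used as a cross-check, since it performs the same linear interpolation on 1-D data.

```python
import numpy as np

row = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
pos = 1.8  # zero-based fractional position

lo = int(np.floor(pos))  # index of the left neighbour
hi = lo + 1              # index of the right neighbour
frac = pos - lo          # how far pos lies from lo towards hi

# Weighted average of the two neighbours
value = (1 - frac) * row[lo] + frac * row[hi]

# Cross-check with NumPy's built-in 1-D linear interpolation
check = np.interp(pos, np.arange(row.size), row)
print(value, check)
```

Both values should come out at about 2.8, matching the first entry of the expected result.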

Therefore my final matrix would look like so:

[
    [2.8 4.6 6.4 8.2]
    [8.2 6.4 4.6 2.8]
    [4.4 4.8 5.2 5.6]
]

I considered just writing a function to compute the output values and applying it with np.apply_along_axis(), but then I saw this StackOverflow post saying that np.apply_along_axis() is just a flimsy wrapper around a for-loop and that you're better off vectorizing your function.

So how would one vectorize this? Can it be done?

stevendesu

1 Answer


Try this function:

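import numpy as np

def downsample(m, samples):
    # weights[i, n] is the contribution of original column i to new column n
    weights = np.zeros((m.shape[1], samples))
    for n in range(samples):
        # Zero-based fractional column position of the n-th sample
        pos = ((m.shape[1] - 1) / (samples + 1)) * (n + 1)
        if pos == np.floor(pos):
            # The sample lands exactly on an original column
            weights[int(np.floor(pos)), n] = 1
        else:
            # Split the weight linearly between the two neighbouring columns
            weights[int(np.ceil(pos)), n] = pos - int(np.floor(pos))
            weights[int(np.floor(pos)), n] = int(np.ceil(pos)) - pos
    return np.matmul(m, weights)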

It creates a weight matrix based on the interpolation you described, then applies those weights to the entire matrix with a single matrix multiplication.
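
If the remaining Python loop is a concern, the weight matrix can probably be filled in without it. The following is only a sketch of one way to do that, not a tested replacement: `downsample_vectorized` is a hypothetical name, and it assumes `m` is a 2-D array.

```python
import numpy as np

def downsample_vectorized(m, samples):
    """Loop-free variant of the weight-matrix idea (hypothetical name)."""
    m = np.asarray(m, dtype=float)
    n_cols = m.shape[1]

    # Zero-based fractional positions of the new columns
    pos = (n_cols - 1) / (samples + 1) * np.arange(1, samples + 1)

    lo = np.floor(pos).astype(int)       # left neighbour of each sample
    hi = np.minimum(lo + 1, n_cols - 1)  # right neighbour, clipped to the last column
    frac = pos - lo                      # 0 when a sample hits a column exactly

    # Column n of weights holds the two interpolation weights for sample n
    cols = np.arange(samples)
    weights = np.zeros((n_cols, samples))
    weights[lo, cols] += 1 - frac
    weights[hi, cols] += frac

    return m @ weights

m = np.array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
    [4, 6, 4, 6, 4, 6, 4, 6, 4, 6],
])
print(downsample_vectorized(m, 4))
```

Because `frac` is 0 whenever a sample falls exactly on an original column, that column simply receives the full weight of 1 and no special case is needed. On the 3x10 matrix from the question this should reproduce the expected 3x4 result.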

Groger
  • Will do a few more matrices by hand to verify the results, and try to read and make sense of the code, but I just tested it on my sample 3x10 matrix from the question and it got the right answer -- so it looks good :) – stevendesu Oct 30 '19 at 17:39
  • Just tried using the same 3x10 sample matrix and downsampling to 5 instead of 4, and instead of the expected `[2.5, 4, 5.5, 7, 8.5]` for the first row, it returned `[2.5, 0, 5.5, 0, 8.5]`. So something wonky is going on when the weight is an exact integer – stevendesu Oct 30 '19 at 17:41
  • Worked like a charm, and having read through how it works (and remembering my matrix multiplication from god knows how many years ago) I also started to develop a slightly more elegant and faster solution using `np.diag` to compute the `weights` matrix, but I'm not quite there yet. Will accept this and add a new answer if I ever figure out the solution I'm working on. QUICK UPDATE: I gave up on my solution when I realized I was thinking about it wrong. There may be a faster way to generate the `weights` matrix, but what I was doing wasn't going to work. – stevendesu Oct 30 '19 at 17:53