
I have a numpy array that is interleaved in a tricky way and I can't figure out an easy way to de-interleave it. Suppose the (84, 132) matrix is:

0   100  200 ...
1   101  201 ...
2   102  202 ...
...
83  183  283 ...

I want to take every fourth element from the first column, then every fourth element starting from the second row, then every fourth starting from the third row, then every fourth starting from the fourth row. (Yielding four new columns.) Then I want to repeat for the second column, and so forth. So the (21, 528) result I want is:

0   1  2  3 100 101 102 103 200 ...
4   5  6  7 104 105 106 107 204 ...
8   9 10 11 108 109 110 111 208 ...
...
80 81 82 83 180 181 182 183 280 ...

I can do this with a loop, converting the (84, 132) array a to a (21, 528) array b:

b = np.zeros(shape=(21, 132*4))
for y in range(0, 21):
  for x in range(0, 132):
    for s in range(0, 4):
      b[y, x * 4 + s] = a[y * 4 + s, x]

Is there a nicer way to do this with numpy operations?

(Context: this is the physical arrangement of the microcode ROM in the 8086 processor and I'm trying to unshuffle the bits for analysis.)

Ken Shirriff

2 Answers


Reshape, permute the axes, and reshape again. The idea is borrowed from "General idea for nd to nd transformation":

N = 4  # number of consecutive input rows that form one output row
n = a.shape[1]  # number of input columns
b = a.reshape(-1, N, n).swapaxes(1, 2).reshape(-1, n * N)
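
To see why the reshape/swapaxes chain matches the loop in the question, here is a minimal sketch on a small (8, 3) stand-in array. The names rows, cols, step1, step2 and expected are illustrative and not part of the original answer; the test data assumes the same a[r, c] = 100*c + r pattern as the question.

```python
import numpy as np

# Small stand-in for the (84, 132) array: 8 rows, 3 columns,
# filled so that a[r, c] = 100 * c + r, as in the question.
rows, cols, N = 8, 3, 4
a = 100 * np.arange(cols)[None, :] + np.arange(rows)[:, None]

# Step 1: split the row axis into (block, offset within block).
step1 = a.reshape(-1, N, cols)    # shape (2, 4, 3): step1[y, s, x] == a[N*y + s, x]
# Step 2: move the offset axis behind the column axis.
step2 = step1.swapaxes(1, 2)      # shape (2, 3, 4): step2[y, x, s] == a[N*y + s, x]
# Step 3: merge (column, offset) into one output column axis.
b = step2.reshape(-1, cols * N)   # shape (2, 12):   b[y, N*x + s] == a[N*y + s, x]

# Reference result built with the triple loop from the question.
expected = np.zeros_like(b)
for y in range(rows // N):
    for x in range(cols):
        for s in range(N):
            expected[y, x * N + s] = a[y * N + s, x]

assert np.array_equal(b, expected)
print(b)
```

The final reshape has to copy the data, because the swapped array is no longer contiguous in memory, but it is a single bulk copy rather than one small operation per column.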
Divakar

You could do something like:

#!/usr/bin/env python

import numpy as np

# construct test data
i = np.arange(132)
j = np.arange(84)
ii, jj = np.meshgrid(i, j)
a = 100 * ii + jj

# the operation
n0, n1 = a.shape
m = 4
b = np.concatenate([a[:, col].reshape((n0 // m, m)) for col in range(n1)], axis=1)

gives:

>>> a
array([[    0,   100,   200, ..., 12900, 13000, 13100],
       [    1,   101,   201, ..., 12901, 13001, 13101],
       [    2,   102,   202, ..., 12902, 13002, 13102],
       ...,
       [   81,   181,   281, ..., 12981, 13081, 13181],
       [   82,   182,   282, ..., 12982, 13082, 13182],
       [   83,   183,   283, ..., 12983, 13083, 13183]])
>>> b
array([[    0,     1,     2, ..., 13101, 13102, 13103],
       [    4,     5,     6, ..., 13105, 13106, 13107],
       [    8,     9,    10, ..., 13109, 13110, 13111],
       ...,
       [   72,    73,    74, ..., 13173, 13174, 13175],
       [   76,    77,    78, ..., 13177, 13178, 13179],
       [   80,    81,    82, ..., 13181, 13182, 13183]])

It is a bit hard to see what is going on where elements are omitted above, so here is a smaller case, a (12, 8) array, where all elements can be shown:

array([[  0, 100, 200, 300, 400, 500, 600, 700],
       [  1, 101, 201, 301, 401, 501, 601, 701],
       [  2, 102, 202, 302, 402, 502, 602, 702],
       [  3, 103, 203, 303, 403, 503, 603, 703],
       [  4, 104, 204, 304, 404, 504, 604, 704],
       [  5, 105, 205, 305, 405, 505, 605, 705],
       [  6, 106, 206, 306, 406, 506, 606, 706],
       [  7, 107, 207, 307, 407, 507, 607, 707],
       [  8, 108, 208, 308, 408, 508, 608, 708],
       [  9, 109, 209, 309, 409, 509, 609, 709],
       [ 10, 110, 210, 310, 410, 510, 610, 710],
       [ 11, 111, 211, 311, 411, 511, 611, 711]])

array([[  0,   1,   2,   3, 100, 101, 102, 103, 200, 201, 202, 203, 300, 301, 302, 303, 400, 401, 402, 403, 500, 501, 502, 503, 600, 601, 602, 603, 700, 701, 702, 703],
       [  4,   5,   6,   7, 104, 105, 106, 107, 204, 205, 206, 207, 304, 305, 306, 307, 404, 405, 406, 407, 504, 505, 506, 507, 604, 605, 606, 607, 704, 705, 706, 707],
       [  8,   9,  10,  11, 108, 109, 110, 111, 208, 209, 210, 211, 308, 309, 310, 311, 408, 409, 410, 411, 508, 509, 510, 511, 608, 609, 610, 611, 708, 709, 710, 711]])
alani
  • I think concatenation, which is typically quite expensive, is not needed. – norok2 Jun 18 '20 at 22:09
  • Thanks, that solution gives the matrix I wanted. – Ken Shirriff Jun 18 '20 at 22:14
  • @KenShirriff I think they both work, but the other solution is more efficient than mine. I tried a 12000x8000 input array, and on my machine Divakar's was taking 0.6 seconds and mine was taking 3 to 4 seconds. – alani Jun 18 '20 at 22:24
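
For anyone who wants to reproduce the comparison from the comments, a rough timing sketch follows. The function names deinterleave_reshape and deinterleave_concat are made up here to wrap the two answers, and the (1200, 800) test size is an arbitrary assumption chosen to run quickly; absolute timings will differ by machine.

```python
import timeit

import numpy as np


def deinterleave_reshape(a, m=4):
    """Reshape / swapaxes / reshape approach (first answer)."""
    n1 = a.shape[1]
    return a.reshape(-1, m, n1).swapaxes(1, 2).reshape(-1, n1 * m)


def deinterleave_concat(a, m=4):
    """Per-column reshape plus concatenate approach (second answer)."""
    n0, n1 = a.shape
    return np.concatenate(
        [a[:, col].reshape((n0 // m, m)) for col in range(n1)], axis=1
    )


# Arbitrary test array; the row count must be a multiple of m.
a = np.arange(1200 * 800).reshape(1200, 800)

# Both approaches should give identical results.
assert np.array_equal(deinterleave_reshape(a), deinterleave_concat(a))

for fn in (deinterleave_reshape, deinterleave_concat):
    seconds = timeit.timeit(lambda: fn(a), number=5) / 5
    print(f"{fn.__name__}: {seconds:.4f} s per call")
```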