How to de-interleave array in numpy?

Question

I have a numpy array that is interleaved in a tricky way and I can't figure out an easy way to de-interleave it. Suppose the (84, 132) matrix is:

0   100  200 ...
1   101  201 ...
2   102  202 ...
...
83  183  283 ...

I want to take every fourth element from the first column, then every fourth element starting from the second row, then every fourth starting from the third row, then every fourth starting from the fourth row. (Yielding four new columns.) Then I want to repeat for the second column, and so forth. So the (21, 528) result I want is:

0   1  2  3 100 101 102 103 200 ...
4   5  6  7 104 105 106 107 204 ...
8   9 10 11 108 109 110 111 208 ...
...
80 81 82 83 180 181 182 183 280 ...

I can do this with a loop, converting the (84, 132) array a to a (21, 528) array b:

b = np.zeros(shape=(21, 132*4))
for y in range(0, 21):
  for x in range(0, 132):
    for s in range(0, 4):
      b[y, x * 4 + s] = a[y * 4 + s, x]

Is there a nicer way to do this with numpy operations?

(Context: this is the physical arrangement of the microcode ROM in the 8086 processor and I'm trying to unshuffle the bits for analysis.)

score 4 · Accepted Answer · answered Jun 18 '20 at 22:05

Permute the axes and reshape. The idea is borrowed from "General idea for nd to nd transformation":

N = 4 # number of rows to split with
n = a.shape[1]
a.reshape(-1,N,n).swapaxes(1,2).reshape(-1,n*N)

Divakar
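As a rough illustration of why the reshape/swapaxes one-liner above matches the loop in the question, here is a minimal sketch that names each intermediate step and compares the result against the triple loop. The function name `deinterleave` and its `group` parameter are hypothetical, introduced only for this example, and the test data assumes the `100 * column + row` pattern shown in the question.

```python
import numpy as np

def deinterleave(a, group):
    """Move each run of `group` consecutive rows into adjacent columns.

    Hypothetical helper: wraps the reshape/swapaxes approach from the answer.
    """
    rows, cols = a.shape
    if rows % group:
        raise ValueError("row count must be a multiple of group")
    # (rows, cols) -> (rows // group, group, cols), indexed as [y, s, x]
    blocks = a.reshape(rows // group, group, cols)
    # Swap the last two axes so the index order becomes [y, x, s]
    blocks = blocks.swapaxes(1, 2)
    # Flatten [x, s] into a single column index x * group + s
    return blocks.reshape(rows // group, cols * group)

# Test data shaped like the question's matrix: a[r, c] == 100 * c + r
a = 100 * np.arange(132)[None, :] + np.arange(84)[:, None]

# Reference result from the question's triple loop
expected = np.zeros((21, 132 * 4), dtype=a.dtype)
for y in range(21):
    for x in range(132):
        for s in range(4):
            expected[y, x * 4 + s] = a[y * 4 + s, x]

b = deinterleave(a, 4)
print(b.shape)                      # (21, 528)
print(np.array_equal(b, expected))  # True
```

The key point is that after the first reshape the array is indexed as `[y, s, x]`, while the desired output column is `x * 4 + s`, so the `s` and `x` axes have to trade places before the final flattening.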

alani · Answer 2 · 2020-06-18T22:11:07.687

You could do something like:

#!/usr/bin/env python

import numpy as np

# construct test data
i = np.arange(132)
j = np.arange(84)
ii, jj = np.meshgrid(i, j)
a = 100 * ii + jj

# the operation
n0, n1 = a.shape
m = 4
b = np.concatenate([a[:,i].reshape((n0 // m, m)) for i in range(n1)], axis=1)

gives:

>>> a
array([[    0,   100,   200, ..., 12900, 13000, 13100],
       [    1,   101,   201, ..., 12901, 13001, 13101],
       [    2,   102,   202, ..., 12902, 13002, 13102],
       ...,
       [   81,   181,   281, ..., 12981, 13081, 13181],
       [   82,   182,   282, ..., 12982, 13082, 13182],
       [   83,   183,   283, ..., 12983, 13083, 13183]])

>>> b
array([[    0,     1,     2, ..., 13101, 13102, 13103],
       [    4,     5,     6, ..., 13105, 13106, 13107],
       [    8,     9,    10, ..., 13109, 13110, 13111],
       ...,
       [   72,    73,    74, ..., 13173, 13174, 13175],
       [   76,    77,    78, ..., 13177, 13178, 13179],
       [   80,    81,    82, ..., 13181, 13182, 13183]])

It is a bit hard to see what is going on when elements are omitted as above, so here is a smaller case where all elements can be shown: a 12x8 input array, followed by the resulting 3x32 array.

array([[  0, 100, 200, 300, 400, 500, 600, 700],
       [  1, 101, 201, 301, 401, 501, 601, 701],
       [  2, 102, 202, 302, 402, 502, 602, 702],
       [  3, 103, 203, 303, 403, 503, 603, 703],
       [  4, 104, 204, 304, 404, 504, 604, 704],
       [  5, 105, 205, 305, 405, 505, 605, 705],
       [  6, 106, 206, 306, 406, 506, 606, 706],
       [  7, 107, 207, 307, 407, 507, 607, 707],
       [  8, 108, 208, 308, 408, 508, 608, 708],
       [  9, 109, 209, 309, 409, 509, 609, 709],
       [ 10, 110, 210, 310, 410, 510, 610, 710],
       [ 11, 111, 211, 311, 411, 511, 611, 711]])

array([[  0,   1,   2,   3, 100, 101, 102, 103, 200, 201, 202, 203, 300, 301, 302, 303, 400, 401, 402, 403, 500, 501, 502, 503, 600, 601, 602, 603, 700, 701, 702, 703],
       [  4,   5,   6,   7, 104, 105, 106, 107, 204, 205, 206, 207, 304, 305, 306, 307, 404, 405, 406, 407, 504, 505, 506, 507, 604, 605, 606, 607, 704, 705, 706, 707],
       [  8,   9,  10,  11, 108, 109, 110, 111, 208, 209, 210, 211, 308, 309, 310, 311, 408, 409, 410, 411, 508, 509, 510, 511, 608, 609, 610, 611, 708, 709, 710, 711]])
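For anyone who wants to reproduce the small case, a minimal sketch follows. It assumes the same `100 * column + row` test pattern as above; the names `small` and `out` are only illustrative and do not appear in the original answer.

```python
import numpy as np

# Small test array: 12 rows, 8 columns, small[r, c] == 100 * c + r
small = 100 * np.arange(8)[None, :] + np.arange(12)[:, None]

m = 4
n0, n1 = small.shape

# Each column is reshaped into an (n0 // m, m) block, and the blocks
# are then joined side by side along the column axis.
out = np.concatenate(
    [small[:, col].reshape(n0 // m, m) for col in range(n1)], axis=1
)

print(out.shape)   # (3, 32)
print(out[:, :8])  # first two column blocks of the result
```

Using a separate loop variable (`col`) rather than reusing `i` avoids any confusion with the `np.arange` array of the same name in the script above.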

I think concatenation, which is typically quite expensive, is not needed. — norok2, Jun 18 '20 at 22:09
@KenShirriff I think they both work, but the other solution is more efficient than mine. I tried a 12000x8000 input array, and on my machine Divakar's was taking 0.6 seconds and mine was taking 3 to 4 seconds. — alani, Jun 18 '20 at 22:24
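The timing comparison mentioned in the comments can be checked with a sketch along these lines. The function names `via_reshape` and `via_concatenate`, the array size, and the repeat count are all assumptions chosen to keep the run short; absolute timings will differ by machine and NumPy version, so none are quoted here.

```python
import timeit

import numpy as np

def via_reshape(a, m):
    # Reshape/swapaxes approach from the accepted answer
    n = a.shape[1]
    return a.reshape(-1, m, n).swapaxes(1, 2).reshape(-1, n * m)

def via_concatenate(a, m):
    # Per-column reshape followed by concatenation
    n0, n1 = a.shape
    return np.concatenate(
        [a[:, i].reshape(n0 // m, m) for i in range(n1)], axis=1
    )

# Random 0/1 data; much smaller than the 12000x8000 array in the comment
rng = np.random.default_rng(0)
big = rng.integers(0, 2, size=(1200, 800))

# Both approaches should produce identical output
same = np.array_equal(via_reshape(big, 4), via_concatenate(big, 4))

t_reshape = timeit.timeit(lambda: via_reshape(big, 4), number=5)
t_concat = timeit.timeit(lambda: via_concatenate(big, 4), number=5)

print(f"results equal: {same}")
print(f"reshape/swapaxes: {t_reshape:.4f} s, concatenate: {t_concat:.4f} s")
```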
