Manipulating numpy arrays (concatenating inner sub-arrays)

Question

I have a question about manipulating numpy arrays. Say I am given a 3-d array of the form np.array([[[1,2],[3,4]], [[5,6],[7,8]]]), which has shape (2,2,2). I want to turn it into a (2,4) array, namely a = np.array([[1,2,5,6],[3,4,7,8]]). Is there a built-in numpy method that deals with problems like this and can be easily generalized?

EDITED: Thank you all for your answers. They all rock! I should clarify what I mean by "easily generalized" in the original post. Suppose I am given a (6,3,2,3) array (this is the actual challenge I am facing):

a = array([[[[ 10,  20,  30],
         [ 40,  40,  20]],

        [[ 22,  44,  66],
         [ 88,  88,  44]],

        [[ 33,  66,  99],
         [132, 132,  66]]],


       [[[ 22,  44,  66],
         [ 88,  88,  44]],

        [[ 54, 108, 162],
         [216, 216, 108]],

        [[ 23,  46,  69],
         [ 92,  92,  46]]],


       [[[ 14,  28,  42],
         [ 56,  56,  28]],

        [[ 25,  50,  75],
         [100, 100,  50]],

        [[ 33,  66,  99],
         [132, 132,  66]]],


       [[[ 20,  40,  60],
         [ 80,  80,  40]],

        [[ 44,  88, 132],
         [176, 176,  88]],

        [[ 66, 132, 198],
         [264, 264, 132]]],


       [[[ 44,  88, 132],
         [176, 176,  88]],

        [[108, 216, 324],
         [432, 432, 216]],

        [[ 46,  92, 138],
         [184, 184,  92]]],


       [[[ 28,  56,  84],
         [112, 112,  56]],

        [[ 50, 100, 150],
         [200, 200, 100]],

        [[ 66, 132, 198],
         [264, 264, 132]]]])

I want to massage it into a (3,3,2,2,3) array such that, for a[0,:,:,:,:]:

a[0,0,0,:,:] = np.array([[10,20,30],[40,40,20]]);
a[0,1,0,:,:] = np.array([[22,44,66],[88,88,44]]);
a[0,2,0,:,:] = np.array([[33,66,99],[132,132,66]]);
a[0,0,1,:,:] = np.array([[20,40,60],[80,80,40]]);
a[0,1,1,:,:] = np.array([[44,88,132],[176,176,88]]);
a[0,2,1,:,:] = np.array([[66,132,198],[264,264,132]]).

In short, the last 3 biggest blocks should "merge" with the first 3 biggest blocks to form 3 (3,2) blocks. The remaining 2 blocks, i.e. a[1,:,:,:,:] and a[2,:,:,:,:], follow the same pattern.

Be careful with what I asked: reshape doesn't solve my problem. `reshape` will simply give me `np.array([[1,2,3,4],[5,6,7,8]])`. — CoolGas, Sep 01 '21 at 14:47
I don't understand how `np.reshape(a, (2,4))` is not producing exactly the expected output you show in the question. — joanis, Sep 01 '21 at 14:49
Oh, never mind, you're mucking with the order of the values in there. OK, that was subtle. Maybe you need to do some transposing of parts of the array first? But your transposition is certainly not a standard one, so I'm not sure how to accomplish it. — joanis, Sep 01 '21 at 14:50
I've tried many techniques, this question is harder than thought. Otherwise, I wouldn't bother to post this question. — CoolGas, Sep 01 '21 at 14:52
Found a solution with `zip`, but I like Albin Paul's answer better, with `swapaxes`. — joanis, Sep 01 '21 at 14:56
Look at `a`. With a shape of (2,2,2), there are 3 different ways you can combine values to create a (2,4) array. `reshape` joins the last 2 dimensions. `reshape(4,2)` joins the first 2. You want to join the first and last, which requires reordering the elements. Reread the `reshape` docs and pay attention to the "You can think of reshaping as first raveling the array ..." line. — hpaulj, Sep 01 '21 at 15:48
"easily generalized": in which way(s)? In [my answer](https://stackoverflow.com/a/69017010/758174) I put a link to the authoritative explanation for these problems. — Pierre D, Sep 01 '21 at 15:54
I guess you have fallen into an [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) tbh — AcaNg, Sep 02 '21 at 09:39
I think if you simply indicate the destination address of one element with all different indices (e.g.: `orig_a[4,3,2,1]` needs to be put into `new_a[i,j,k,m]`), it would simplify the description. Ideally, to make the definition unambiguous, choose indices that are prime numbers. — Pierre D, Sep 03 '21 at 12:37

hpaulj · Answer 1 · 2021-09-01T16:02:52.783

Your subject line answers your question:

In [813]: a
Out[813]: 
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])
In [818]: np.concatenate(a, axis=1)    # aka np.hstack
Out[818]: 
array([[1, 2, 5, 6],
       [3, 4, 7, 8]])

This treats the arrays as 2 (2,2) sub-arrays.

The other concatenate option:

In [819]: np.concatenate(a, axis=0)
Out[819]: 
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

I think the transpose followed by reshape is better, and more easily generalized. But it requires some knowledge of how arrays are stored, and the meaning of the dimensions and the ways of transposing them.

The reason plain reshape doesn't work is that you want to reorder the elements of the array.

As documented, reshape effectively ravels the array, and then applies the new shape:

In [823]: a.ravel()
Out[823]: array([1, 2, 3, 4, 5, 6, 7, 8])

but your new array has a different order:

In [824]: np.concatenate(a, axis=1).ravel()
Out[824]: array([1, 2, 5, 6, 3, 4, 7, 8])
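
To make the ordering argument concrete, here is a small sketch that puts the three results side by side. The variable names (`plain`, `stacked`, `via_transpose`) are only illustrative and not part of the answer above; the point is that the transpose-then-reshape route reproduces what `concatenate` does, while a plain reshape does not.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Plain reshape keeps the raveled order 1..8 and only regroups it.
plain = a.reshape(2, 4)

# concatenate treats `a` as a sequence of two (2, 2) blocks joined side by side.
stacked = np.concatenate(a, axis=1)

# Moving the block axis next to the column axis, then reshaping,
# should give the same layout as the concatenation.
via_transpose = a.transpose(1, 0, 2).reshape(2, 4)

print(plain.ravel())
print(stacked.ravel())
print(np.array_equal(stacked, via_transpose))
```

The transpose changes which elements are neighbours in the raveled order, and that is the step a bare reshape cannot perform.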

Albin Paul · Answer 2 · 2021-09-01T16:58:21.223

First swapping the axes, using np.swapaxes and then reshaping gets the output.

import numpy as np
a = np.array([[[1,2],[3,4]], [[5,6],[7,8]]])
a = np.swapaxes(a, 0, 1)
a = np.reshape(a, (2, 4))
print(a)

OUTPUT

[[1 2 5 6]
 [3 4 7 8]]

You can also use np.transpose like np.transpose(a, (1, 0, 2)) for swapping axes from (0, 1, 2) to (1, 0, 2) as pointed out by MadPhysicist.
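
As a rough sketch of how this might generalize to any number of blocks, the helper below (`merge_blocks_side_by_side` is a hypothetical name, not a numpy function) applies the same swapaxes-then-reshape idea to an array of shape (n, rows, cols). It assumes a 3-d input and is meant as an illustration, not a definitive implementation.

```python
import numpy as np

def merge_blocks_side_by_side(arr):
    """Join the (rows, cols) blocks of an (n, rows, cols) array along columns.

    Hypothetical helper for illustration only.
    """
    n, rows, cols = arr.shape
    # swapaxes(0, 1) gives shape (rows, n, cols); reshape then merges n and cols.
    return arr.swapaxes(0, 1).reshape(rows, n * cols)

a = np.arange(1, 9).reshape(2, 2, 2)   # same values as the question's array
print(merge_blocks_side_by_side(a))

b = np.arange(24).reshape(4, 2, 3)     # four (2, 3) blocks
merged = merge_blocks_side_by_side(b)
print(merged.shape)
```

For a 3-d array, `swapaxes(0, 1)` and `transpose(1, 0, 2)` describe the same axis permutation, so either spelling can be used inside the helper.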

edited Sep 01 '21 at 16:58

answered Sep 01 '21 at 14:55

You can also use `.transpose(1, 0, 2)` – Mad Physicist Sep 01 '21 at 15:31
@MadPhysicist I have updated the answer with the suggestion. – Albin Paul Sep 01 '21 at 15:38

Pierre D · Answer 3 · 2021-09-04T00:05:57.730

I think in this case (first example), simply:

>>> a.swapaxes(0, 1).reshape(2, -1)
array([[1, 2, 5, 6],
       [3, 4, 7, 8]])

Generally speaking, I find @Divakar's mini tutorial to be the authoritative source for these kinds of operations.

Edit: After the question was updated (to contain an example with a larger array), I wrote a small solver (quite fast, actually) for these kinds of questions.

Any of the following produce the same result, which meets the constraints:

np.moveaxis(a.reshape(2, 3, 3, 2, 3), 0, 2)
np.rollaxis(a.reshape(2, 3, 3, 2, 3), 0, 3)
a.reshape(2, 3, 3, 2, 3).transpose(1, 2, 0, 3, 4)
a.reshape(2, 9, 6).swapaxes(0, 1).reshape(3, 3, 2, 2, 3)
np.rollaxis(a.reshape(2, 9, 6), 1).reshape(3, 3, 2, 2, 3)
a.reshape(2, 9, 2, 3).swapaxes(0, 1).reshape(3, 3, 2, 2, 3)
a.reshape(2, 9, 3, 2).swapaxes(0, 1).reshape(3, 3, 2, 2, 3)
a.reshape(2, 9, 6).transpose(1, 0, 2).reshape(3, 3, 2, 2, 3)
# ...

Of course, you can replace any single value in .reshape() with -1 to "make it more generic" or more intuitive. For instance:

np.rollaxis(a.reshape(2, 3, 3, 2, -1), 0, 3)
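
One way to check these against the constraints in the question is sketched below. It uses a stand-in array built with `np.arange` (an assumption made here so that every value is unique, not the question's data) and reads the question's listing as the rule `new_a[i, j, k] == a[i + 3*k, j]`.

```python
import numpy as np

# Stand-in data with the same (6, 3, 2, 3) shape as the question's array;
# unique values make any misplaced element easy to spot.
a = np.arange(6 * 3 * 2 * 3).reshape(6, 3, 2, 3)

# Split the leading axis of length 6 into (2, 3), then move the new
# length-2 axis to position 2.
new_a = np.moveaxis(a.reshape(2, 3, 3, 2, 3), 0, 2)
print(new_a.shape)

# The question's listing amounts to new_a[i, j, k] == a[i + 3*k, j].
for i in range(3):
    for j in range(3):
        for k in range(2):
            assert np.array_equal(new_a[i, j, k], a[i + 3 * k, j])

# One of the alternative spellings from the list above, for comparison.
alt = a.reshape(2, 3, 3, 2, 3).transpose(1, 2, 0, 3, 4)
print(np.array_equal(new_a, alt))
```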

joanis · Answer 4 · 2021-09-01T14:55:32.857

numpy.reshape combined with zip can do what you want, but it's a bit wonky:

>>> a = np.array([[[1,2],[3,4]], [[5,6],[7,8]]])
>>> b = np.array(list(zip(a[0], a[1])))
>>> np.reshape(b, (2,4))
array([[1, 2, 5, 6],
       [3, 4, 7, 8]])

The challenge is that you're transposing the first and second dimensions of a, which is not what np.transpose does by default (without an explicit axis order it reverses all axes). However, zip does that, effectively.

edited Sep 01 '21 at 14:55

answered Sep 01 '21 at 14:47

score 0 · Answer 5 · answered Sep 01 '21 at 15:00

You can use hstack and vstack.

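    import numpy as np

    a = np.array([[[1,2],[3,4]], [[5,6],[7,8]]])
    s0, s1 = range(a.shape[0])  # this unpacking assumes exactly two sub-arrays

    # hstack joins the two (2,2) blocks side by side;
    # vstack then leaves the 2-d result unchanged
    x = np.vstack(np.hstack((a[s0], a[s1])))
    print(x)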

Output

[[1 2 5 6]
 [3 4 7 8]]

Tanvir

`np.hstack(a)` is all you need. It treats the arrays as a list of 2 (2,2) arrays. – hpaulj Sep 01 '21 at 15:42

AcaNg · Accepted Answer · 2021-09-02T09:43:25.347

From your new update, you can do the following using np.lib.stride_tricks.as_strided:

>>> np.lib.stride_tricks.as_strided(a, shape=(3,3,2,2,3), strides=(72,24,216,12,4))
array([[[[[ 10,  20,  30],
          [ 40,  40,  20]],

         [[ 20,  40,  60],
          [ 80,  80,  40]]],


        [[[ 22,  44,  66],
          [ 88,  88,  44]],

         [[ 44,  88, 132],
          [176, 176,  88]]],


        [[[ 33,  66,  99],
          [132, 132,  66]],

         [[ 66, 132, 198],
          [264, 264, 132]]]],



       [[[[ 22,  44,  66],
          [ 88,  88,  44]],

         [[ 44,  88, 132],
          [176, 176,  88]]],


        [[[ 54, 108, 162],
          [216, 216, 108]],

         [[108, 216, 324],
          [432, 432, 216]]],


        [[[ 23,  46,  69],
          [ 92,  92,  46]],

         [[ 46,  92, 138],
          [184, 184,  92]]]],



       [[[[ 14,  28,  42],
          [ 56,  56,  28]],

         [[ 28,  56,  84],
          [112, 112,  56]]],


        [[[ 25,  50,  75],
          [100, 100,  50]],

         [[ 50, 100, 150],
          [200, 200, 100]]],


        [[[ 33,  66,  99],
          [132, 132,  66]],

         [[ 66, 132, 198],
          [264, 264, 132]]]]])

Explanation:

Take another example: a small array q and our desired output after changing q:

>>> q = np.arange(12).reshape(4,3,-1)
>>> q
array([[[ 0],
        [ 1],
        [ 2]],

       [[ 3],
        [ 4],
        [ 5]],

       [[ 6],
        [ 7],
        [ 8]],

       [[ 9],
        [10],
        [11]]])
# desired output:
# shape = (2, 3, 2)
array([[[ 0,  6],
        [ 1,  7],
        [ 2,  8]],

       [[ 3,  9],
        [ 4, 10],
        [ 5, 11]]])

Here we are using numpy strides to achieve this. Let's check for q's strides:

>>> q.strides
(12, 4, 4)

In our output, all strides should remain the same except the third, because in the third dimension we need to stack with the values from the bottom half of q, i.e. 6 is put next to 0, 7 next to 1, and so on.

So, how "far" is it from 0 to 6? In other words, how far is it from q[0,0,0] to q[2,0,0]?

# obviously, distance = [2,0,0] - [0,0,0] = [2,0,0]
bytedistance = np.sum(np.array([2,0,0])*q.strides)
# 2*12 + 0*4 + 0*4 = 24 bytes

So new_strides = (12, 4, 24), and hence we get:

>>> np.lib.stride_tricks.as_strided(q, shape=(2,3,2), strides=new_strides)
array([[[ 0,  6],
        [ 1,  7],
        [ 2,  8]],

       [[ 3,  9],
        [ 4, 10],
        [ 5, 11]]])
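
The byte counts above correspond to 4-byte integers; on a platform where `np.arange` defaults to 8-byte integers, `q.strides` would be (24, 8, 8) instead. A minimal sketch that derives the new strides from `q.strides` rather than hard-coding them might look like the following (the names `jump`, `out` and `safe` are illustrative). It also compares against a reshape/transpose route that needs no manual stride arithmetic.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

q = np.arange(12).reshape(4, 3, 1)

# Byte distance from q[0, 0, 0] to q[2, 0, 0], taken from q's own strides
# so that it does not depend on the platform's integer size.
jump = 2 * q.strides[0]
new_strides = (q.strides[0], q.strides[1], jump)

# writeable=False guards against accidental writes into q's memory.
out = as_strided(q, shape=(2, 3, 2), strides=new_strides, writeable=False)
print(out)

# Same result without as_strided: split axis 0 into (2, 2) halves and
# move the "which half" index to the last position.
safe = q.reshape(2, 2, 3).transpose(1, 2, 0)
print(np.array_equal(out, safe))
```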

Back to your question:

# a.strides is (72,24,12,4) when a holds 4-byte integers
new_strides = (72,24,216,12,4)     # why is 216 here ? it's a homework :)
new_a = np.lib.stride_tricks.as_strided(a, shape=(3,3,2,2,3), strides=new_strides)
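
For readers who want to check the 216, one possible reading is sketched below: the new axis of length 2 selects between the first three and the last three big blocks, which is a jump of three steps along the original axis 0. The array here is a stand-in built with `np.arange` and an explicit `int32` dtype (an assumption made to reproduce the 4-byte strides quoted above), not the question's data.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Stand-in array with the question's shape; int32 gives 4-byte elements.
a = np.arange(6 * 3 * 2 * 3, dtype=np.int32).reshape(6, 3, 2, 3)
print(a.strides)

s0, s1, s2, s3 = a.strides
# The new length-2 axis jumps three blocks along the original axis 0.
new_strides = (s0, s1, 3 * s0, s2, s3)
print(new_strides)

new_a = as_strided(a, shape=(3, 3, 2, 2, 3), strides=new_strides,
                   writeable=False)

# Cross-check against the reshape/moveaxis route, which needs no manual strides.
expected = np.moveaxis(a.reshape(2, 3, 3, 2, 3), 0, 2)
print(np.array_equal(new_a, expected))
```

Deriving the strides from `a.strides` instead of typing the numbers keeps the snippet valid if the dtype changes, which matters because wrong strides can read outside the array's memory.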

Your answer rocks. I will accept it as the most generalizable solution in my case. — CoolGas, Sep 02 '21 at 10:17
@CoolGas I recommend you to read through the doc in the link above: `as_strided` is considered dangerous if used wrongly. You need to carefully calculate the strides, and if you don't, your program may crash. — AcaNg, Sep 02 '21 at 10:21
Yea, totally agreed. The numpy page does stress the fact that this method should only be used in extreme case. Do appreciate that you pointed it out. — CoolGas, Sep 02 '21 at 10:29
