5

This is a sample of what I am trying to accomplish. I am very new to Python and have searched for hours to find out what I am doing wrong, but I haven't been able to find the issue. I am still new enough that I may be searching for the wrong phrases. If so, could you please point me in the right direction?

I want to combine n arrays into one array. The first row of x should be the first row of combined, the first row of y the second row, the first row of z the third row, the second row of x the fourth row, and so on, so it would look something like this:

x = [x1 x2 x3]
    [x4 x5 x6]
    [x7 x8 x9]

y = [y1 y2 y3]
    [y4 y5 y6]
    [y7 y8 y9]

z = [z1 z2 z3]
    [z4 z5 z6]
    [z7 z8 z9]

combined = [x1 x2 x3]
           [y1 y2 y3]
           [z1 z2 z3]
           [x4 x5 x6]
           [...]
           [z7 z8 z9]

The best I can come up with is the following:

import numpy as np

x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)

combined = np.zeros((9,3))

for rows in range(len(x)):        
    combined[0::3] = x[rows,:] 
    combined[1::3] = y[rows,:]
    combined[2::3] = z[rows,:]


print(combined)

All this does is write the last value of the input array to every third row in the output array instead of what I wanted. I am not sure if this is even the best way to do this. Any advice would help out.

I just figured out that this works, but if someone knows a higher-performance method, please let me know.

import numpy as np

x = np.random.rand(6,3) 
y = np.random.rand(6,3) 
z = np.random.rand(6,3)

combined = np.zeros((18,3))

for rows in range(6):        
  combined[rows*3,:] = x[rows,:] 
  combined[rows*3+1,:] = y[rows,:]
  combined[rows*3+2,:] = z[rows,:]

print(combined)
Ash Sharma
chad jensen

5 Answers

4

You can do this using a list comprehension and zip:

combined = np.array([row for row_group in zip(x, y, z) for row in row_group])
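To see what `row_group` is in that comprehension, it may help to unroll it into explicit loops. This is only an illustrative sketch: the small integer inputs are made up for readability and are not the question's data.

```python
import numpy as np

# Small hypothetical inputs so the row order is easy to read.
x = np.arange(6).reshape(2, 3)        # rows [0 1 2] and [3 4 5]
y = x + 10
z = x + 100

# zip(x, y, z) yields one tuple per row index: (x[i], y[i], z[i]).
unrolled = []
for row_group in zip(x, y, z):        # row_group is a tuple of three 1-D rows
    for row in row_group:             # row is a single row taken from x, y or z
        unrolled.append(row)

combined = np.array(unrolled)

# The one-line comprehension builds the same list in the same order.
one_liner = np.array([row for row_group in zip(x, y, z) for row in row_group])
print(combined)
```

So `row_group` holds the rows that share an index across the inputs, and `row` walks through them in x, y, z order.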
Jundiaius
  • You're a life saver. I understand row is the iterative value referring to row count, I am not sure what row_group is iterating in this code though. – chad jensen Jun 25 '18 at 08:45
  • Performance note: `zip` with NumPy arrays isn't efficient, e.g. see [this answer](https://stackoverflow.com/a/50399219/9209546). – jpp Jun 25 '18 at 08:57
  • I used timeit for both. I got 576 us for the zip and 589 us with the for loop close enough to the same time that I would consider them the same in my application. At this point it is more in the interest of learning more about python, JPP do you have an alternate bit of code I can test? – chad jensen Jun 25 '18 at 09:31
  • @chadjensen, For small arrays, sure, it's irrelevant. For larger arrays, you should aim for a vectorised solution. I offered one in my answer. – jpp Jun 25 '18 at 09:42
  • This is a smart and beautiful one! – Cyryl1972 May 07 '21 at 06:52
3

Using vectorised operations only:

A = np.vstack((x, y, z))
idx = np.arange(A.shape[0]).reshape(-1, x.shape[0]).T.flatten()

A = A[idx]

Here's a demo:

import numpy as np

x, y, z = np.random.rand(3,3), np.random.rand(3,3), np.random.rand(3,3)

print(x, y, z)

[[ 0.88259564  0.17609363  0.01067734]
 [ 0.50299357  0.35075811  0.47230915]
 [ 0.751129    0.81839586  0.80554345]]
[[ 0.09469396  0.33848691  0.51550685]
 [ 0.38233976  0.05280427  0.37778962]
 [ 0.7169351   0.17752571  0.49581777]]
[[ 0.06056544  0.70273453  0.60681583]
 [ 0.57830566  0.71375038  0.14446909]
 [ 0.23799775  0.03571076  0.26917939]]

A = np.vstack((x, y, z))
idx = np.arange(A.shape[0]).reshape(-1, x.shape[0]).T.flatten()

print(idx)  # [0 3 6 1 4 7 2 5 8]

A = A[idx]

print(A)

[[ 0.88259564  0.17609363  0.01067734]
 [ 0.09469396  0.33848691  0.51550685]
 [ 0.06056544  0.70273453  0.60681583]
 [ 0.50299357  0.35075811  0.47230915]
 [ 0.38233976  0.05280427  0.37778962]
 [ 0.57830566  0.71375038  0.14446909]
 [ 0.751129    0.81839586  0.80554345]
 [ 0.7169351   0.17752571  0.49581777]
 [ 0.23799775  0.03571076  0.26917939]]
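The index array can also be built one step at a time, which may make the pattern easier to follow. The sketch below deliberately uses three arrays of two rows each (hypothetical sizes) so that the number of arrays and the number of rows are not confused with each other.

```python
import numpy as np

n_rows = 2                            # hypothetical: each input has two rows
x = np.arange(6).reshape(n_rows, 3)
y = x + 10
z = x + 100

A = np.vstack((x, y, z))              # rows ordered x0 x1 y0 y1 z0 z1
step1 = np.arange(A.shape[0])         # [0 1 2 3 4 5]
step2 = step1.reshape(-1, n_rows)     # [[0 1] [2 3] [4 5]]: one line per input array
step3 = step2.T                       # [[0 2 4] [1 3 5]]: one line per output group
idx = step3.flatten()                 # [0 2 4 1 3 5]

result = A[idx]                       # rows ordered x0 y0 z0 x1 y1 z1
print(idx)
print(result)
```

After the transpose, each line of `step3` lists the positions in `A` of the rows that belong together in the output, so flattening it gives the interleaved order.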
jpp
  • Slightly faster than zip at 576 us per loop. This is harder for a noob like me to understand. I will have to research the parts of the command to see if I can grasp it. – chad jensen Jun 25 '18 at 09:39
  • It's not too complicated. The main thing you need to understand is how `numpy.arange` (get a range as an array), `numpy.reshape` (reshape as a 2d array here), and `flatten` (make into a single-dimension array) combine to give the indices `[0 3 6 1 4 7 2 5 8]`. – jpp Jun 25 '18 at 09:41
  • How about `stack` with axis 1, plus `reshape`? – hpaulj Jun 25 '18 at 09:50
  • Thank you for sharing your knowledge with me. I still have so much to learn and it is great there are people that are willing to help out. – chad jensen Jun 25 '18 at 09:52
  • @hpaulj, Can you expand? (Or, if you like, post a new solution.) – jpp Jun 25 '18 at 09:57
2

I have changed your code a little bit to get the desired output

import numpy as np

x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)

combined = np.zeros((18,3))

combined[0::3] = x
combined[1::3] = y
combined[2::3] = z

print(combined)

You had the shape of the combined matrix wrong and there is no real need for the for loop.
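Since the question asks about n arrays, the same sliced assignment can be wrapped in a small helper. The function name `interleave_rows` is hypothetical, and the sketch assumes all inputs are 2-D arrays of the same shape.

```python
import numpy as np

def interleave_rows(*arrays):
    """Interleave the rows of equally shaped 2-D arrays (hypothetical helper)."""
    n = len(arrays)
    rows, cols = arrays[0].shape
    # np.empty skips zero-filling; every row is overwritten below.
    out = np.empty((n * rows, cols), dtype=np.result_type(*arrays))
    for i, arr in enumerate(arrays):
        out[i::n] = arr               # array i fills rows i, i+n, i+2n, ...
    return out

x = np.arange(6).reshape(2, 3)
y = x + 10
z = x + 100
combined = interleave_rows(x, y, z)
print(combined)
```

The loop runs once per input array rather than once per row, so its cost stays small even when the arrays are long.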

Ash Sharma
0

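This might not be the most pythonic way to do it, but you could loop over the row index and write one block of three rows per iteration:

# Assumes x, y, z and combined = np.zeros((18, 3)) as defined in the question.
for block in range(len(combined) // 3):   # integer division; "/" gives a float in Python 3
    # block i of combined holds row i of x, y and z, in that order
    combined[block*3 + 0] = x[block, :]
    combined[block*3 + 1] = y[block, :]
    combined[block*3 + 2] = z[block, :]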
zar3bski
0

A simple numpy solution is to stack the arrays on a new middle axis, and reshape the result to 2d:

In [5]: x = np.arange(9).reshape(3,3)
In [6]: y = np.arange(9).reshape(3,3)+10
In [7]: z = np.arange(9).reshape(3,3)+100
In [8]: np.stack((x,y,z),axis=1).reshape(-1,3)
Out[8]: 
array([[  0,   1,   2],
       [ 10,  11,  12],
       [100, 101, 102],
       [  3,   4,   5],
       [ 13,  14,  15],
       [103, 104, 105],
       [  6,   7,   8],
       [ 16,  17,  18],
       [106, 107, 108]])

It may be easier to see what's happening if we give each dimension a different size; e.g. two 3x4 arrays:

In [9]: x = np.arange(12).reshape(3,4)
In [10]: y = np.arange(12).reshape(3,4)+10

np.array combines them on a new 1st axis, making a 2x3x4 array. To get the interleaving you want, we can transpose the first 2 dimensions, producing a 3x2x4. Then reshape to a 6x4.

In [13]: np.array((x,y))
Out[13]: 
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[10, 11, 12, 13],
        [14, 15, 16, 17],
        [18, 19, 20, 21]]])
In [14]: np.array((x,y)).transpose(1,0,2)
Out[14]: 
array([[[ 0,  1,  2,  3],
        [10, 11, 12, 13]],

       [[ 4,  5,  6,  7],
        [14, 15, 16, 17]],

       [[ 8,  9, 10, 11],
        [18, 19, 20, 21]]])
In [15]: np.array((x,y)).transpose(1,0,2).reshape(-1,4)
Out[15]: 
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [ 4,  5,  6,  7],
       [14, 15, 16, 17],
       [ 8,  9, 10, 11],
       [18, 19, 20, 21]])

np.vstack produces a 6x4, but with the wrong order. We can't transpose that directly.
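If the data is already in the vstack layout, one possible workaround is to reshape it back to 3-D first, which recovers the same 2x3x4 array as np.array((x, y)), and then apply the transpose and reshape shown above. A minimal sketch using the same x and y:

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
y = np.arange(12).reshape(3, 4) + 10

stacked = np.vstack((x, y))                    # shape (6, 4): all of x, then all of y
as_3d = stacked.reshape(2, 3, 4)               # same layout as np.array((x, y))
interleaved = as_3d.transpose(1, 0, 2).reshape(-1, 4)
print(interleaved)
```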

np.stack with default axis behaves just like np.array. But with axis=1, it creates a 3x2x4, which we can reshape:

In [16]: np.stack((x,y), 1)
Out[16]: 
array([[[ 0,  1,  2,  3],
        [10, 11, 12, 13]],

       [[ 4,  5,  6,  7],
        [14, 15, 16, 17]],

       [[ 8,  9, 10, 11],
        [18, 19, 20, 21]]])

The zip in the accepted answer is a list version of transpose, creating a list of three 2-element tuples.

In [17]: list(zip(x,y))
Out[17]: 
[(array([0, 1, 2, 3]), array([10, 11, 12, 13])),
 (array([4, 5, 6, 7]), array([14, 15, 16, 17])),
 (array([ 8,  9, 10, 11]), array([18, 19, 20, 21]))]

np.array(list(zip(x,y))) produces the same thing as the stack, a 3x2x4 array.


As for speed, I suspect the allocate-and-assign approach (as in Ash's answer) is fastest:

In [27]: z = np.zeros((6,4),int)
    ...: for i, arr in enumerate((x,y)):
    ...:     z[i::2,:] = arr
    ...:     
In [28]: z
Out[28]: 
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [ 4,  5,  6,  7],
       [14, 15, 16, 17],
       [ 8,  9, 10, 11],
       [18, 19, 20, 21]])

For serious timings, use much larger examples than this.
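A rough way to compare the approaches from this thread on larger inputs is sketched below with the standard `timeit` module. The array size (2000 rows) and repeat count are arbitrary assumptions, the function names are made up for this example, and the numbers will vary by machine, so no expected timings are shown.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
x, y, z = (rng.random((2000, 3)) for _ in range(3))   # hypothetical test size

def with_zip():
    return np.array([row for group in zip(x, y, z) for row in group])

def with_stack():
    return np.stack((x, y, z), axis=1).reshape(-1, 3)

def with_slices():
    out = np.empty((3 * len(x), 3))
    out[0::3], out[1::3], out[2::3] = x, y, z
    return out

# All three must agree before their timings are worth comparing.
assert np.array_equal(with_zip(), with_stack())
assert np.array_equal(with_stack(), with_slices())

timings = {}
for fn in (with_zip, with_stack, with_slices):
    timings[fn.__name__] = timeit.timeit(fn, number=20) / 20
    print(f"{fn.__name__}: {timings[fn.__name__] * 1e6:.0f} us per call")
```

Increasing the row count should make any difference between the Python-level loop and the vectorised versions more visible.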

hpaulj