
I want to split an array into groups of n columns and stack each group as new rows at the end of the array. I tried to do it with reshape but couldn't get it to work. For example, given the table

table = np.array([[11,12,13,14,15,16],
                  [21,22,23,24,25,26],
                  [31,32,33,34,35,36],
                  [41,42,43,44,45,46]])

my output if n=2 should be:

array([[11, 12], 
       [21, 22], 
       [31, 32], 
       [41, 42], 
       [13, 14], 
       [23, 24], 
       [33, 34], 
       [43, 44], 
       [15, 16], 
       [25, 26], 
       [35, 36], 
       [45, 46]])

and if n=3:

array([[11, 12, 13], 
       [21, 22, 23], 
       [31, 32, 33], 
       [41, 42, 43], 
       [14, 15, 16], 
       [24, 25, 26],
       [34, 35, 36], 
       [44, 45, 46]])

Update:

OK, I managed to get the result I want with the following commands:

import numpy as np

n=2
np.concatenate(np.split(table, table.shape[1]//n, axis=1), axis=0)
n=3
np.concatenate(np.split(table, table.shape[1]//n, axis=1), axis=0)

I don't know, though, whether this can be done with reshape.

ttsesm
  • The best way to answer if `arr1` can be reshaped into `arr2` is to check if `arr1.flatten()` is equal to `arr2.flatten()`. Clearly not in your case. So if you expect to use a single `reshape(...)` method, you'll need to use additionally something else (it might be double usage of `reshape(..)` in some scenarios) – mathfux Oct 20 '20 at 12:35
  • Well in the beginning I was trying to figure out whether I could use the solutions from this thread https://stackoverflow.com/questions/55444777/numpy-array-stack-multiple-columns-into-one-using-reshape where they also use `flatten()` but apparently as you mention most likely it is not possible. – ttsesm Oct 20 '20 at 12:39
  • Well, at least you can apply some additional methods such as `np.transpose`, `np.squeeze`, `np.swapaxes`. You can also check a nice example of [`maxpooling`](https://stackoverflow.com/a/42463514/3044825) which is quite close to your problem. I'll try to adapt it to your problem. – mathfux Oct 20 '20 at 12:49
  • Instead of using concatenate you could use np.row_stack: np.row_stack(np.split(table, table.shape[1]/n, axis=1)). Admittedly, the only benefit of this is no need to specify axis. It's just one more potential avenue to use. Personally, I see no compelling need to use reshape here, what you have is fine. – tnknepp Oct 20 '20 at 12:54
  • Yup, most likely you are right. The point was that I related it to `reshape` because that was the first command that came to mind, but as you say, what I have is likely good enough. – ttsesm Oct 20 '20 at 12:57
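
The flatten check suggested in the comments can be sketched as follows. This is only an illustration for the n=2 case; the variable names `expected` and `same_order` are chosen here and are not from the thread.

```python
import numpy as np

table = np.array([[11, 12, 13, 14, 15, 16],
                  [21, 22, 23, 24, 25, 26],
                  [31, 32, 33, 34, 35, 36],
                  [41, 42, 43, 44, 45, 46]])

# Desired output for n=2, built with the split/concatenate approach.
expected = np.concatenate(np.split(table, 3, axis=1), axis=0)

# A single reshape never reorders elements, so it can only work
# if both arrays list their elements in the same flattened order.
same_order = np.array_equal(table.flatten(), expected.flatten())
print(same_order)               # False
print(table.flatten()[:6])      # [11 12 13 14 15 16]
print(expected.flatten()[:6])   # [11 12 21 22 31 32]
```

Because the flattened orders differ, a lone `reshape` cannot produce the desired output; some axis reordering is needed as well.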

1 Answer


OP offers a solution:

np.concatenate(np.split(table, table.shape[1]//n, axis=1), axis=0)

It appears to be inefficient because np.split turns the data into a list of arrays, which the outer call then has to iterate over. Moreover, np.concatenate is not that efficient either. These functions are best suited to cases where the items of the list have different lengths.

My solution is like so:

np.vstack(table.reshape(4,3,2).swapaxes(0,1))
#np.vstack(table.reshape(4,2,3).swapaxes(0,1)) #for second case
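
As a quick sanity check, the sketch below compares the reshape/swapaxes approach with the split/concatenate approach on the table from the question. The dictionary names `shapes` and `results` are introduced here only for illustration.

```python
import numpy as np

table = np.array([[11, 12, 13, 14, 15, 16],
                  [21, 22, 23, 24, 25, 26],
                  [31, 32, 33, 34, 35, 36],
                  [41, 42, 43, 44, 45, 46]])

# Hard-coded 3D shapes for the two cases: (rows, number of blocks, n).
shapes = {2: (4, 3, 2), 3: (4, 2, 3)}
results = {}

for n, shape in shapes.items():
    via_reshape = np.vstack(table.reshape(shape).swapaxes(0, 1))
    via_split = np.concatenate(np.split(table, table.shape[1] // n, axis=1), axis=0)
    # Both approaches should give identical arrays.
    assert np.array_equal(via_reshape, via_split)
    results[n] = via_reshape

print(results[2].shape)  # (12, 2)
print(results[3].shape)  # (8, 3)
```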

Let's check if my prediction of performance was correct:

%timeit np.vstack(table.reshape(4,3,2).swapaxes(0,1))
%timeit np.concatenate(np.split(table, table.shape[1]//2, axis=1), axis=0)

Output:

22.4 µs ± 2.72 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
42.8 µs ± 9.09 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

I found this solution with the help of numpyviz (disclaimer: I'm its author). Here is a visualization that helps to understand the process better:

[Image: numpyviz diagrams]

In general, you need to choose a shape such that the blocks preserve their width and height. The visualization hints that axis=0 and axis=2 of the new shape correspond to table.shape[0] and n respectively. The length of the middle axis is the number of blocks, which is table.shape[1]//n. So you can pass the arguments like so:

table.reshape(table.shape[0], table.shape[1]//n, n)

or, more simply:

table.reshape(table.shape[0], -1, n)
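
Combined with the swapaxes step from above, a generalized version could be sketched as follows. The function name `stack_column_blocks` and the divisibility check are hypothetical additions for illustration, and the final `reshape(-1, n)` is used here in place of `np.vstack` to merge the first two axes.

```python
import numpy as np

def stack_column_blocks(table, n):
    """Stack consecutive blocks of n columns of a 2D array underneath each other."""
    rows, cols = table.shape
    if cols % n != 0:
        raise ValueError(f"number of columns ({cols}) is not divisible by n={n}")
    # (rows, blocks, n): cut every row into blocks of n columns.
    blocks = table.reshape(rows, -1, n)
    # (blocks, rows, n): make the block index the leading axis.
    blocks = blocks.swapaxes(0, 1)
    # (blocks * rows, n): merge the first two axes into one.
    return blocks.reshape(-1, n)

table = np.array([[11, 12, 13, 14, 15, 16],
                  [21, 22, 23, 24, 25, 26],
                  [31, 32, 33, 34, 35, 36],
                  [41, 42, 43, 44, 45, 46]])

print(stack_column_blocks(table, 2).shape)  # (12, 2)
print(stack_column_blocks(table, 3).shape)  # (8, 3)
```

Note that the last reshape has to copy the data, since the swapped array is no longer contiguous in memory.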
mathfux
  • Interesting approach, thanks @mathfux. One question though, how do you specify the first two dimensions in the reshape if I consider that my desired split `n` value is always my 3rd dim, e.g. `reshape(x, y, n)` and `reshape(x, y, n)` respectively for each case scenario? Btw the `numpyviz` seems quite cool, nice work. – ttsesm Oct 20 '20 at 14:26
  • @ttsesm I've noticed a typo in the name of the third figure; I'm going to correct it and add an update. The arguments of `reshape(x, y, n)` correspond to the lengths of `axis0`, `axis1` and `axis2` in my diagrams. – mathfux Oct 20 '20 at 14:32
  • So this means that they could be replaced with `reshape(table.shape[0], -1, n)`? I am asking so that the solution can be generalized. – ttsesm Oct 20 '20 at 14:59
  • @ttsesm You're right. This is the same as what I've posted in my updated answer. – mathfux Oct 20 '20 at 15:03