
I'm looking for the most computationally efficient way to reshape/restack a large 2D numpy array into a 3D structure. The general problem is as follows:


  • Consider a (large) array with dimensions m (rows) x 3n (columns). I want to 'cut' the array every n columns and place each slice along the third dimension. This could be the case, e.g., when you have multiple realisations of variables v_1, v_2, v_3, ..., v_n across multiple time-steps and they are stitched together side by side. The problem could be generalized by cutting every k columns (user-defined), or by cutting along either axis (row-wise or column-wise, to account for hstacked or vstacked 2D arrays).

Is there a simple way to do this, e.g. using numpy or xarray? I've looked into some answers, which either feature np.concatenate or np.dstack, but couldn't make a neat simple function out of it.

Simple toy data for this problem, using IPython:

import numpy as np
import pandas as pd
a = np.random.normal(size=12).reshape(4,3) # loc = 0 , scale = 1
b = np.random.normal(loc=10, scale=1, size=12).reshape(4,3)
c = np.random.normal(loc=100, scale=1, size=12).reshape(4,3)

data = np.concatenate((a,b,c), axis=1)
data_table = pd.DataFrame(data) #just to visualize data
display(data_table)


The expected outcome of a well-constructed 3D array:

print(out[:,:,1])

[[10.21199864  9.08363379 10.55152419]
 [10.14010741  8.81101692  9.96865292]
 [10.20170553  9.89587058  9.42856842]
 [10.11030506 10.32314315 10.2727249 ]]

print(out[0,0,:])  # "carrot" slice: the [0,0] element across the third dimension

[-9.81457189e-02  1.02119986e+01  9.94245724e+01]
dbouz
  • The proposed solutions in the duplicate topic don't produce the desired result. Instead, I've found another solution using `np.split()` and `np.dstack()`. Still curious if it can be done with `xarray` though. – dbouz Dec 23 '19 at 15:28
  • The accepted answer talks about producing a 3D array output at the end of it. Doesn't that answer your question? If it doesn't, could you share the expected output for a minimal sample data? – Divakar Dec 23 '19 at 16:02
  • @Divakar added some expected output – dbouz Dec 23 '19 at 16:39
  • First off, the linked post expects array input. So, to use the linked answer(s), you need to use the array data, which would be `data_table.values`. Are you using that one? From the posted error message, it seems you are not using array data. Secondly, for reproducible sample data, set a constant seed, something like `np.random.seed(0)`, before using `np.random.normal` to generate the sample data. – Divakar Dec 23 '19 at 16:44
  • `np.stack([a,b,c], axis=2)`. Or if the blocks have already been concatenated, reshape and transpose `data.reshape(4,n,3).transpose(0,2,1)` (to move the `n` dimension to the end). – hpaulj Dec 23 '19 at 18:06
  • Duplicate of [`Flatten or group array in blocks of columns - NumPy / Python`](https://stackoverflow.com/questions/58101239). – Divakar Dec 25 '19 at 06:31
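
For reference, the reshape-and-transpose idea from the comments can be sketched as a small function. This is only a sketch under stated assumptions: the name `restack` and its parameters `k` (block width) and `axis` are hypothetical, the input is assumed to be a 2D array whose length along `axis` is an exact multiple of `k`, and the toy data is seeded so the checks are reproducible. The checks compare the result with `np.stack` on the original blocks and with the `np.split()` plus `np.dstack()` route mentioned above.

```python
import numpy as np


def restack(data, k, axis=1):
    """Cut a 2D array into consecutive blocks of k columns (axis=1) or
    k rows (axis=0) and place the blocks along a new third axis.

    Hypothetical helper, not part of NumPy.
    """
    data = np.asarray(data)
    if data.ndim != 2:
        raise ValueError("expected a 2D array")
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1")
    if data.shape[axis] % k != 0:
        raise ValueError("length along axis must be a multiple of k")

    n_blocks = data.shape[axis] // k
    if axis == 1:
        m = data.shape[0]
        # (m, n_blocks * k) -> (m, n_blocks, k) -> (m, k, n_blocks)
        return data.reshape(m, n_blocks, k).transpose(0, 2, 1)
    n = data.shape[1]
    # (n_blocks * k, n) -> (n_blocks, k, n) -> (k, n, n_blocks)
    return data.reshape(n_blocks, k, n).transpose(1, 2, 0)


# Seeded toy data: three 4x3 blocks with different means.
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 3))
b = rng.normal(loc=10, scale=1, size=(4, 3))
c = rng.normal(loc=100, scale=1, size=(4, 3))

data = np.concatenate((a, b, c), axis=1)    # hstacked, shape (4, 9)
out = restack(data, k=3)                    # shape (4, 3, 3)

# Same result as stacking the original blocks directly ...
same_as_stack = np.array_equal(out, np.stack([a, b, c], axis=2))
# ... and as splitting into 3 blocks followed by dstack.
same_as_split = np.array_equal(out, np.dstack(np.split(data, 3, axis=1)))

# Row-wise variant for vstacked blocks, shape (12, 3) -> (4, 3, 3).
vdata = np.concatenate((a, b, c), axis=0)
vout = restack(vdata, k=4, axis=0)

print(out.shape, same_as_stack, same_as_split, vout.shape)
```

Here `out[:, :, 1]` is block `b`, and `out[0, 0, :]` collects the `[0, 0]` element of each block. For a C-contiguous input such as the result of `np.concatenate`, the reshape and the transpose both return views, so no data is copied until the result is modified or made contiguous; that is likely the main efficiency advantage over the split-and-stack route, which builds a new array.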
