As the title of the question suggests, I am trying to find an optimal (and possibly pythonic) way of splitting a one-dimensional numpy array several times into irregular fragments, under the following conditions: the first split produces n fragments whose lengths l are contained in the LSHAPE array; the second split acts on each of the n previous fragments, but now each one is split regularly into m arrays. The corresponding values of m are stored in the MSHAPE array, such that the i-th m matches the i-th l. To best illustrate my problem, I include the solution I have found so far, which makes use of the numpy split function:
import numpy as np
# Define arrays (n = 3 in this example)
LSHAPE = np.array([5, 8, 3])
MSHAPE = np.array([4, 5, 2])
# Generate a random 1D array of the required length
LM_SHAP = np.sum(np.multiply(LSHAPE, MSHAPE))
REFDAT = np.random.uniform(-1, 1, size=LM_SHAP)
# Split twice the array (this is my solution so far)
SLICE_L = np.split(REFDAT, np.cumsum(np.multiply(LSHAPE, MSHAPE)))[0:-1]
SLICE_L_M = []
for idx, mfrags in enumerate(SLICE_L):
    SLICE_L_M.append(np.split(mfrags, MSHAPE[idx]))
In the code above, a random test array (REFDAT) is created to fulfill the requirements of the problem and is then split. The results are stored in the SLICE_L_M list. This solution works, but I think it is hard to read and possibly inefficient, so I would like to know whether it can be improved. I have read some Stack Overflow threads related to this one (like this one and this one), but I think my problem is slightly different. Thanks in advance for your help and time.
Edit:
One can gain an average CPU time improvement of about 3% if a list comprehension is used:
SLICE_L = np.split(REFDAT, np.cumsum(np.multiply(LSHAPE, MSHAPE)))[0:-1]
SLICE_L_M = [np.split(lval, mval) for lval, mval in zip(SLICE_L, MSHAPE)]
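Another variation, which I have not benchmarked: since the second split is regular, each outer fragment could be reshaped into a 2D array of shape (m, l) instead of being split again, so that row j plays the role of the j-th sub-array. The sketch below assumes that a 2D array per fragment is an acceptable replacement for a list of m 1D arrays. The names SIZES and BOUNDS and the seeded generator are introduced here only for illustration.

```python
import numpy as np

# Same example shapes as above (n = 3)
LSHAPE = np.array([5, 8, 3])
MSHAPE = np.array([4, 5, 2])

# Length of each outer fragment (l * m); SIZES is a name used only in this sketch
SIZES = LSHAPE * MSHAPE

# Seeded generator so that the example is reproducible
rng = np.random.default_rng(0)
REFDAT = rng.uniform(-1, 1, size=SIZES.sum())

# Split points between outer fragments; dropping the last cumulative sum
# means np.split returns no trailing empty array, so no [0:-1] is needed
BOUNDS = np.cumsum(SIZES)[:-1]

# Reshape each outer fragment to (m, l): row j holds the j-th sub-array
SLICE_L_M = [
    frag.reshape(m, l)
    for frag, m, l in zip(np.split(REFDAT, BOUNDS), MSHAPE, LSHAPE)
]

for block in SLICE_L_M:
    print(block.shape)
```

With this layout, SLICE_L_M[i][j] holds the same values as the j-th array of the i-th fragment in the nested-list version.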