
As the title suggests, I am trying to find an optimal (and ideally Pythonic) way of splitting a one-dimensional NumPy array twice into irregular fragments, under the following conditions. The first split produces n fragments, one for each length l stored in the LSHAPE array. The second split divides each of those n fragments regularly into m arrays, where the values of m are stored in the MSHAPE array and the i-th m matches the i-th l. In the code below, the i-th fragment of the first split holds l*m elements, so each of its m pieces has length l. To illustrate the problem, here is the solution I have found so far, which uses NumPy's split function:

import numpy as np

# Define arrays (n = 3 in this example)

LSHAPE = np.array([5, 8, 3])
MSHAPE = np.array([4, 5, 2])

# Generate a random 1D array of the required length

LM_SHAP = np.sum(np.multiply(LSHAPE, MSHAPE))
REFDAT = np.random.uniform(-1, 1, size=LM_SHAP)

# Split the array twice (this is my solution so far)

SLICE_L = np.split(REFDAT, np.cumsum(np.multiply(LSHAPE, MSHAPE)))[0:-1]
SLICE_L_M = []
for idx, mfrags in enumerate(SLICE_L):
    SLICE_L_M.append(np.split(mfrags, MSHAPE[idx]))

In the code above, a random test array (REFDAT) is created to meet the requirements of the problem and is then split. The results are stored in the SLICE_L_M list. This solution works, but I think it is hard to read and possibly inefficient, so I would like to know whether it can be improved. I have read some Stack Overflow threads related to this one (like this one and this one), but I think my problem is slightly different. Thanks in advance for your help and time.

Edit:

Using a list comprehension gives an average CPU time improvement of about 3%:

SLICE_L = np.split(REFDAT, np.cumsum(np.multiply(LSHAPE, MSHAPE)))[0:-1]
SLICE_L_M = [np.split(lval, mval) for lval, mval in zip(SLICE_L, MSHAPE)]
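
Since the second split is regular, one possible simplification is to replace the inner np.split with a reshape. The following is only a sketch under the assumptions of the example above (each first-level block holds exactly m pieces of length l); the names sizes and blocks are introduced here for illustration and are not part of the original code. Dropping the last cumulative sum also avoids the trailing empty array, so the [0:-1] is no longer needed.

```python
import numpy as np

LSHAPE = np.array([5, 8, 3])
MSHAPE = np.array([4, 5, 2])

# Number of elements in each first-level block (l * m for every pair)
sizes = LSHAPE * MSHAPE
REFDAT = np.random.uniform(-1, 1, size=sizes.sum())

# Split points are the cumulative sizes without the last one,
# so np.split does not return a trailing empty array.
blocks = np.split(REFDAT, np.cumsum(sizes)[:-1])

# Each block holds m pieces of length l, so reshaping to (m, l)
# exposes the second split as the rows of a 2D view of the data.
SLICE_L_M = [block.reshape(m, l) for block, m, l in zip(blocks, MSHAPE, LSHAPE)]
```

Here SLICE_L_M[i][j] is the j-th piece of the i-th block, the same indexing as with the nested lists, but each entry of SLICE_L_M is a 2D array instead of a list of 1D arrays. Whether this reads better is a matter of taste, and it only applies because the inner split is regular.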
panadestein
  • `np.split` via `np.array_split` does an `alist.append(arr[st:end])` in a loop. There's some added complexity because it can split on different axes, but at its core it's just a repeated slicing. As long as the fragments are irregular you can't do anything more efficient. – hpaulj Jun 08 '20 at 16:11
  • Thanks a lot for the explanation. Still, the code I wrote looks very cumbersome, not obvious. Maybe this is all one can do, but I am curious if a more clever/simpler implementation exists. – panadestein Jun 08 '20 at 19:24
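
The repeated slicing described in the first comment can be written out by hand. This is a minimal sketch, not NumPy's actual implementation: split_twice is a hypothetical helper named here for illustration, and it assumes the data length equals the sum of l*m over all pairs, as in the question.

```python
import numpy as np

LSHAPE = np.array([5, 8, 3])
MSHAPE = np.array([4, 5, 2])
REFDAT = np.random.uniform(-1, 1, size=int(np.sum(LSHAPE * MSHAPE)))


def split_twice(data, lengths, counts):
    """Hypothetical helper: return a nested list of 1D slices of data.

    For every (l, m) pair, take m consecutive slices of length l.
    """
    result = []
    start = 0
    for l, m in zip(lengths, counts):
        pieces = []
        for _ in range(m):
            # Basic slicing returns a view, so no data is copied here
            pieces.append(data[start:start + l])
            start += l
        result.append(pieces)
    return result


SLICE_L_M = split_twice(REFDAT, LSHAPE, MSHAPE)
```

The result has the same nested-list layout as the np.split version in the question. It is longer, but it makes the two levels of the split explicit in a single pass over the data.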
