6

I have an object occupying an underlying N-dimensional square grid (represented by a numpy array) so that only 25% of the grid points are occupied. Each 1x1x1x... N-cube (i.e., hypercube) in this grid contains the same amount of this object (located only at some of the vertices of this hypercube). I have an array of the coordinates of all the occupied grid points. The task is to cycle through this array and extract the occupied coordinates for each 1x1x1... hypercube and store them in a new array for further processing.

The situation is best explained by example. Consider the 3-D case where the underlying grid has been chosen so that 1<=i,j,k<=4 giving the 2d numpy array: A = [ [1 1 1] [1 2 1] [1 3 1] [1 4 1] [2 1 1] [2 2 1] [2 3 1] [2 4 1] [3 1 1] [3 2 1] [3 3 1] [3 4 1] [4 1 1] [4 2 1] [4 3 1] [4 4 1] [1 1 2] [1 2 2] [1 3 2] [1 4 2] [2 1 2] [2 2 2] [2 3 2] [2 4 2] [3 1 2] [3 2 2] [3 3 2] [3 4 2] [4 1 2] [4 2 2] [4 3 2] [4 4 2] [1 1 3] [1 2 3] [1 3 3] [1 4 3] [2 1 3] [2 2 3] [2 3 3] [2 4 3] [3 1 3] [3 2 3] [3 3 3] [3 4 3] [4 1 3] [4 2 3] [4 3 3] [4 4 3] [1 1 4] [1 2 4] [1 3 4] [1 4 4] [2 1 4] [2 2 4] [2 3 4] [2 4 4] [3 1 4] [3 2 4] [3 3 4] [3 4 4] [4 1 4] [4 2 4] [4 3 4] [4 4 4] ]

An example of the 2d numpy array that I need to process in this case is: B = [[1,1,1],[1,2,1],[1,3,1],[1,4,1],[2,2,1],[2,3,1],[2,4,1],[3,2,1],[3,3,1],[3,4,1],[4,1,1],[4,3,1],[1,1,2],[1,4,2],[2,1,2],[2,2,2],[2,4,2],[3,1,2],[3,2,2],[3,4,2],[4,1,2],[4,2,2],[4,3,2],[4,4,2],[1,1,3],[1,4,3],[2,1,3],[2,2,3],[2,3,3],[2,4,3],[3,1,3],[3,2,3],[3,3,3],[4,1,3],[4,2,3],[4,3,3],[4,4,3],[1,2,4],[1,3,4],[1,4,4],[2,1,4],[2,2,4],[2,3,4],[3,1,4],[3,2,4],[3,3,4],[4,3,4],[4,4,4]]

B is the array of the occupied grid points only. Cycling through B, I would like to get the occupied coordinates pertaining to each of the cubes of the underlying grid. The cube edges are defined by 1<=i,j,k<=2, (3<=i<=4) & (1<=j,k<=2), etc. For instance, for the cube demarcated by 1<=i,j,k<=2, I would like to extract the array [[1,1,1],[1,1,2],[1,2,1],[2,1,2],[2,2,1],[2,2,2]] from B and store it in a new array for further processing. This is then repeated until all the 8 cubes in this example are accounted for. Note that the grids are always chosen to be even so that this type of partitioning is always possible. While I can choose the grids to be even, I have no control over the site occupations by the object.

Numpy array slicing doesn't work because, in general, the grid points are not contiguous in B. I attempted a code using "for" loops to impose the ranges for the edges of the cubes but it got too complicated quickly and it doesn't seem like the right approach. Then there is the "numpy.where" function; but the complexity of the condition makes it rather intractable. Similar challenges arise with "numpy.extract".

Davis Herring
  • 36,443
  • 4
  • 48
  • 76
ooter
  • 63
  • 3
  • So your input is always a *complete* lattice in some product of intervals? If so, you don’t need any more than the bounds of those intervals to generate all the small arrays… – Davis Herring Apr 20 '19 at 02:10
  • @DavisHerring Sorry Davis, the original question wasn't very clear. I have now revised it extensively and so I hope it is now clearer what I am trying to do. – ooter Apr 20 '19 at 13:11
  • Apparently your “query” cubes are disjoint—no `2<=i,j,k<=3` in your example, for instance. If I’ve guessed correctly, you should say that. – Davis Herring Apr 20 '19 at 15:09
  • It’s clearer, but `A` is still relevant only in its extents (in case `B` doesn’t touch every face of the bounding box). – Davis Herring Apr 20 '19 at 15:14
  • @DavisHerring yes, they're intentionally disjoint so that each vertex only belongs to one cube. – ooter Apr 20 '19 at 15:19

1 Answers1

0

If you have the grid as an ndim+1-dimensional array of coordinates like so

a = np.stack(np.mgrid[1:5, 1:5], -1)

then it is just a matter of judicious reshaping and transposing:

import itertools as it

# dimensions for reshape:
# split all coordinate axes into groups of two, leave the
# "dimension subscript" axis alone     
chop = *it.chain.from_iterable((i//2, 2) for i in a.shape[:-1]),

# shuffle for transpose:
# regroup axes into coordinates then pairs then subscript
ax_shuf = *(1 ^ np.r_[1:4*a.ndim-3:2] % (2*a.ndim-1)), -1

# put it together and flatten, i.e. join coordinates and join pairs
a.reshape(*chop, -1).transpose(*ax_shuf).reshape(np.prod(chop[::2]), -1, a.shape[-1])

Result:

array([[[1, 1],
        [1, 2],
        [2, 1],
        [2, 2]],

       [[1, 3],
        [1, 4],
        [2, 3],
        [2, 4]],

       [[3, 1],
        [3, 2],
        [4, 1],
        [4, 2]],

       [[3, 3],
        [3, 4],
        [4, 3],
        [4, 4]]])

Update

If B is an unordered subset of the full grid you can do the following trick:

  1. subtract the columnwise min to make sure parity starts at even
  2. floor divide by 2, the corners of each cube of interest will collapse into a single point
  3. now use np.unique on the rows to get the goups

Note: If need be you can make this a bit faster by replacing argsort in the code below with one of the solutions at Most efficient way to sort an array into bins specified by an index array?.

B = np.array(B)
unq, idx, cts = np.unique((B-B.min(0))//2, axis=0, return_inverse=True, return_counts=True)
if (cts[0]==cts).all():
    result = B[idx.argsort()].reshape(len(unq), cts[0], -1)
else:
    result = np.split(B[idx.argsort()], cts[:-1].cumsum())

Result:

array([[[1, 1, 1],
        [1, 2, 1],
        [2, 2, 1],
        [2, 2, 2],
        [2, 1, 2],
        [1, 1, 2]],

       [[2, 2, 3],
        [2, 1, 3],
        [1, 1, 3],
        [2, 1, 4],
        [2, 2, 4],
        [1, 2, 4]],

       [[1, 4, 2],
        [2, 4, 2],
        [1, 3, 1],
        [1, 4, 1],
        [2, 3, 1],
        [2, 4, 1]],

       [[1, 4, 4],
        [1, 3, 4],
        [2, 4, 3],
        [2, 3, 3],
        [1, 4, 3],
        [2, 3, 4]],

       [[4, 1, 1],
        [4, 1, 2],
        [3, 2, 1],
        [3, 2, 2],
        [4, 2, 2],
        [3, 1, 2]],

       [[4, 2, 3],
        [3, 2, 4],
        [3, 1, 3],
        [3, 2, 3],
        [3, 1, 4],
        [4, 1, 3]],

       [[4, 4, 2],
        [4, 3, 2],
        [3, 4, 2],
        [4, 3, 1],
        [3, 4, 1],
        [3, 3, 1]],

       [[4, 3, 3],
        [3, 3, 3],
        [4, 3, 4],
        [3, 3, 4],
        [4, 4, 3],
        [4, 4, 4]]])
Paul Panzer
  • 51,835
  • 3
  • 54
  • 99
  • Sorry Paul, it appears that I may have oversimplified the problem leading to a misunderstanding. The grid I described is populated by some object for only 25% of the points. In fact, array A is the array of these points. Each 1x1x1x... N-cube of the grid contains the same amount of this object sitting at the vertices (i.e., on the grid points). The task is to extract the coordinates of the object from array A for each hypercube and store them in another array for processing. I can only choose the grids to be even but I have no control over the site occupations. – ooter Apr 20 '19 at 07:14
  • @ooter no worries. But could you please better describe your problem? Or even better, update your example to better describe your actual task? As it stands it does look like a flattened meshgrid to me; if that is not the case choose a more representative example. And above all add an expected result. Thanks. – Paul Panzer Apr 20 '19 at 07:41
  • I have completely updated the question. Hope the task is clearer now! – ooter Apr 20 '19 at 13:10
  • @ooter I've updated the answer. It still makes the assumption that coordinates are consecutive integers. If that is not correct it gets a bit more complicated but still very doable. – Paul Panzer Apr 20 '19 at 13:37
  • Thanks; this looks like it's heading in the right direction but what is "srt" in: result = B[srt].reshape(len(unq), cts[0], -1) – ooter Apr 20 '19 at 15:33
  • @ooter Oops! Fixed. – Paul Panzer Apr 20 '19 at 23:21
  • 1
    Sorry I went away before I got a chance to test the code. It does the magic! Thanks for the help - still can't believe how quickly you figured this out! – ooter Apr 30 '19 at 15:54