I hope someone can help me.

I have a NumPy array with 6 dimensions:

my_array = {ndarray: (256,256,256,4,3,3)}

I want to sort it along the axis of length 4 (axis 3), leaving the 3x3 blocks intact. Put differently, I want to sort a lot of 3x3 blocks, where 4 of them always form a group.

As a small-scale example, suppose I have a similar array

my_array = {ndarray: (256,256,256,4,2,2)}

where each of the 256*256*256 groups can look like this:

[[[2,3],[1,3]],
[[1,2],[3,2]],
[[1,4],[2,1]],
[[1,2],[3,4]]]

I want the blocks to be sorted like this:

[[[1,2],[3,2]],
[[1,2],[3,4]],
[[1,4],[2,1]],
[[2,3],[1,3]]]

For the simple case of a 2D array I was able to achieve this (sorting the columns while keeping each column intact) by using `my_2darray[:, np.lexsort(my_2darray)]`.
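
For reference, here is a minimal sketch of that 2D case. The array values are made up for illustration. One detail that is easy to miss: `np.lexsort` treats the last key (here, the last row) as the primary sort key, so the rows have to be reversed if the first row should decide the order.

```python
import numpy as np

# Hypothetical 2D example: each column is a unit that must stay intact.
my_2darray = np.array([[3, 1, 2, 1],
                       [9, 8, 7, 6]])

# Rows act as sort keys; the LAST row is the primary key.
by_last_row = my_2darray[:, np.lexsort(my_2darray)]
print(by_last_row)

# Reverse the rows so the FIRST row becomes the primary key,
# with the second row breaking ties.
by_first_row = my_2darray[:, np.lexsort(my_2darray[::-1])]
print(by_first_row)
```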

I tried `np.sort(my_array, axis=3)`, which sorted the individual values rather than the blocks. I also tried every variation along the lines of `my_array[:, np.lexsort(my_array)]`, and found nothing that works. As a side note, I found that the axis I want to sort by with lexsort needs to be last, otherwise it behaves unexpectedly. No problem, I used `np.swapaxes`, but I still couldn't make it work in the high-dimensional case. Does someone have some helpful insight?
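
To illustrate the `np.sort` problem on the small group from above (a sketch on a single group, so axis 0 here plays the role of axis 3 in the full array): each of the four positions inside a block is sorted independently, which produces blocks that never existed in the input.

```python
import numpy as np

# One group of four 2x2 blocks, taken from the example above.
group = np.array([[[2, 3], [1, 3]],
                  [[1, 2], [3, 2]],
                  [[1, 4], [2, 1]],
                  [[1, 2], [3, 4]]])

# Sorting along the group axis sorts every block position separately.
broken = np.sort(group, axis=0)
print(broken[0])  # a mix of values taken from different blocks

# Check whether the first "sorted" block is one of the original blocks.
is_original = any(np.array_equal(broken[0], blk) for blk in group)
print(is_original)
```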

Thank you!

duckduckno
  • `lexsort` sorts along the last axis by default when no axis is specified. – TiredButAwake Feb 22 '23 at 13:06
  • The answer is probably buried somewhere in there: [Sorting a multi-dimensional numpy array?](https://stackoverflow.com/questions/55748262/sorting-a-multi-dimensional-numpy-array) – Stef Feb 22 '23 at 13:21
  • Explore using `argsort`. But how do you order the (3,3) blocks? What makes one "bigger" than another? – hpaulj Feb 22 '23 at 17:50
  • @TiredButAwake Somehow, every time I used lexsort in this scenario, it ignored the specified axis and sorted by the last one. – duckduckno Feb 23 '23 at 15:34
  • @hpaulj I want to sort the blocks so that they are ordered first by first row, first column, then by first row, second column, and so on, as I illustrated in the example. – duckduckno Feb 23 '23 at 15:38

1 Answer

Technically you can use this solution, but it may be a bit complicated to apply to 5 dimensions, so here is an implementation. Please verify it for yourself before using it.

import numpy as np

# Create a 5-dimensional array as input.
np.random.seed(0)
a = np.random.randint(0, 10, size=(2, 2, 3, 2, 2))
print("a:", a.shape)  # (2, 2, 3, 2, 2)
print(a)
# [[[
#     [[5, 0], [3, 3]],
#     [[7, 9], [3, 5]],
#     [[2, 4], [7, 6]],
# ...

# Flatten all axes except the axis you want to sort on.
# That is, make a 3-dimensional array of (N, sort-axis, M).
b = a.reshape([-1, a.shape[-3], a.shape[-2] * a.shape[-1]])
print("b:", b.shape)  # (4, 3, 4)
print(b)
# [[
#     [5, 0, 3, 3],
#     [7, 9, 3, 5],
#     [2, 4, 7, 6],
# ...

# Then, use lexsort with the last axis as the sort keys.
# The keys are reversed because lexsort treats the LAST key as the primary one.
idx = np.lexsort([b[..., i] for i in range(b.shape[-1])][::-1])
idx = np.lexsort(np.rollaxis(b, -1)[::-1])  # This is the same as above, but faster.
print("idx:", idx.shape)  # (4, 3)
print(idx)
# [
#     [2, 0, 1],
# ...

# idx holds the sort order of the blocks within each group. We can use it like this:
c = np.array([b[i][idx[i]] for i in range(len(b))])
c = b[np.arange(len(b))[:, np.newaxis], idx]  # This is the same as above, but faster.
print("c:", c.shape)  # (4, 3, 4)
print(c)
# [[
#     [2, 4, 7, 6],
#     [5, 0, 3, 3],
#     [7, 9, 3, 5],
# ...

# Restore to the original shape.
d = c.reshape(a.shape)
print("d:", d.shape)  # (2, 2, 3, 2, 2)
print(d)
# [[[
#     [[2, 4], [7, 6]],
#     [[5, 0], [3, 3]],
#     [[7, 9], [3, 5]],
# ...
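
The steps above can be wrapped in a small reusable function. This is only a sketch: `sort_blocks` and `block_ndim` are hypothetical names, and it assumes the axis to sort along sits directly in front of the block axes, as in the question. It also swaps in `np.moveaxis` and `np.take_along_axis` in place of `np.rollaxis` and the manual fancy indexing, which should be equivalent here.

```python
import numpy as np

def sort_blocks(arr, block_ndim=2):
    """Sort whole blocks lexicographically along the axis just before them.

    Hypothetical helper: assumes `arr` has shape (..., group, *block),
    where the last `block_ndim` axes form one block.
    """
    group_len = arr.shape[-block_ndim - 1]
    block_size = int(np.prod(arr.shape[-block_ndim:]))
    # Collapse to (N, group, block_size): one row per flattened block.
    flat = arr.reshape(-1, group_len, block_size)
    # lexsort uses the LAST key as primary, so reverse the key order
    # to make the first block element the most significant one.
    keys = np.moveaxis(flat, -1, 0)[::-1]
    idx = np.lexsort(keys)  # shape (N, group)
    # Reorder the blocks inside each group, keeping each block intact.
    out = np.take_along_axis(flat, idx[:, :, np.newaxis], axis=1)
    return out.reshape(arr.shape)

# The group from the question: four 2x2 blocks.
group = np.array([[[2, 3], [1, 3]],
                  [[1, 2], [3, 2]],
                  [[1, 4], [2, 1]],
                  [[1, 2], [3, 4]]])
result = sort_blocks(group)
print(result)
```

For the array in the question, `sort_blocks(my_array)` would apply the same ordering to every group, since all leading axes are flattened together before sorting.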
ken
  • Thank you so much! This works really well. And it's fast! :D My original solution, looping through the first 3 dimensions, would have taken around 30 minutes; a solution where I simply flattened the first 3 and last 2 dimensions and used lexsort took 4 minutes; this solution takes 20 seconds on my machine. Thank you! – duckduckno Feb 23 '23 at 15:31