1

I have the following 3d array:

import numpy as np

z = np.array([[[10,  2],
               [ 5,  3],
               [ 4,  4]],
              [[ 7,  6],
               [ 4,  2],
               [ 5,  8]]])

I want to sort them according to 3rd dim & 1st value.

Currently I am using following code:

from operator import itemgetter

np.array([sorted(x,key=itemgetter(0)) for x in z])
array([[[ 4,  4],
        [ 5,  3],
        [10,  2]],

       [[ 4,  2],
        [ 5,  8],
        [ 7,  6]]])

I would like to make the code more efficient/faster by removing the for loop?

FBruzzesi
  • 6,385
  • 3
  • 15
  • 37
john
  • 13
  • 3

3 Answers3

1

You can use map() to achieve the same result without a for-loop. And with the sort function being either user-defined, or a lambda, or a partial of sorted:

  1. By first creating a sort function:

    >>> def mysort(it):
    ...   return sorted(it, key=itemgetter(0))
    ...
    >>> list(map(mysort, z))
    [[[4, 4], [5, 3], [10, 2]], [[4, 2], [5, 8], [7, 6]]]
    
  2. Same as above, but with a lambda instead:

    >>> list(map(lambda it: sorted(it, key=itemgetter(0)), z))
    [[[4, 4], [5, 3], [10, 2]], [[4, 2], [5, 8], [7, 6]]]
    
  3. With a partial:

    >>> from functools import partial
    >>> psort = partial(sorted, key=itemgetter(0))
    >>> list(map(psort, z))
    [[[4, 4], [5, 3], [10, 2]], [[4, 2], [5, 8], [7, 6]]]
    

    Or the partial defined in-place:

    >>> list(map(partial(sorted, key=itemgetter(0)), z))
    [[[4, 4], [5, 3], [10, 2]], [[4, 2], [5, 8], [7, 6]]]
    
  4. Your question has a list of lists of lists, rather than a 3d numpy array. For numpy-oriented solutions, see this answer.

FYI, (2) and (3b) are roughly equivalent, but have their differences.
Among options 1-3, my preference is the lambda in (2).

aneroid
  • 12,983
  • 3
  • 36
  • 66
  • Thank you for your suggestions. I tested them all and it seems the for loop is still the quickest. Does that sound correct to you? – john May 01 '20 at 10:48
  • Are you checking each against a much much longer version of the original list `z`? For such short lists, I don't think the performance difference would be clear. – aneroid May 01 '20 at 10:58
1

For a numpy one liner you can use numpy.argsort:

import numpy as np

a = np.array([[[10,  2],
               [ 5,  3],
               [ 4,  4]],
              [[ 7,  6],
               [ 4,  2],
               [ 5,  8]]])

a[np.arange(0,2)[:,None], a[:,:,0].argsort()]
array([[[ 4,  4],
        [ 5,  3],
        [10,  2]],
       [[ 4,  2],
        [ 5,  8],
        [ 7,  6]]])

Which for such small size array takes about the same time, yet scaling up the size will result in quite an improvement, for instance:

from operator import itemgetter

a = np.random.randint(0,10, (2,100_000,2))

%timeit a[np.arange(0,2)[:,None], a[:,:,0].argsort()]
26.9 ms ± 351 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit [sorted(x,key=itemgetter(0)) for x in a]
327 ms ± 6.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
FBruzzesi
  • 6,385
  • 3
  • 15
  • 37
0

Why not simply : np.sort(z,axis=1) ?

import numpy as np

z = np.array([[[10,  2],
               [ 5,  3],
               [ 4,  4]],
              [[ 7,  6],
               [ 4,  2],
               [ 5,  8]]])

print(np.sort(z,axis=1))

[[[ 4  2]
  [ 5  3]
  [10  4]]

 [[ 4  2]
  [ 5  6]
  [ 7  8]]]
Alain T.
  • 40,517
  • 4
  • 31
  • 51