45

Given a self-indexing (not sure if this is the correct term) numpy array, for example:

a = np.array([3, 2, 0, 1])

This represents this permutation (=> is an arrow):

0 => 3
1 => 2
2 => 0
3 => 1

I'm trying to make an array representing the inverse transformation without doing it "manually" in python, that is, I want a pure numpy solution. The result I want in the above case is:

array([2, 3, 1, 0])

Which is equivalent to

0 <= 3                0 => 2
1 <= 2       or       1 => 3
2 <= 0                2 => 1
3 <= 1                3 => 0

It seems so simple, but I just can't think of how to do it. I've tried googling, but haven't found anything relevant.

Ali
  • 56,466
  • 29
  • 168
  • 265
Lauritz V. Thaulow
  • 49,139
  • 12
  • 73
  • 92

3 Answers3

60

Short answer

def invert_permutation(p):
    """Return an array s with which np.array_equal(arr[p][s], arr) is True.
    The array_like argument p must be some permutation of 0, 1, ..., len(p)-1.
    """
    p = np.asanyarray(p) # in case p is a tuple, etc.
    s = np.empty_like(p)
    s[p] = np.arange(p.size)
    return s


Sorting is an overkill here. This is just a single-pass, linear time algorithm with constant memory requirement:

from __future__ import print_function
import numpy as np

p = np.array([3, 2, 0, 1])
s = np.empty(p.size, dtype=np.int32)
for i in np.arange(p.size):
    s[p[i]] = i

print('s =', s)

The above code prints

 s = [2 3 1 0]

as required.

The rest of the answer is concerned with the efficient vectorization of the above for loop. If you just want to know the conclusion, jump to the end of this answer.


(The original answer from Aug 27, 2014; the timings are valid for NumPy 1.8. An update with NumPy 1.11 follows later.)

A single-pass, linear time algorithm is expected to be faster than np.argsort; interestingly, the trivial vectorization (s[p] = xrange(p.size), see index arrays) of the above for loop is actually slightly slower than np.argsort as long as p.size < 700 000 (well, on my machine, your mileage will vary):

import numpy as np

def np_argsort(p):
    return np.argsort(p)
    
def np_fancy(p):
    s = np.zeros(p.size, p.dtype) # np.zeros is better than np.empty here, at least on Linux
    s[p] = xrange(p.size) 
    return s

def create_input(n):
    np.random.seed(31)
    indices = np.arange(n, dtype = np.int32)
    return np.random.permutation(indices)

From my IPython notebook:

p = create_input(700000)
%timeit np_argsort(p)
10 loops, best of 3: 72.7 ms per loop
%timeit np_fancy(p)
10 loops, best of 3: 70.2 ms per loop

Eventually the asymptotic complexity kicks in (O(n log n) for argsort vs. O(n) for the single-pass algorithm) and the single-pass algorithm will be consistently faster after a sufficiently large n = p.size (threshold is around 700k on my machine).

However, there is a less straightforward way to vectorize the above for loop with np.put:

def np_put(p):
    n = p.size
    s = np.zeros(n, dtype = np.int32)
    i = np.arange(n, dtype = np.int32)
    np.put(s, p, i) # s[p[i]] = i 
    return s

Which gives for n = 700 000 (the same size as above):

p = create_input(700000)
%timeit np_put(p)
100 loops, best of 3: 12.8 ms per loop

This is a nice 5.6x speed up for next to nothing!

To be fair, np.argsort still beats the np.put approach for smaller n (the tipping point is around n = 1210 on my machine):

p = create_input(1210)
%timeit np_argsort(p)
10000 loops, best of 3: 25.1 µs per loop
%timeit np_fancy(p)
10000 loops, best of 3: 118 µs per loop
%timeit np_put(p)
10000 loops, best of 3: 25 µs per loop

This is most likely because we allocate and fill in an extra array (at the np.arange() call) with the np_put approach.


Although you didn't ask for a Cython solution, just out of curiosity, I also timed the following Cython solution with typed memoryviews:

import numpy as np
cimport numpy as np

def in_cython(np.ndarray[np.int32_t] p):    
    cdef int i
    cdef int[:] pmv
    cdef int[:] smv 
    pmv = p
    s = np.empty(p.size, dtype=np.int32)
    smv = s
    for i in xrange(p.size):
        smv[pmv[i]] = i
    return s

Timings:

p = create_input(700000)
%timeit in_cython(p)
100 loops, best of 3: 2.59 ms per loop

So, the np.put solution is still not as fast as possible (ran 12.8 ms for this input size; argsort took 72.7 ms).


Update on Feb 3, 2017 with NumPy 1.11

Jamie, Andris and Paul pointed out in comments below that the performance issue with fancy indexing was resolved. Jamie says it was already resolved in NumPy 1.9. I tested it with Python 3.5 and NumPy 1.11 on the machine that I was using back in 2014.

def invert_permutation(p):
    s = np.empty(p.size, p.dtype)
    s[p] = np.arange(p.size)
    return s

Timings:

p = create_input(880)
%timeit np_argsort(p)
100000 loops, best of 3: 11.6 µs per loop
%timeit invert_permutation(p)
100000 loops, best of 3: 11.5 µs per loop

A significant improvement indeed!


Conclusion

All in all, I would go with the Short answer approach mentioned at the top for code clarity. In my opinion, it is less obscure than argsort, and also faster for large input sizes. If speed becomes an issue, I would go with the Cython solution.

Ali
  • 56,466
  • 29
  • 168
  • 265
  • Nice! I didn't know `np.put`, fills an important niche between my slow solution and Cython. +1. – Fred Foo Aug 27 '14 at 22:25
  • +1 Great minds think alike! ;-) About the same time you were writing this answer to a two year old question, I was sending a PR to use a technique very similar to this in numpy's `unique` function, see [here](https://github.com/numpy/numpy/pull/5012). FWIW, the advantage of `np.put` over fancy indexing is mostly gone in numpy 1.9.. – Jaime Sep 21 '14 at 16:26
  • @Jaime Thanks for the good news! I find fancy indexing the cleanest solution; it is a shame that it used to be so slow. Good to know that it is mostly gone in NumPy 1.9. – Ali Sep 21 '14 at 17:27
  • 1
    s[p]=np.arange(p.size) is even less obscure, and works twice as fast as np.put on my machine (I know, I know). – Andris Birkmanis Oct 25 '16 at 18:29
  • As a follow up to Andris Birkmanis, I found this helpful: https://arogozhnikov.github.io/2015/09/29/NumpyTipsAndTricks1.html#Even-faster-inverse-permutation – Paul Feb 03 '17 at 17:19
  • 1
    @Paul Thanks for the info! Indeed, apparently since NumPy 1.9, there is no point in using np.put(). I will update my answer accordingly very soon! – Ali Feb 03 '17 at 21:32
  • @AndrisBirkmanis Sorry, apparently I missed your comment somehow in 2016. I am very sorry! As other people also pointed out you are right. I updated the answer accordingly. Thanks for your feedback! – Ali Feb 03 '17 at 22:25
  • The first line can be simplified to `s = np.empty_like(p)`. – nth Apr 17 '20 at 14:49
  • @MateenUlhaq I don't like your changes. Why do you need `np.asanyarray()`? I think you are just making an unnecessary copy of the input if it is not an `np.array`. The `np.empty_like()` can handle the argument even it if is not an `np.array`. Please do not edit the answer any further, but please explain it in the comments. – Ali Apr 14 '22 at 06:39
  • I felt as if tuples were a common representation of permutations, so I added that. (If `p` is a tuple, `p.size` fails.) Most standard numpy library functions have an `np.asanyarray` to ensure they work for non-numpy arrays. Of course, the caller could always do the conversion themselves. – Mateen Ulhaq Apr 14 '22 at 06:49
  • @MateenUlhaq As I said, `np.empty_like()` handles that. Double-checked with `tuple`, so the `np.asanyarray` is indeed unnecessary. Please do not edit the answer yet. The other thing I don't like is the docstring. The function name is `invert_permutation` and the docstring says: `Invert permutation.` It is just silly, and it is not helping anybody. The documentation should not parrot the code. Please suggest some improvement here in the comments. – Ali Apr 14 '22 at 07:14
  • @MateenUlhaq Ah, I see, the `p.size` later. OK. I will figure something out. In any case, I don't like making an unnecessary copy. – Ali Apr 14 '22 at 07:16
  • 1
    @MateenUlhaq `len(p)` instead of `p.size`? What do you think? – Ali Apr 14 '22 at 07:19
  • `len(p)` is good. Regarding the docstring, [ideally](https://peps.python.org/pep-0257/#multi-line-docstrings), `Invert permutation.` should be replaced with a useful summary line, e.g. `Return a permutation that can restore an an array permuted by p.`, but removing it is probably fine. – Mateen Ulhaq Apr 14 '22 at 07:33
  • @MateenUlhaq The real problem isn't the `p.size` which we could fix with `len(p)` but the fancy indexing on the next line `s[p] = ...` which fails if `p` is a tuple. So I left the `np.asanyarray` there which essentially does not do anything if `p` is an `ndarray` and does the right thing otherwise. Please leave this answer now as it is. – Ali Apr 15 '22 at 08:22
41

The inverse of a permutation p of np.arange(n) is the array of indices s that sort p, i.e.

p[s] == np.arange(n)

must be all true. Such an s is exactly what np.argsort returns:

>>> p = np.array([3, 2, 0, 1])
>>> np.argsort(p)
array([2, 3, 1, 0])
>>> p[np.argsort(p)]
array([0, 1, 2, 3])
Fred Foo
  • 355,277
  • 75
  • 744
  • 836
11

I'd like to offer a tiny bit more background to larsmans correct answer. The reason why argsort is correct can be found when you use the representation of a permutation by a matrix. The mathematical advantage to a permutation matrix P is that the matrix "operates on vectors", i.e. a permutation matrix times a vector permutes the vector.

Your permutation looks like:

import numpy as np
a   = np.array([3,2,0,1])
N   = a.size
rows = np.arange(N)
P   = np.zeros((N,N),dtype=int)
P[rows,a] = 1

[[0 0 0 1]
 [0 0 1 0]
 [1 0 0 0]
 [0 1 0 0]]

Given a permutation matrix, we can "undo" multipication by multiplying by it's inverse P^-1. The beauty of permutation matrices is that they are orthogonal, hence P*P^(-1)=I, or in other words P(-1)=P^T, the inverse is the transpose. This means we can take the indices of the transpose matrix to find your inverted permutation vector:

inv_a = np.where(P.T)[1]
[2 3 1 0]

Which if you think about it, is exactly the same as finding the indices that sort the columns of P!

Hooked
  • 84,485
  • 43
  • 192
  • 261