Given a "jumbled" list L, get a list where each element is the index of the corresponding element of L, if L was sorted

Question

Desired Output

I want a function to return a list such that, given a "jumbled" list l, each element is the index of the corresponding element of l, if l was sorted. (I'm failing to think of a less convoluted way of saying this, sorry.)

Examples

f([3,1,2]) = [2,0,1]

f([3,1,2,2,3]) = [3,0,1,2,4], since the input sorted is [1,2,2,3,3].

(This is useful for some stats calculations.)

My Attempt

I came up with a way to do this function, but this is python- it seems like there should be a one-liner to do this, or at least a much cleaner, clearer way.

def getIndiciesInSorted(l):
    sortedL = sorted(l)
    outputList = []
    for num in l:
        sortedIndex = sortedL.index(num)
        outputList.append(sortedIndex)
        sortedL[sortedIndex] = None
    return outputList

l=[3,1,2,2,3] 
print getIndiciesInSorted(l)

So, how can I write this more concisely? Is there a legible list comprehension solution?

Note that this is not the same as `sorted(range(len(l)), key=lambda i: l[i])`... (Though that was my first thought.) — Matthew Adams, Sep 14 '12 at 00:18
@nightcracker I know, your answer was my first attempt as well, so I thought I'd leave a comment to let other people know that that method was a no-go. — Matthew Adams, Sep 14 '12 at 00:22

unutbu · Accepted Answer · 2012-09-14T01:14:55.120

def argsort(seq):
    # http://stackoverflow.com/questions/3382352/3382369#3382369
    # http://stackoverflow.com/questions/3071415/3071441#3071441
    '''
    >>> seq=[1,3,0,4,2]
    >>> index=argsort(seq)
    [2, 0, 4, 1, 3]

    Given seq and the index, you can construct the sorted seq:
    >>> sorted_seq=[seq[x] for x in index]
    >>> assert sorted_seq == sorted(seq)

    Given the sorted seq and the index, you can reconstruct seq:
    >>> assert [sorted_seq[x] for x in argsort(index)] == seq
    '''
    return sorted(range(len(seq)), key=seq.__getitem__)

def f(seq):
    idx = argsort(seq)
    return argsort(idx)

print(f([3,1,2]))
# [2, 0, 1]

print(f([3,1,2,2,3]))
# [3, 0, 1, 2, 4]

Note that nightcracker's function is faster:

def get_sorted_indices(l):
    sorted_positions = sorted(range(len(l)), key=l.__getitem__)
    result = [None for _ in range(len(l))]
    for new_index, old_index in enumerate(sorted_positions):
        result[old_index] = new_index
    return result

The difference may be significant for long lists:

In [83]: import random
In [98]: l = [random.randrange(100) for _ in range(10000)]
In [104]: timeit get_sorted_indices(l)
100 loops, best of 3: 4.73 ms per loop

In [105]: timeit f(l)
100 loops, best of 3: 6.64 ms per loop

This sorts twice, if speed is a factor my version _might_ be faster (on the other hand, the sorting code is coded in C, so that might actually make yours faster). — orlp, Sep 14 '12 at 00:31

orlp · Answer 2 · 2012-09-14T01:40:21.680

4

This is the best I came up with:

def get_sorted_indices(l):
    sorted_positions = sorted(range(len(l)), key=l.__getitem__)
    result = [None for _ in range(len(l))]

    for new_index, old_index in enumerate(sorted_positions):
        result[old_index] = new_index

    return result

It's faster than your solution, but that's about it.

edited Sep 14 '12 at 01:40

answered Sep 14 '12 at 00:15

orlp

112,504
36
218
315

+1: Indeed, your method is faster, especially for longer lists. (PS. I think `[None for _ in len(l)]` needs to be `[None for _ in range(len(l)]`. Also, the code is a bit faster if you change `key = lambda ...` to `key = l.__getitem__`. ) – unutbu Sep 14 '12 at 01:16
@unutbu: incorporated the changes you suggested. – orlp Sep 14 '12 at 01:40
Or just `result = [None] * len(l)` – Warren Weckesser Sep 14 '12 at 03:49
@Warren That's more succinct, but list multiplication is very often a bad idea. Even though it would be fine in this case, on a Q&A site where there's lots of new Pythonistas around I tend to think it's better not to demonstrate list multiplication, lest someone try to generalise it and end up contributing to the enormous mountain of Python SO questions to do with shared-references in Python. – Ben Sep 14 '12 at 04:30

score 2 · Answer 3 · edited May 23 '17 at 11:49

2

There's a one-line comprehension but it's really ugly:

>>> E, S = enumerate, sorted
>>> l = [3,1,2,2,3]
>>> [a for _,a in S((a,b) for b,(_,a) in E(S((a,b) for b,a in E(l))))]
[3, 0, 1, 2, 4]

Unutbu's answer is easier to read and generates less garbage.

edited May 23 '17 at 11:49

Community

1
1

answered Sep 14 '12 at 00:31

Gareth Rees

64,967
9
133
163

thikonom · Answer 4 · 2012-09-14T00:40:59.283

2

k = [3, 0, 1, 2, 4]
initial = dict(zip(k, range(len(k)))) #{0: 1, 1: 2, 2: 3, 3: 0, 4: 4}
sorted_initial = dict(zip(sorted(k), range(len(k)))) #{0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
initial.update(sorted_initial) #{0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
result = [initial[i] for i in k] #[3, 0, 1, 2, 4]

edited Sep 14 '12 at 00:40

answered Sep 14 '12 at 00:33

thikonom

4,219
3
26
30

2

The only difficulty with this approach is that the items in the sequence must be hashable. – senderle Sep 14 '12 at 00:46

Warren Weckesser · Answer 5 · 2012-09-14T03:38:55.683

If you are doing statistical calculations, you will probably start using numpy at some point. With numpy, you can use the existing implementation of argsort:

>>> from numpy import array
>>> x = array([3, 1, 2, 2, 3])
>>> x.argsort().argsort()
array([3, 0, 1, 2, 4])

That's a numpy version of unutbu's answer. nightcracker's answer can be implemented as

>>> from numpy import array, empty_like, arange
>>> x = array([3, 1, 2, 2, 3])
>>> s = x.argsort()
>>> r = empty_like(s)
>>> r[s] = arange(x.size)
>>> r
array([3, 0, 1, 2, 4])

Given a "jumbled" list L, get a list where each element is the index of the corresponding element of L, if L was sorted

Desired Output

My Attempt

5 Answers5