1

I am trying to find the quickest way to sort a list. For example, say i was trying to sort the following list

lst = [1, 0, -1, 0.1, 0, 5, 10, 4]

What i want at the end, is to have the sorted list, but to also be able to know what their indexes in lst were, before the sorting.

The method i am currently using is this

lst = [1, 0, -1, 0.1, 0, 5, 10, 4]
lst = list(enumerate(lst))
lst.sort(key = lambda x: x[1], reverse = True)

Doing so will give lst = [(6, 10), (5, 5), (7, 4), (0, 1), (3, 0.1), (1, 0), (4, 0), (2, -1)]

Now i do not necessarily need to have the tuple (idx, value), it can be two separate lists. The important part, is to sort the values, and also know what were the 'original' indexes in the list lst. So for example get:

lst_val = [10, 5, 4, 1, 0.1, 0, 0, -1]
lst_idx = [6, 5, 7, 0, 3, 1, 4, 2]

Now i was wondering if there was maybe a quicker/more efficient method to sort this way, because i could have a list with over 200,000 values inside it.

Using numpy is allowed, but besides that i don't think other modules are allowed.

needle
  • 188
  • 2
  • 8
  • Does this answer your question? [Transpose/Unzip Function (inverse of zip)?](https://stackoverflow.com/questions/19339/transpose-unzip-function-inverse-of-zip) – quamrana Mar 17 '21 at 18:30
  • @quamrana not really, i am trying to sort the list, and also keep track of the indexes, not trying to separate indexes from the list... – needle Mar 17 '21 at 18:36
  • The answer points out that `zip` is its own inverse. You have made a `list` which is as if you have zipped the values and their indexes together. Now you need to unzip them into two lists. – quamrana Mar 17 '21 at 18:38

2 Answers2

4

If you need a significant speed up you must use numpy

import numpy as np

np_lst = np.array(lst)

sorted_indices = np_lst.argsort() #array([2, 1, 4, 3, 0, 7, 5, 6])

Then, you can 'sort' the array this way:

np_lst[sorted_indices]
#array([-1. ,  0. ,  0. ,  0.1,  1. ,  4. ,  5. , 10. ])

You can also get it in reverse with:

np_lst[sorted_indices[::-1]] 
#array([10. ,  5. ,  4. ,  1. ,  0.1,  0. ,  0. , -1. ])
Pablo C
  • 4,661
  • 2
  • 8
  • 24
  • Say i have a list that has 200,000 values inside of it, would it still be quicker to use your method if i have to convert the list to a `numpy array`, by doing `lst = np.array(lst)`, and then sorting with `np.sort`? – needle Mar 17 '21 at 18:38
  • @needle im pretty sure the speed up is significant – Pablo C Mar 17 '21 at 18:42
  • do you think it would be even quicker if i create lst as a `numpy array`, and update the values to the numpy array instead? – needle Mar 17 '21 at 18:43
  • @needle I don't understand what do you mean with "create lst as a numpy array" – Pablo C Mar 17 '21 at 19:00
  • @needle you could create another question about that or edit this question – Pablo C Mar 17 '21 at 19:03
  • In reality, i start from an empty list `lst=[]`. Then i have to open a file, and i add the values to the list using `append` (it's a bit more complicated, but its the gist of what im doing). After that, i need to sort the values, and also know the indexes. And the question is, would it be quicker to keep doing like before, and when im about to sort, do `lst = np.array(lst)` and then use your sorting method. OR, would it be faster to immediately create the list as `lst = np.array([])`, and add the values to that, and then sort. – needle Mar 17 '21 at 19:08
  • 1
    @needle--from [Fastest way to grow a numpy numeric array](https://stackoverflow.com/questions/7133885/fastest-way-to-grow-a-numpy-numeric-array) it would be faster to grow the Python list then convert to numpy array before the sort rather than growing a numpy array. – DarrylG Mar 17 '21 at 19:28
  • 1
    @needle always is a bad idea using `numpy.append` iteratively. Try something like this instead: https://numpy.org/devdocs/reference/generated/numpy.loadtxt.html – Pablo C Mar 17 '21 at 19:36
0

Countn't you use lst_val = sorted(lst) then use .index() to find the index of the values. Something like this:

lst = [1, 0, -1, 0.1, 0, 5, 10, 4]
lst_val = sorted(lst)
lst_idx = []

for i in lst_val:
    lst_idx.append(lst.index(i))
Vicdenz
  • 27
  • 5