0

I have a list of points, where each point is d-dimensional. The list is currently sorted with the regular list.sort() function, which means the keys are prioritized from the first dimension to the last in order. I want to take the list and sort it while prioritizing the 2nd key, running a code on it, then prioritizing the 3nd key and so on. Is there a more efficient way than just using the sort() function with a specific key?

Because I know the points are sorted at each step with priority to the k-th dimension and I want to sort it by the next one, so I think it might be possible to manage the list and sort it just like combining a bunch of sorted lists in O(n) instead of O(nlogn) for the regular sort(). Is it possible, or am I mistaken? If possible, a help with implementation in python would be appreciated.

Edit:

Example of a list after the list.sort() function:

[(-3, -5, -5, -2), (-3, -4, 2, -2), (-3, 2, 5, 2), (0, 0, -5, 1), (1, -3, 2, 0), (2, -1, 0, 0), (2, 3, -1, 0), (4, 1, -2, 1), (5, -3, -1, 1), (5, 1, -2, -2)]
  • Does the dimensions have any intrinsic relationship (e.g., values in dimension 2 are always smaller than values in dimension 3)? Also, this code you run after sorting takes into account only the sorted dimension or it uses all the others dimensions too? – Leafar Aug 19 '19 at 11:32
  • can you post an example of your data point – kederrac Aug 19 '19 at 11:34
  • Maybe you should override the sorting function. Look at https://stackoverflow.com/questions/11850425/custom-python-list-sorting – user803422 Aug 19 '19 at 11:35
  • The dimensions don't have any intrinsic relationship other than being all floats and my code takes into account only the sorted dimension. – Patrick Nilexis Aug 19 '19 at 11:44
  • 3
    Can you give an example of the input, and your desired output? I can't tell if the example you've supplied is the input or the desired output. And your explanation is unclear. – Jim Mischel Aug 19 '19 at 17:02
  • Note that postulate (it should be possible to sort O(n) that way) is false. That would be implementing a fusion-partition sort, partition being made from kth keys, recursive sorting from k+1th, etc. So, it is still O(n.log(n)), and even worse, because of the non-optimal constraint on how we partition data. – chrslg Dec 30 '22 at 21:57

3 Answers3

0

you can try:

my_list.sort(key=lambda x: x[1:])
kederrac
  • 16,819
  • 6
  • 32
  • 55
  • I think this is what the regular sort() function does, just ignoring the 1st dimension. I will need to sort the list for every dimension in order, so just calling sort D times with different keys won't help in improving the efficiency as I wish to do. – Patrick Nilexis Aug 19 '19 at 12:12
  • please provide a better example than, like with 3 dimensions( that fit your data) – kederrac Aug 19 '19 at 12:16
  • add desired output also please – kederrac Aug 19 '19 at 12:17
0

Imagine you can sort the n elements k times, dimension by dimension, in less than O(k.n lgn), e.g O(n (lg n + k-1)) (sort the first dimension in O(n lg n) and then in O(n)).

Then take a list of m numbers, split it into sqrt(m) chunks of (roughly) sqrt(m) elements. You have k = n = sqrt(m). Sort the zipped (as in Python) chunks in O(k (lg n + k-1) = O(sqrt(m) (lg (sqrt(m)) + sqrt(m)-1)). Since sqrt wins over lg, your chunks are sorted in O(m). You have sqrt(m) sorted lists that you can merge in O(n lg k) = O(sqrt(m) lg (sqrt(m))) (https://en.wikipedia.org/wiki/K-way_merge_algorithm). This merge time is less than O(m). Putting the blocks together, your list is sorted in O(m).

The O (n lg n) boudnary is generally accepted, thus it is reasonable to say that you won't find a way to achieve the k sorts in less than O(k.n lgn).

jferard
  • 7,835
  • 2
  • 22
  • 35
-2
my_list = sorted(my_list, key=lambda x: x[1:])

Reference: https://docs.python.org/3/howto/sorting.html

  • 1
    OP made clear they already know about `key`, and they didn't want it. A 3 years old answer already proposed that anyway, and was already rebuked by OP. Please, elaborate why do you think your version is new and better than already existing answer? – chrslg Dec 30 '22 at 21:54